Revolution #14: New Ways to Get Our Data “in the raw”

There are three requests we hear over and over again from researchers:
(1) Can I use my own custom boundaries?
(2) Can I have the underlying raster data you’re using?
(3) Can I visualize the data?

For the past two years, the answer to these questions has been “No” or “maybe someday”. So, I’m very glad to report that we can now say “Yes!” to #1, and that #2 and #3 are coming down the pipeline very soon.

For those of you who are interested in the technical nitty-gritty (i.e., why this has taken us so long), here’s a brief dive:

Custom Boundaries

While we don’t have a perfect solution in place for these, behind the scenes we now have easy-to-use scripts that allow us to quickly run custom requests for just about any dataset. So, today we unveiled a new form that allows anyone to request custom boundary data from us. We hope to have this built directly into GeoQuery in the future (we’re grant writing like crazy!), but as of right now you can go and request your own custom extracts here:

Custom Data Extraction Request
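
For readers wondering what a “custom extract” boils down to, here is a rough sketch of the general idea: take a raster grid, find the cells whose centers fall inside your boundary, and summarize them. To be clear, this is not our actual back-end script. The function names, the grid layout (row 0 at the north edge, square cells), and the choice of a simple mean are all assumptions made for illustration, and it uses plain Python so it runs without any GIS libraries.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: Sequence[Point]) -> bool:
    """Ray-casting test: is (x, y) inside a simple polygon?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def zonal_mean(
    grid: List[List[Optional[float]]],
    origin_x: float,
    origin_y: float,
    cell_size: float,
    polygon: Sequence[Point],
) -> Optional[float]:
    """Mean of the raster cells whose centers fall inside the polygon.

    Hypothetical layout: grid[row][col], with (origin_x, origin_y) at the
    upper-left corner of the grid and None marking missing data.
    """
    total = 0.0
    count = 0
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value is None:
                continue  # skip no-data cells
            cx = origin_x + (c + 0.5) * cell_size  # cell center, x
            cy = origin_y - (r + 0.5) * cell_size  # cell center, y (rows run north to south)
            if point_in_polygon(cx, cy, polygon):
                total += value
                count += 1
    return total / count if count else None


if __name__ == "__main__":
    # A made-up 4x4 raster and a square boundary over its upper-left corner.
    raster = [
        [1.0, 2.0, 3.0, 4.0],
        [5.0, 6.0, 7.0, 8.0],
        [9.0, 10.0, 11.0, 12.0],
        [13.0, 14.0, 15.0, 16.0],
    ]
    boundary = [(0.0, 2.0), (2.0, 2.0), (2.0, 4.0), (0.0, 4.0)]
    print(zonal_mean(raster, 0.0, 4.0, 1.0, boundary))
```

In practice a real extract also has to deal with map projections, cells that are only partly covered by a boundary, and rasters far too big to loop over cell by cell, which is a good part of why this took us a while.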

Underlying Raster Data

This one has been a long time coming, but we’re nearly at the end of the tunnel. While we have terabytes of satellite and other raster data behind the scenes, researchers who want to use that raw data still have to go to around forty different websites, process the data, get it into a similar format, and so on. So, for users who are GIS savvy and really want to get their hands on the raw raster data, we’re opening up our back-end repository. It ain’t pretty. And, it’s really only for experts. But if you want the data, it will be yours to have very soon. Look for a new blog post on this shortly.

Vector Boundary Data and Visualization

For a long time we’ve relied on GADM (www.gadm.org) for our geographic boundaries, which has prevented us from redistributing our boundary files (due to GADM’s undefined / unclear license). So, we built our own. Coming in the next few weeks (!) will be the first release of the AidData GeoBoundaries dataset – a dataset of all administrative units around the world, fully open and redistributable. We’ve spent the better part of a year begging and borrowing, getting permission, and otherwise putting this dataset together, so we’re very excited to let this one out.

For GeoQuery in particular, this opens up new options for our visualizations: previously, we could not build in any visualization features because we didn’t have the rights to re-use the boundaries (or to let users download visualizations they created). No longer! Once we get GeoBoundaries integrated into GeoQuery, we’ll be pushing hard to find a quick way for anyone to visualize the datasets they create.

As always, more coming soon…

About the author: Daniel Runfola

Dan's research focuses on using quantitative modeling techniques and spatial data to explain the conditions under which aid interventions succeed or fail. He specializes in computational geography, machine learning, quasi-observational experimental analyses, human-int data collection methods, and high performance computing approaches. His research has been supported by the World Bank, USAID, the Global Environment Facility, and a number of private foundations and donors. He has published 34 books, articles, and technical reports in support of his research, and is the Project Lead of the GeoQuery project. At William and Mary, Dr. Runfola serves as the director of the Data Science Program, and advises students in the Applied Science Computational Geography Ph.D. program.