Written by Rachel Oberman and Seth Goodman
Statistics By Rachel Oberman
Maps By Leigh Seitz, John Napoli, Josh Panganiban, Graham Melville, Grace Grimsley, Lauren Hobbs
Feeling adventurous and want to jump straight to the data? Check out our early alpha soft launch at: http://geoquery.org/geoboundaries/.
Every Sunday, nine students pile into a tiny lab in Williamsburg, Virginia, and begin working. One may be emailing the Venezuelan government, another taking a phone call with a data scientist in Estonia, and others analyzing the data licensing for an organization in Madagascar. These students are part of AidData’s GeoBoundaries team – a group dedicated to collecting accurate, open source data on administrative boundaries around the globe.
Students working on GeoBoundaries come from a variety of majors – including international relations, data science, and biology – and have experience ranging from field work in developing nations to internships working with the United Nations. What they all have in common is a desire to make spatial data more accessible and easier to use for everyone – whether they are a data scientist using machine learning to analyze terabytes of spatial data, a project manager mapping new health clinics, or a researcher exploring conflict trends across Africa.
A key lesson these students are learning, and what drives the need for GeoBoundaries, is the importance of open source data. To date, few collections of global administrative boundary data exist. Those that do are either outdated and infrequently updated, or not freely redistributable (or both). Even finding accurate and open source boundary data for individual countries can be difficult. Some countries have well-maintained, open source data portals with the latest data easily accessible, while others require extensive searching only to find that the best options are years old and cannot be redistributed.
As spatial data continues to expand and play a growing role in how we understand and shape our decisions about the world around us, the need for reliable and open source data will be paramount. By removing the burden of finding and making sense of critical administrative boundary data, AidData and the GeoBoundaries team hope to empower a broad range of data users across disciplines to produce new and meaningful research and insights.
As of this week, the GeoBoundaries team has collected over 600 sets of administrative boundaries from around the world, with nearly complete coverage at the ADM0 and ADM1 levels, and substantial coverage at the ADM2 level. Efforts have focused primarily on compiling as complete a collection as possible up to the ADM2 level before turning fully to finer-scale data (ADM3+), though work is already well underway for Africa, which has been a considerable focus of ongoing research at AidData and of work by many other researchers.
Having formed initially out of a need to support work at AidData, GeoBoundaries focuses heavily on developing nations. Relative to finding data for developed nations, this can be a huge challenge. After exhausting all avenues of searching for data online using existing and open sources, the team typically expands their search by reaching out to governments, NGOs, and other international organizations that may have collected or have access to boundary data. After finding and assessing a new set of boundary data, along with securing permission to make the data freely available, the team repackages it into a standardized GeoBoundaries format.
The result of hundreds of phone calls, thousands of emails, and countless hours scouring the web for files is a collection of boundary data consisting of 53 ADM0, 52 ADM1, 45 ADM2, and 16 ADM4 boundaries for 53 countries in Africa.
This collection of open boundary data is a critical component of advancing research and development efforts related to low- and middle-income countries, where reliable and open data can be hard to come by. Just as importantly, users can reallocate the time they would have spent searching for boundary data to actually doing their research.
GeoBoundaries will help users overcome the traditional difficulties of working with boundary data by compiling a free, redistributable, and fully open source set of administrative boundary datasets for every nation in the world. The team draws on a wide variety of sources to obtain the most accurate geospatial data for each country at the most granular level possible. The GeoBoundaries database will also allow anyone to take advantage of the significant number of outcome and ancillary geospatial datasets provided through GeoQuery, which will be incorporating the GeoBoundaries data as its primary source of boundary data later this year.