Revolution #15: Code Spaghetti

I happened across a nice graphic that helpfully illustrates the types of development processes we use for GeoQuery. Enter “how much engineering does my software need?”:

Different elements of GeoQuery are engineered to different standards depending on two key factors: (1) whether we believe it will be a long-term element of the program, and (2) the nature of the code. For example, changing the way boundary searches occur within the front-end is a very quick activity, so we tend not to worry too much about “doing it right” (plus, if we get it wrong, the core goal of the software – getting users good data – isn’t critically harmed). At the other extreme, the extraction scripts we use go through round after round of engineering to ensure both academic and technical rigor; these scripts may not be as complex as enterprise-grade software, but getting them right is critical to our mission.

One of the things that slows down new feature development is the importance of GeoQuery’s long-term sustainability. A great example of this is the ability to upload custom user boundaries – technically, we could implement such a feature tomorrow, but it would be a mess of code that would likely break in the future and inhibit future functionality. Doing it right is hard; doing it quick is possible, but would constrain us significantly down the road (and, if we did it wrong, could potentially break the whole system!).

The short of it: we’re always balancing the importance of new feature requests against the challenge of engineering a system that is sustainable in the long term. We wish we could just throw quick hacks at every problem, but with research replicability as a core goal, we have to be very careful about implementing any code that might put our sustainability at risk – even when it’s a feature we want ourselves!

About the author: Daniel Runfola

Dan's research focuses on the use of quantitative modeling techniques and spatial data to explain the conditions under which aid interventions succeed or fail. He specializes in computational geography, machine learning, quasi-observational experimental analyses, human-int data collection methods, and high-performance computing approaches. His research has been supported by the World Bank, USAID, the Global Environment Facility, and a number of private foundations and donors. He has published 34 books, articles, and technical reports in support of his research, and is the Project Lead of the GeoQuery project. At William and Mary, Dr. Runfola serves as the director of the Data Science Program and advises students in the Applied Science Computational Geography Ph.D. program.