awst.in

Open Data

November 19, 2018


DRAFT


data.austintexas.gov

I don’t know exactly what year cities started making data available to the public, but I first became aware of the effort around 2014. Making raw data available to anyone who is interested is a great idea, but it takes a certain level of motivation to start to understand what you’re actually seeing. For things like hackathons, where most people are still learning to program, having a ready-to-go data source makes everything much easier.

There are utilities for generating visualizations, and entire sets can be downloaded. It’s the sort of access you’d expect an open government that’s focused on its citizens to provide.

A quick look at the most accessed data sets probably reflects programmatic access more than anything else. There doesn’t seem to be a correlation with the sets that you’d think people would find most interesting. Then again, it’s hard to say what matters to people in this age of detached, cynical voyeurism. Do people care about whether pedestrians are killed because there’s no sidewalk in their area? Or only that traffic is delayed?

A lot of activity seems to have occurred in 2015, but many of the public projects seem to have stagnated.

It should be noted that there are also for-profit players in this space, such as https://data.world

A quick thank you

Socrata seems to have made a business around hosting data and providing access for these sorts of civic engagement efforts.

Vickie O’Dell has been the public face of a lot of activity in this space, at least for me.

Understanding Data

I need to sit with a set for a while before I can reason about it. It helps to understand distributions, highs and lows, and, ideally, something about the relationship of the data to itself, as well as to the real world. It is overwhelming to think of all of the possibilities. And then finding the overlap between what you can actually show and what is interesting is headache-inducing. Maybe I’ll eventually hit the sweet spot.
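As a rough illustration of that first pass over a set, here is a minimal sketch that pulls the highs, lows, and quartiles out of one numeric column using only Python’s standard library. The sample rows and the column name days_to_close are made up for illustration and do not come from any real data set; a real export would be read from a downloaded CSV file instead of an inline string.

```python
import csv
import io
import statistics

# Hypothetical sample standing in for a downloaded CSV export.
# The column names and values are invented for illustration only.
SAMPLE_CSV = """request_id,days_to_close
1,2
2,5
3,1
4,30
5,4
6,3
"""


def summarize(rows, column):
    """Return basic distribution facts for one numeric column."""
    # Skip blank cells, which are common in public data exports.
    values = sorted(float(row[column]) for row in rows if row[column] != "")
    # quantiles(n=4) returns the three cut points: Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "count": len(values),
        "low": values[0],
        "high": values[-1],
        "mean": statistics.mean(values),
        "median": median,
        "q1": q1,
        "q3": q3,
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = summarize(rows, "days_to_close")
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this made-up sample the mean (7.5) sits well above the median (3.5), which is the kind of gap that hints at a long tail worth a closer look before charting anything.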

There are plenty of tools and guides out there to help the novice coder muck their way through this kind of data. Most of the teams I’ve worked with use Python or R.

Wes McKinney (creator of pandas) wrote a great introductory text, Python for Data Analysis, that covers a lot of the nuts and bolts of data analysis.

Why Build This?

I started building this because I am interested in static websites, or “serverless client-side apps”. Over the years they’ve developed a much nicer name — Progressive Web Apps. I was also heavily inspired by some NYT interactive data graphics around the 2012 election and wanted a chance to play with D3, a technology that grew up in their shop.

There are already a lot of tools for data collaboration and data visualization. I wanted to play with a few specific technologies and retain the flexibility to try out different stuff. That is to say, my primary motivation isn’t data journalism or research.

If I were strictly a researcher, I would likely have reached for something like:

  • Jupyter Notebooks
  • Tableau
  • Observable

and if I were a data journalist I might have used the above, or:

I don’t really know a lot about this space but most of the majors seem to be on some form of customized Django.

Resources

Texas

National

Global



By EB in Austin, Texas.