5 Ways to Find Interesting Data Sets
- BuzzFeed’s Data Is Plural for a no-fuss list of interesting data sets
- Best in Visual Storytelling for the compelling stories told with data
- The Open Data Institute’s This Week in Data to keep a finger on the pulse of the open data community at large
- Enigma’s newsletter, Between Two Rows
Last month’s Between Two Rows data visualization on Migrant and Seasonal Agricultural Worker Protection Act data

2. Keep up with media that make use of data

From Bloomberg’s video game on the Demise of the American Shopping Mall to ProPublica’s release of Trump’s White House Visitor Records, cutting-edge media institutions have long used open data for meaningful storytelling. In fact, the New York Times launched a series called What’s Going On in This Graph? to better educate its readers on its data visualizations. These articles are a great place to see what can be done with data and to investigate the open data sources behind them.

The Washington Post’s visualization on the middle of nowhere using data from the Malaria Atlas Project, the Census Bureau and NASA.

3. Listen to prominent voices in the open data space

Not only is the practice of data science evolving, but more data is also being released every day. Data advocacy groups like the Sunlight Foundation, Open Knowledge Foundation, OpenCorporates, and the Open Data Institute are active in shaping the open data space. These organizations often showcase exemplary open data sets and, where transparency is lacking, put pressure on governments to improve. By following their work, you’ll be among the first to learn about newly opened data sets.

4. Request data that’s never seen the light

The Freedom of Information Act (FOIA) allows the public to request government agency documents and other data. Requesting data via a FOIA request will almost guarantee you data that has never been analysed (although it is often the ultimate test of patience). To figure out what kind of data you want to ask for from federal or state government agencies, take a peek at the FOIA advocacy group MuckRock.

Quick tip for those new to FOIA: be as specific as possible. Request the exact name of the file you want (if you know it!), the format you’d prefer it in and the date ranges you’re interested in. The more specific the request, the more likely you are to get data in return.

Enigma Public’s FOIA correspondence with the Internal Revenue Service.

5. Use metadata to your advantage

A data set accompanied by a data dictionary, or a related set of metadata describing the contents of the data set, signals that the source is serious about its data game. I often investigate other data sets released by the same source, safe in the knowledge that it holds its data sets to a high standard. I am consistently impressed by the team behind the NYC open data portal, who often provide data dictionaries in addition to the name of the data set owner, the agency that releases the data and its update frequency.

While these tricks have helped me unearth some true data treasures, I’m always on the hunt for other sources of inspiration. If you have any additional advice, do send it my way.
I work for Enigma Public, the world’s broadest collection of public data.