When I teach local journalists about data reporting, many express an interest in what you can do with data, but they resoundingly have the same complaint: "I don't have time for this."
I don't blame them. Journalists are under more pressure than ever these days to multitask and file stories by deadline. And data journalism, while rich in returns, seems to come with a steep learning curve.
But what if I told you there are some things you can do today that will build up to a data story tomorrow? And these things don't require any skills or knowledge about data whatsoever?
Set up some of these today, and you'll find a lot of data story options coming down the road.
Klaxon is a tool from The Marshall Project to help journalists monitor websites and track changes. It notifies you if it notices that a website has changed:
It's fairly simple to set up: go to the website you want and use the Klaxon bookmark to start tracking. Ask Steve how you can get a login.
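If you're curious what "noticing a change" means in practice, here is a minimal Python sketch of the general idea: take a fingerprint of a page's text, then compare fingerprints between visits. This is not Klaxon's actual code. The function names and the two sample snapshots are made up for illustration.

```python
import hashlib


def fingerprint(page_text: str) -> str:
    """Return a stable fingerprint of a page's text."""
    # Collapse runs of whitespace so trivial spacing changes don't count as edits.
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_text: str, current_text: str) -> bool:
    """True if the page text differs between two visits."""
    return fingerprint(previous_text) != fingerprint(current_text)


# Hypothetical snapshots of the same page on two different days.
monday = "<p>Council meeting: June 4</p>"
tuesday = "<p>Council meeting: June 11</p>"

print(has_changed(monday, monday))   # False
print(has_changed(monday, tuesday))  # True
```

A real monitor would also fetch the page on a schedule and send you the alert, which is the part Klaxon handles for you.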
Census Reporter is a news-outlet-created tool that helps journalists easily review and visualize Census data. It has built-in comparisons with other geographies (like a town to its county) and even some mapping tools. You can search for any geography level, like Edison, NJ:
You can also search by topic, then create tables to compare any geography to another. While the Census Bureau's own site is still worth learning, Census Reporter will cover you in most cases.
Go to Steve's data warehouse and Data Is Plural's spreadsheet of data.
... And Ctrl+F your subject or topic of interest.
While sources of data are everywhere around the web, it can be hard to find a comprehensive spot to start browsing datasets. Steve's data warehouse is NJ-focused, while Data Is Plural has everything from libraries to Game of Thrones. The ProPublica Data Store, Enigma Public, Kaggle and the GitHub accounts of news outlets are also good places to look.
Many larger cities and NJ agencies have their own data portal or downloads center. Data.nj.gov and https://Data.gov are two of the biggest government repositories of data, where you can find pension data and
If This Then That (IFTTT) is a phone app and website that lets you create "recipes" that combine different services into a single task. You can create alerts to tell you when someone's tweeting, posting or updating an RSS feed, sort out your own messages and -- and this is my favorite -- log activity to a Google spreadsheet:
Tabula is software that converts tables in PDFs into downloadable spreadsheets. Installing it might be a bit tricky if you're limited to an account without admin rights, but it may be worth the time with IT. While tools like Cometdocs also offer this feature, Tabula is the easiest and cheapest (free!) option.
Records requests are the bread and butter of exclusive content. We at the data team file them regularly, even if we don't yet have a particular story in mind.
- If you're looking for some inspiration, IRE has a tipsheet of 50 things you can FOIA right now.
- MuckRock is a helpful tool for writing and tracking records requests, but an OPRA folder and a little practice also work.
ProPublica has a number of news applications to help reporters do their jobs. My favorites include:
- Dollars for Docs, which tracks drug companies' payments to doctors
- Treatment Tracker: what meds has that doctor been giving out?
- Nonprofit Explorer: look up charities, universities, health systems and more
- Has your school been investigated for civil rights violations?
- The FEC Itemizer and Represent, for looking up your congressional representative
- HUD Inspect, to find out how your local housing authority maintains its affordable housing
- Miseducation: is there racial inequality at your school? Look up local schools and districts to find their racial makeup and how students of different races perform on key metrics.
Data journalists like to talk to each other, especially when an Excel problem is involved. NICAR is one of the biggest resources we have to chat and share our ideas. IRE/NICAR has video tutorials, conference tipsheets and audio, and the NICAR-L listserv where data journos ask questions and share projects throughout the year. The IRE 2018 conference has tipsheets designed for daily journalists to get into data journalism.
If you're looking to get some inspiration, I've collected a number of favorite places to see the latest data visualizations and big projects. These may not all be available to the local reporter, but they can help you see what's in the realm of possibility.
- My Twitter list (yes, shameless plug) of data, data viz and investigative outlets
- Data Is Plural for free datasets
- Rachel Schallom's Best in Visual Storytelling
- Poynter's Try This! Tools for Journalism
- 1801 newsletter
- The OpenNews community list. OpenNews in general is another great resource for data journalists.
- Sophie Warnes' Fair Warning
WordCounter lets you easily plug text data into their tool and find out what the most common words and phrases are. It can help you to find patterns or attitudes from a large amount of text, whether it's a speech, report or some song lyrics.
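For a sense of what a word counter does behind the scenes, here is a minimal Python sketch. It is not WordCounter's own code. The tiny stopword list and the sample text are assumptions made up for this example, and a real tool would use a much longer stopword list and also count phrases.

```python
import re
from collections import Counter

# A deliberately tiny, made-up stopword list for the example.
STOPWORDS = {"the", "a", "and", "of", "to"}


def top_words(text: str, n: int = 3):
    """Return the n most common words in text, ignoring case and stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word not in STOPWORDS)
    return counts.most_common(n)


# Invented sample text standing in for a speech or report.
speech = "Taxes are too high. We will cut taxes, and cutting taxes brings jobs. Jobs matter."

print(top_words(speech, 2))  # [('taxes', 3), ('jobs', 2)]
```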
DataBasic has several other tools to help reporters easily find data in spreadsheets, like:
- WTF CSV: plug in your CSV and find out the frequency of different categories in the data;
- SameDiff, which helps you figure out how similar or different two chunks of text are;
- ConnectTheDots, which creates basic network graphs between people or communities.
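The category counts that WTF CSV reports can be sketched in a few lines of Python. This is an illustration of the idea, not the tool's code. The column names and rows below are invented sample data, and `column_frequencies` is a hypothetical helper name.

```python
import csv
import io
from collections import Counter

# Invented sample data standing in for a CSV you might upload.
sample_csv = """town,violation
Town A,noise
Town B,parking
Town A,parking
Town C,parking
"""


def column_frequencies(csv_text: str, column: str) -> Counter:
    """Count how often each value appears in one column of a CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


freq = column_frequencies(sample_csv, "violation")
print(freq.most_common())  # [('parking', 3), ('noise', 1)]
```

Running the same function on the `town` column would show which places appear most often, which is often the first question to ask of a new dataset.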
I could recommend a million Excel tutorials to get you started -- my own is not the worst, and comes with a Datawrapper sequel -- but the first thing to know about data is not formulas or pivot tables. It's the care and keeping of clean data.
Check out Sandhya Kambhampati's brief tutorial on creating databases for when you're creating or logging your own data. If you're trying to make heads or tails of data from a public agency, the Quartz guide to bad data is a great place to search for your problem. Then follow ProPublica's Bulletproofing Data guide to make sure your data is strong.
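To give a flavor of what checking for clean data can look like, here is a small Python sketch that counts three common problems: blank cells, stray leading or trailing spaces, and duplicate rows. The choice of checks is my own assumption for illustration and is not taken from any of the guides above. The sample rows are invented.

```python
import csv
import io

# Invented sample data with a duplicate row, a stray space and a blank cell.
sample_csv = """name,town,amount
Ana Ruiz,Town A,1200
Ana Ruiz,Town A,1200
Bo Chen, Town B,
"""


def audit_rows(csv_text: str) -> dict:
    """Count blank cells, cells with stray whitespace, and duplicate rows."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    problems = {"blank_cells": 0, "stray_whitespace": 0, "duplicate_rows": 0}
    seen = set()
    for row in reader:
        for cell in row:
            if cell.strip() == "":
                problems["blank_cells"] += 1
            elif cell != cell.strip():
                problems["stray_whitespace"] += 1
        # Compare rows after trimming and lowercasing so near-identical rows match.
        key = tuple(cell.strip().lower() for cell in row)
        if key in seen:
            problems["duplicate_rows"] += 1
        seen.add(key)
    return problems


print(audit_rows(sample_csv))
# {'blank_cells': 1, 'stray_whitespace': 1, 'duplicate_rows': 1}
```

A count above zero doesn't always mean the data is wrong, but it tells you where to look before you start calculating.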
Got your data ready to put into that story? It's time to figure out how to show what you've got to the reader. The Financial Times has a giant interactive and accompanying explainer to help you pick which chart to use. Then get some expert advice on best design practices to keep your chart simple and readable.
But my favorite place for data viz advice is the Datawrapper blog Chartable, run by the same people who created the tool we use. They have a host of tutorials and advice for using colors, translating data, and using specific types of charts. Follow that space to build up your visual prowess and see some great chart examples created in Datawrapper.
When reporters describe their lingering fear of programming, they tend to describe coders as Cypher in "The Matrix": a strange translator who can read a long line of gibberish.
We coders do occasionally have to look at code without signposts or guidelines, and it's easier once you learn that language. But even though I'm pretty familiar with Javascript and Python, if you gave me a full chunk of complex code and didn't tell me what it was supposed to do or what part does what, I'd be as lost as you.
But we don't do that. Even if we steal -- ahem, borrow -- code from other sources, there are a couple of easy ways to see for yourself how a chunk of code gets turned into a result. For websites, there's even an easy cheat. It's called the Web Inspector.
Chrome and Firefox both have web inspectors, and you can access them the same way. Open the page you want to look at -- preferably on a nice big screen -- and right-click on the part of the site you want to inspect. It'll crop up as "Inspect" or "Inspect Element."
When you click on that, a whole lot of things crop up: the HTML, CSS and Javascript console, aka the bones, blood and skin of a webpage:
You can even edit and delete elements from the page, although it won't change how the page loads to someone else. It's fun to play around with, though. For example:
You can start by looking at your own article, or trying a well-done graphic you admire or a tool you use a lot. Confused about what an element is or does? Google it! Or search for it in W3Schools, an encyclopedia of HTML, CSS and JS code.
Google Sheets is a bit different from Excel in syntax and tools, which can make it tough to transition to if you're barely used to Excel yourself. However, it also has a ton of features that make it easy to use and can enhance your use of online data. You can:
- Scrape a webpage with ImportXML
- Quickly turn an HTML table into a spreadsheet (with ImportHTML) or get data from an online CSV (with ImportData)
- Log an RSS feed with ImportFeed, or grab JSON data from an API with ImportJSON (a custom script you add to the sheet, not a built-in function)
- Turn a Google form into a spreadsheet of responses, making it possible to analyze and chart survey data
- Automatically add suggested charts using Explore
- Use the GoogleTranslate function to translate foreign languages
- Create cute little sparklines in your table:
- Add a bit of finance data to your story with the GoogleFinance function
- Get latitude and longitude from address data with Geocode's add-on. It can also plot a basic map, although you shouldn't embed this in a story.
- If you're a bit more advanced, you can grab a special version of the data's URL and use it to put live data into a chart or interactive.
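In Sheets, pulling in a table is a one-line formula along the lines of `=IMPORTHTML(url, "table", 1)`. To show roughly what that kind of import involves, here is a minimal Python sketch that turns an HTML table into rows using only the standard library. It is an illustration, not how Google implements the function. The `TableExtractor` class and the sample table are made up for this example.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text pieces of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Invented sample page fragment with placeholder values.
sample_html = """
<table>
  <tr><th>Town</th><th>Count</th></tr>
  <tr><td>Town A</td><td>123</td></tr>
</table>
"""

parser = TableExtractor()
parser.feed(sample_html)
print(parser.rows)  # [['Town', 'Count'], ['Town A', '123']]
```

Real pages are messier (nested tables, merged cells), which is a good reason to let Sheets do the work when it can.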
If you've had trouble grasping the concepts of Excel, or want to strengthen your data journalism workflow, Workbench may be the answer. Workbench is a creation of the Columbia Journalism program designed to improve how reporters do data on a regular basis. But it has far more than Excel offers. Look at this demo of how they analyzed public data to create a chart on San Francisco's affordable housing:
The tutorial to do that is here. And one of the great things about Workbench is if you want to do the same thing, all you have to do is duplicate the workflow and change the data source to your own.
You can use Workbench to scrape websites, get Tweets or just generally do the things you'd do in Excel. You also save all of your steps, making it easy to trace your work and step back if you make an error.