Newmark J-School Research Center

Data Story Ideas for Reporters

Example: NYC Open Data + Department of City Planning (rats!)

1. Theme

Sanitation. This can be a hot issue in NYC.  Think Pizza Rat.

 

2. Units > type

Neighborhoods are a common unit of disaggregation in NYC open data and have a more conversational feel for reporting than ZIP codes or community districts.  Because of residential segregation, using neighborhoods may reveal racial impacts of municipal policies.

 

3. Metrics > stable

Are neighborhood tabulation areas similarly sized?  I was able to answer this question directly in NYC Open Data using visualizations (below).

What about population density? If you have been to Laurelton and the Lower East Side, it is clear that neighborhoods are not similarly dense. To quantify this, I downloaded a spreadsheet from the Department of City Planning (below).

Following the steps below, I found that Neighborhood Tabulation Areas range from about 20,000 to 80,000 people with some notable outliers. (Link to visualization and list of outliers is below.)

This spreadsheet shows that Neighborhood Tabulation Areas (NTAs) range in density from 6 people per acre up to 200 people per acre, a more than thirtyfold range.  For the purpose of generating story ideas, I decided to compare sanitation-related data for NTAs with similar population and density, by selecting neighborhoods from a single quadrant in the scatterplot below.

I created the scatterplot by:

  • Opening the spreadsheet in Microsoft Excel.
  • Using copy/paste to put the density and population columns next to one another and selecting both columns.
  • Choosing insert > chart > scatter from the ribbon.

4. Units > Points

For the purpose of generating story ideas, I decided to compare sanitation-related data for NTAs with similar population and density.

To restrict my data to just these points I used filters in Excel.

  • Home > Sort and filter > filter (ctrl + shift + L).
  • Field Pop_20 > Number filters > Between > Less than or equal to 50,000 > Greater than or equal to 40,000.
  • Field PopPerAcre_20 > Number filters > Between > Less than or equal to 90 > Greater than or equal to 60.
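
The two Excel filters above can also be sketched in a few lines of Python. This is only an illustration, not the method used for this guide: the field names Pop_20 and PopPerAcre_20 come from the steps above, while the NTAName column and every value in the sample rows are made up.

```python
# Hypothetical sample rows. Pop_20 and PopPerAcre_20 are the field names from
# the filter steps; NTAName and all of the values are invented for illustration.
rows = [
    {"NTAName": "Example A", "Pop_20": 45000, "PopPerAcre_20": 75.0},
    {"NTAName": "Example B", "Pop_20": 82000, "PopPerAcre_20": 150.0},
    {"NTAName": "Example C", "Pop_20": 41000, "PopPerAcre_20": 20.0},
    {"NTAName": "Example D", "Pop_20": 50000, "PopPerAcre_20": 60.0},
]


def similar_ntas(rows, pop_range=(40_000, 50_000), density_range=(60, 90)):
    """Keep rows whose population and density both fall inside the ranges (inclusive)."""
    pop_lo, pop_hi = pop_range
    den_lo, den_hi = density_range
    return [
        r for r in rows
        if pop_lo <= r["Pop_20"] <= pop_hi
        and den_lo <= r["PopPerAcre_20"] <= den_hi
    ]


selected = similar_ntas(rows)
print([r["NTAName"] for r in selected])
```

Both bounds are inclusive, matching the "less than or equal to" and "greater than or equal to" choices in the Excel filter dialog.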

Because I couldn't easily visualize all ten neighborhoods, I used the Department of City Planning's Population FactFinder to make a map.

5. Metrics > Variables

Passed/failed rat inspections.

  • Searched in NYC Open Data for "garbage."
  • Selected "Rodent Inspection."

The data set is very large.  An Excel worksheet holds at most 1,048,576 rows, and about a million records is probably enough for a data story idea. We want to be able to export the data to Excel so we can merge it with the Excel files we already have that contain neighborhood information.

Following the steps below, I found that initial rat inspections range from 160,000 to 180,000 per year. 

  • Visualize > Create Visualization > Timeline chart.
  • Filters > Inspection_type > Initial.
  • Filters > Inspection_date > From 2008 to 2029.
  • Filters > NTA > Is not "no value."
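
For readers who prefer to count outside the portal, the same three filters can be approximated in Python. This is a rough sketch under assumptions: the field names follow the filter steps above, but the MM/DD/YYYY date format, the placeholder NTA codes, and the sample records are all hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical records. The field names follow the filters above; the date
# format and the NTA codes are assumptions made for this example.
inspections = [
    {"Inspection_type": "Initial", "Inspection_date": "03/14/2021", "NTA": "NTA-1"},
    {"Inspection_type": "Initial", "Inspection_date": "07/02/2021", "NTA": "NTA-2"},
    {"Inspection_type": "Compliance", "Inspection_date": "08/09/2021", "NTA": "NTA-1"},
    {"Inspection_type": "Initial", "Inspection_date": "01/20/2022", "NTA": ""},
    {"Inspection_type": "Initial", "Inspection_date": "05/05/2022", "NTA": "NTA-2"},
]


def initial_inspections_per_year(records, date_format="%m/%d/%Y"):
    """Count initial inspections per calendar year, skipping rows with no NTA."""
    counts = Counter()
    for rec in records:
        if rec["Inspection_type"] != "Initial":
            continue  # keep only initial inspections
        if not rec["NTA"]:
            continue  # drop rows where NTA has no value
        year = datetime.strptime(rec["Inspection_date"], date_format).year
        counts[year] += 1
    return dict(sorted(counts.items()))


per_year = initial_inspections_per_year(inspections)
print(per_year)
```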

I filtered the data to the most recent full year available (2022) and downloaded it.

In Excel I used a pivot table to calculate the 2022 pass rate.

  • Create a new column called "Pass" with the formula = IF ( [Result] = "Passed", 1, 0 )
  • Insert pivot table with "NTA" as the row labels and "Pass" as the values two times.  Once summarized by count, and once summarized by average.  The average is the pass rate.  The count is useful to get a sense of whether we are averaging 100 inspections or just 2.
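
The pivot table logic (a 1/0 "Pass" flag, then a count and an average per NTA) can be sketched in Python as follows. The sample records, the placeholder NTA codes, and the "Failed" label are hypothetical; as in the IF formula above, any result other than "Passed" is scored as 0.

```python
from collections import defaultdict

# Hypothetical inspection rows. "Failed" stands in for any non-passing result.
records = [
    {"NTA": "NTA-1", "Result": "Passed"},
    {"NTA": "NTA-1", "Result": "Failed"},
    {"NTA": "NTA-1", "Result": "Passed"},
    {"NTA": "NTA-1", "Result": "Passed"},
    {"NTA": "NTA-2", "Result": "Passed"},
    {"NTA": "NTA-2", "Result": "Failed"},
]


def pass_rates(records):
    """Return {NTA: {"count": inspections, "pass_rate": share that passed}}."""
    tallies = defaultdict(lambda: [0, 0])  # NTA -> [passes, inspections]
    for rec in records:
        passed = 1 if rec["Result"] == "Passed" else 0  # same as the IF formula
        tallies[rec["NTA"]][0] += passed
        tallies[rec["NTA"]][1] += 1
    return {
        nta: {"count": total, "pass_rate": passes / total}
        for nta, (passes, total) in tallies.items()
    }


summary = pass_rates(records)
print(summary)
```

Keeping the count next to the rate serves the same purpose as the second pivot column: it shows how many inspections each average rests on.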

I used the INDEX and MATCH formulas in Excel to merge the pass rates into the demographic data file.  You could also use the VLOOKUP formula.  If you are not comfortable with these formulas, you can copy and paste the data for the ten neighborhoods you have selected.
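
As a rough Python analogue of the INDEX and MATCH lookup, a dictionary keyed by NTA can play the role of the lookup table. The NTA codes, populations, and pass rates below are invented for illustration.

```python
# Hypothetical demographic rows and pass rates, keyed by a made-up NTA code.
demographics = [
    {"NTA": "NTA-1", "Pop_20": 45000},
    {"NTA": "NTA-2", "Pop_20": 48000},
    {"NTA": "NTA-3", "Pop_20": 42000},
]
rates = {"NTA-1": 0.75, "NTA-2": 0.5}


def merge_pass_rates(demographics, rates):
    """Attach a pass_rate to each demographic row, or None when no match exists."""
    return [{**row, "pass_rate": rates.get(row["NTA"])} for row in demographics]


merged = merge_pass_rates(demographics, rates)
for row in merged:
    print(row)
```

A neighborhood with no inspections comes back with a pass rate of None, which is the counterpart of the #N/A error Excel shows when a lookup finds no match.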

Finally I used the scatterplot feature in Excel to compare these ten neighborhoods' racial composition and rat inspection pass rates.

Story idea: Dyker Heights and East Flatbush. These neighborhoods are similar in density and income.  Why does the one with fewer Black residents pass more rat inspections?