Visualizing COVID19 Cases in the United States

The New York Times graphics team has been doing an incredible job of providing the public with data resources to help them better understand the pandemic. Over the past week or so they’ve produced a handful of fascinating maps and other data visualizations, including a map showing where cases are rising the fastest, a county-level view of who is and isn’t social distancing, and the map below, showing coronavirus cases by county.

The map shows the known locations of coronavirus cases by county. Circles are sized by the number of people there who have tested positive. The visualization is attributable to the New York Times, and the data used is as of 02.Apr.2020. Data is based on reports from state and local health agencies.

They have generously made their digital coverage of the virus free to everyone, and they are also providing access to some of their data. Not only does the county-level data set provide a more specific, localized view of where the outbreak is happening, but it also makes it possible to visualize how the outbreak has spread within a county over time. This was one of the main objectives of my visualization: to show how the outbreak has changed over time, and to represent accurately which localities have been hit the hardest, without visually understating or overstating the true nature of the problem.

The original NY Times visualization is itself quite good. The US Albers projection shows the comprehensive nature of the US outbreak, including cases in all 48 contiguous states, as well as in Alaska, Hawaii, and Puerto Rico. If there’s one thing that could perhaps be improved, it would be visibility into the specific counties and areas that are being hit the hardest in the Northeast. As it stands, the circle representing New York City is so large that it envelops many of the other counties in New Jersey and New York.

A county-level choropleth map would address the issue, as would a dot density map at the county level, where each dot represents an individual case. The only obstacle to the latter approach was that the data was aggregated at the county level. It needed to be disaggregated into one record per case, with each record then assigned randomized geographic coordinates within the bounds of its county.

Thankfully, I was not the first person to want to do this in Tableau. In her TC18 Map Hacking session, Sarah Battersby described a method for accomplishing this using PostgreSQL and custom SQL in Tableau, which is explained in more detail in the forum post found here.

Several years ago, Ned Harding of Alteryx also provided a rather easy-to-implement way to accomplish this within Alteryx, using the Generate Rows tool and then the ST_RandomPoint function to assign randomized coordinates within the bounds of a spatial object.

The key steps of the workflow used to produce the data for the dot density map are shown above.

Specific configurations for the tools are shown below:

For each county and each day, the workflow creates additional rows corresponding to the cumulative count of cases.
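
To make the row-generation step concrete, here is a minimal Python sketch of the same idea. It is not the Alteryx workflow itself: the function name `generate_rows`, the record layout, and the sample values are all assumptions made for illustration. Each aggregated record of county, day, and cumulative cases is expanded into one row per case.

```python
from datetime import date


def generate_rows(records):
    """Expand aggregated (county, day, cumulative_cases) records into one row per case.

    Hypothetical stand-in for the row-generation step; not the Alteryx tool.
    """
    rows = []
    for county, day, cumulative_cases in records:
        # One output row for every case counted so far in this county on this day.
        for case_number in range(1, cumulative_cases + 1):
            rows.append({"county": county, "date": day, "case_number": case_number})
    return rows


# Made-up sample values for illustration only, not real case counts.
sample = [
    ("County A", date(2020, 3, 1), 2),
    ("County A", date(2020, 3, 2), 3),
    ("County B", date(2020, 3, 2), 1),
]

expanded = generate_rows(sample)
print(len(expanded))  # 2 + 3 + 1 = 6 rows
```

Because the counts are cumulative, the row count grows with every day of data, so it is worth checking the output size before writing a file for a long date range.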

The ST_RandomPoint function is straightforward, and outputs a spatial object that can be plotted as a geometry field within Tableau.
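
For readers without Alteryx, the effect of placing a random point inside a boundary can be approximated with rejection sampling. The sketch below is an assumption-laden illustration, not a description of how ST_RandomPoint works internally. It treats a county as a single simple polygon given as a list of (x, y) vertices, whereas real county shapes may have multiple parts or holes. The helper names and the L-shaped test polygon are hypothetical.

```python
import random


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the simple polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def random_point_in_polygon(polygon, rng):
    """Draw points uniformly in the bounding box until one falls inside the polygon."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    min_x, max_x = min(xs), max(xs)
    min_y, max_y = min(ys), max(ys)
    while True:
        x = rng.uniform(min_x, max_x)
        y = rng.uniform(min_y, max_y)
        if point_in_polygon(x, y, polygon):
            return x, y


# Hypothetical L-shaped "county" used only to exercise the functions.
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
rng = random.Random(42)  # fixed seed so the sketch is repeatable
points = [random_point_in_polygon(l_shape, rng) for _ in range(200)]
print(points[:3])
```

Rejection sampling is simple but can be slow for long, thin shapes whose bounding box is mostly empty. Note also that sampling uniformly in latitude and longitude is not exactly uniform by area.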

Once I had the shapefile output, I was able to bring it into Tableau and inner join it with the US state shapefile, in Albers projection, that I produced using Mapshaper.org.

A single spatial data source created by combining the state file and the randomized point file

From there, it was simply a matter of creating a dual axis map, using the GEOMETRY spatial fields from both of the joined shapefiles.

A simple dual axis map is created on top of a transparent map layer

The final product is shown below. Click through to view and explore on Tableau Public.

The visualization includes US data from 01.Mar.2020 to 01.Apr.2020. The underlying data will be updated as it is made available by the New York Times.
