Beautiful Maps with MazamaSpatialPlots

Many of us have become addicted to The NY Times COVID maps: maps of US state- or county-level data colored by cases, vaccinations, per capita infections, etc. While recreating maps like these in R is possible, it is disappointingly difficult. The just-released MazamaSpatialPlots R package takes a first stab at remedying this situation.

NY Times Maps

One of the key features of the maps displayed on the NY Times website is the use of an appropriate projection for each nation or state being shown. Another feature is the use of thin, light-colored lines around states and counties. The result is both informative and attractive.

NY Times COVID hot spots

Creating Beautiful Maps in R

We would like to produce similarly attractive state- and county-level maps in R with as little effort as possible. There is a lot of state- and county-level tabular data out there that can be easily harvested, and the MazamaSpatialPlots package makes it easy to convert tables of data into attractive maps.

Here is a start-to-finish example that: 1) ingests county-level data; 2) adds required columns of state and county identifiers; and 3) creates an attractive map:

library("dplyr")  # provides %>% and mutate() used below
library("MazamaSpatialPlots")

# Ingest county level data
#   See:  https://healthinequality.org/dl/health_ineq_online_table_12_readme.pdf
URL <- "https://healthinequality.org/dl/health_ineq_online_table_12.csv"
characteristicsData <- read.csv(URL)

# Add required 'stateCode' and 'countyFIPS' variables
characteristicsData <-
  characteristicsData %>%
  dplyr::mutate(
    stateCode = stateabbrv,
    countyFIPS = MazamaSpatialUtils::US_countyNameToFIPS(stateCode, county_name),
    pUninsured2010 = puninsured2010,
    .keep = "none"
  ) 

# Create map
countyMap(
  data = characteristicsData,
  parameter = 'pUninsured2010',
  legendTitle = 'Uninsured (%)',
  title = "Percentage of population uninsured in 2010"
)

A few additional features of the package are demonstrated in the next two plots:

  • state or county level maps
  • subset by state or group of states
  • automatic calculation of the most appropriate projection
  • important attributes are configurable in the top level function calls
  • the returned graphical object is a tmap object that can be further customized with that package
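
The example above covers only countyMap(). As a hedged sketch of the first two bullets, a state-level map restricted to a few states might look like the following. The data values are invented for illustration, and the call assumes stateMap() accepts the same data, parameter, stateCode and title arguments that countyMap() does; check ?stateMap before relying on it.

```r
# Hypothetical state-level data: one row per state, values invented for illustration
stateData <- data.frame(
  stateCode = c("WA", "OR", "ID"),
  exampleRate = c(12.3, 14.1, 16.8),
  stringsAsFactors = FALSE
)

# Draw the map only when the package is installed
if (requireNamespace("MazamaSpatialPlots", quietly = TRUE)) {
  library(MazamaSpatialPlots)
  # Assumption: stateMap() takes the same core arguments as countyMap()
  print(
    stateMap(
      data = stateData,
      parameter = "exampleRate",
      stateCode = c("WA", "OR", "ID"),
      title = "Example rate for three Pacific Northwest states"
    )
  )
}
```

Passing a vector of state codes is what restricts the map to a group of states; with projection left at its default, the package picks one for the selected area.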

Details

Spatial Data

The MazamaSpatialPlots package is built on top of MazamaSpatialUtils and uses harmonized spatial datasets from that package, which must be installed. These include US Census state and county datasets that have been simplified to 1%, 2% and 5% of the original size, so you can create a national map quickly at a lower level of detail yet still show lots of detail for smaller areas if you want.

Preparing Input Data

Any data frame can be passed to the stateMap() and countyMap() functions as long as it meets the following criteria:

  • No more than one record should be present for each state or county.
  • State-level data must include a stateCode column with the 2-character postal abbreviation (the subdivision part of the ISO 3166-2 code, e.g. WA in US-WA).
  • County-level data must include a countyFIPS column with the 5-digit county FIPS code.
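
These rules are easy to check before plotting. The following is a minimal base-R sketch; checkMapData() is a hypothetical helper written for this illustration and is not part of MazamaSpatialPlots or MazamaSpatialUtils.

```r
# Check a data frame against the input rules listed above.
# idColumn is "stateCode" for state maps or "countyFIPS" for county maps.
# Returns a character vector of problems; an empty vector means the data passed.
checkMapData <- function(df, idColumn = c("stateCode", "countyFIPS")) {
  idColumn <- match.arg(idColumn)
  problems <- character(0)

  if (!idColumn %in% names(df)) {
    return(paste0("missing required column '", idColumn, "'"))
  }

  ids <- as.character(df[[idColumn]])

  # Rule 1: no more than one record per state or county
  if (anyDuplicated(ids) > 0) {
    problems <- c(problems, paste0("duplicate values in '", idColumn, "'"))
  }

  # Rules 2 and 3: two uppercase letters, or exactly five digits
  pattern <- if (idColumn == "stateCode") "^[A-Z]{2}$" else "^[0-9]{5}$"
  if (!all(grepl(pattern, ids))) {
    problems <- c(problems, paste0("malformed values in '", idColumn, "'"))
  }

  problems
}

# Invented example data
good <- data.frame(countyFIPS = c("53033", "41051"), value = c(1.5, 2.5))
bad  <- data.frame(countyFIPS = c("1001", "53033", "53033"), value = 1:3)

checkMapData(good, "countyFIPS")  # empty vector: passes
checkMapData(bad, "countyFIPS")   # reports duplicates and malformed codes
```

The pattern check only confirms the shape of each identifier; it does not confirm that a code corresponds to a real state or county.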

You can use MazamaSpatialUtils conversion functions to help with the creation of these columns.

You can use functions from readr to quickly ingest tabular data or MazamaCoreUtils::html_getTable() to scrape tabular data from a web page.
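
One pitfall when ingesting tabular data is that FIPS codes parsed as numbers lose their leading zeros (Alabama county codes start with 01), which breaks the 5-digit countyFIPS requirement. The sketch below uses base R on an inline, made-up CSV; the column names fips and value are assumptions for illustration.

```r
# Inline stand-in for a downloaded CSV; the rows are invented for illustration
csvText <- "fips,value
01001,10.5
06037,22.1
53033,8.7"

# Default parsing treats the codes as integers and drops leading zeros
asNumbers <- read.csv(text = csvText)

# Option 1: force the column to be read as character
asText <- read.csv(text = csvText, colClasses = c(fips = "character"))

# Option 2: repair codes that were already read as numbers
repaired <- sprintf("%05d", as.integer(asNumbers$fips))

asNumbers$fips  # leading zeros lost
asText$fips     # 5-character strings preserved
repaired        # zero-padded back to 5 characters
```

With readr, the equivalent of Option 1 is a col_types specification such as cols(fips = col_character()).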

There is a ton of useful data out there and our goal is to make it a very simple task to convert that data into attractive, informative maps.

Customizing Plots

The function signature for countyMap() shows top-level configurable parameters:

countyMap(
  data = NULL,
  parameter = NULL,
  state_SPDF = "USCensusStates_02",
  county_SPDF = "USCensusCounties_02",
  palette = "YlOrBr",
  breaks = NULL,
  style = ifelse(is.null(breaks), "pretty", "fixed"),
  showLegend = TRUE,
  legendOrientation = "vertical",
  legendTitle = NULL,
  conusOnly = TRUE,
  stateCode = NULL,
  projection = NULL,
  stateBorderColor = "gray50",
  countyBorderColor = "white",
  title = NULL
)
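
As the signature shows, style defaults to "fixed" whenever breaks is supplied, so class boundaries can be computed up front. The sketch below derives quintile breaks with base R. quintileBreaks() is a hypothetical helper, the numeric values are synthetic, and the map call runs only if the package is installed and the characteristicsData data frame from the earlier example exists.

```r
# Hypothetical helper: up to six boundaries that split x into equal-count classes.
# unique() guards against repeated boundaries when the data contain many ties.
quintileBreaks <- function(x) {
  unique(unname(quantile(x, probs = seq(0, 1, by = 0.2), na.rm = TRUE)))
}

# Synthetic values standing in for a column such as pUninsured2010
values <- c(5.1, 9.8, 12.4, 15.0, 18.3, 21.7, 26.9, 33.2, NA)
quintileBreaks(values)

# Draw the map only when the package and the earlier data frame are available
if (requireNamespace("MazamaSpatialPlots", quietly = TRUE) &&
    exists("characteristicsData")) {
  library(MazamaSpatialPlots)
  print(
    countyMap(
      data = characteristicsData,
      parameter = "pUninsured2010",
      breaks = quintileBreaks(characteristicsData$pUninsured2010),
      palette = "YlOrBr",
      stateBorderColor = "gray50",
      countyBorderColor = "white",
      legendTitle = "Uninsured (%)",
      title = "Percentage of population uninsured in 2010, by quintile"
    )
  )
}
```

Quantile-based breaks put roughly the same number of counties in each color class, which can reveal more structure than evenly spaced breaks when the data are skewed.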

The plotting inside MazamaSpatialPlots uses the excellent tmap package for “thematic mapping”, whose layered syntax is modeled on ggplot2, so a tremendous amount of customization can be done through tmap. Numerous examples are provided at the MazamaSpatialPlots website.

Best of luck creating beautiful maps!

0 thoughts on “Beautiful Maps with MazamaSpatialPlots”

    • As a version 0.1 release, we have chosen to default to CONUS (CONtinental US) because those are the maps that are most familiar. You can include Alaska and Hawaii with `conusOnly = FALSE` but the projection is not a familiar one. Getting Alaska and Hawaii (and Puerto Rico) tucked into the corners of the map requires manipulations that are beyond our first-pass goals.

  • Dear Jonathan,

    Hello!
    I was unable to install the package “MazamaSpatialPlots” but the other three were successfully installed (I’m using the newest versions of both R and RStudio). This is part of the message I got in RStudio:

    “————————————————————————————————————
    * installing *source* package ‘MazamaSpatialPlots’ …
    ** package ‘MazamaSpatialPlots’ successfully unpacked and MD5 sums checked
    ** using staged installation
    ** R
    ** data
    *** moving datasets to lazyload DB
    ** inst
    ** byte-compile and prepare package for lazy loading

    Error: .onLoad failed in loadNamespace() for ‘units’, details:
    call: udunits_init(path)
    error: no database found!
    Execution halted

    ERROR: lazy loading failed for package ‘MazamaSpatialPlots’
    * removing ‘C:/Users/***** Cruz/Documents/R/win-library/4.1/MazamaSpatialPlots’

    Warning in install.packages :
    installation of package ‘MazamaSpatialPlots’ had non-zero exit status
    “——————————————————————————————————————-”

    I’m a newbie in R so, if this is just a newbie mistake, I apologize.

  • Good article but I’m confused. Are there supposed to be code chunks for the “next two plots”? Am I supposed to add library(MazamaSpatialUtils) and create my own data and parameter files to get the second code group to run?

      • Ok. I’ve been to the GitHub page and started going through it. Nice stuff. I’ll let you know if I see anything. One minor point on the primary article: you based 2010 uninsured data on 2020 population data. Should there be a note in the map? I know it’s an example, but some people are pedantic about that stuff.

      • Rereading the descriptions at https://healthinequality.org/dl/health_ineq_online_table_12_readme.pdf, it is not clear how the variable named `puninsured2010` is calculated. We made an assumption that the “2010” was relevant.

        If this graphic were part of a detailed report, we would be much more careful. However, in this case, we are using the data from healthinequality.org simply because they have nice, ready-made, tabular data with interesting statistics at the county level. Our goal was simply to show that, when such data exists, it only takes a few lines of code to read it in and generate a map.

      • Understood. Never enough detail in the metadata. The figure is probably based on the 2010 decennial count and they haven’t updated the survey yet. That’s the problem with partial updates.

      • I really liked the samples you used on GitHub in “Opportunity Insights”. I first started copying the samples one by one and they errored out. Then I realized there was no library chunk. When I downloaded the entire .Rmd file, no problem. You may want to add a library chunk to the sample for NUGs like me:
        ```{r}
        library(dplyr)
        library(MazamaSpatialPlots)
        library(readr)
        ```
        But, nice job on the samples.
