Billion-Dollar Climate Disasters

10 minute read

Billion Dollar Climate Disasters

A vectorized PDF of this poster can be downloaded here.

License

The license for the code and the visual: CC BY-SA 4.0.

Introduction

Billion-Dollar Weather and Climate Disasters is a collection of data on climate disasters costing at least $1B US. This data is collected by NOAA National Centers for Environmental Information (NECI). This poster explores different aspects of those disasters and is intended to be educational. The visuals used in this poster was created with r with ggplot2, and enhanced with Adobe Illustrator.

This poster was originally created as part of the final project for Information Visualization (IST 719) at Syracuse University.

Simple Techniques on Poster’s Design

1. Viewing distances

There are three viewing distances that should be considered when putting together a poster:

  • A eyecatching visual or words that can be seen from far away (i.e. the back of a classroom).
  • Visuals and annotations can be seen from about half way across the room.
  • Additional details that can be seen from a close distance

Keep those three distances in mind when selecting font sizes and visual sizes.

For the Billion Dollar Climate poster, the Sunburst chart was meant to be seen from far away along with the title. The different visuals and annotations are scaled to be seen from different distances.

2. Rule of the thirds

The rule of the thirds is the concept of dividing an image into a 3-by-3 grid. The intersection of the grid lines, or a third away from each of the four sides, are where people typically focus on when first viewing an image. A poster can utilize that space to help capture the eyes of the audience.

For the Billion Dollar Climate poster, the title of the poster was placed 1/3 of the way from the bottom left hand corner.

3. Color

Keep variation of color to a minimal. Color is a very powerful tool, but it can also be very distractive. With the right combination of color, one can help the audiences focus on the key message.

For the Billion Dollar Climate poster, only three primary colors were used. However, additional gradient of color was used to generate the map.

Untouched Visuals

The untouched visuals are showcased here:

Pictogram Sunburst Bar Circular Bar Donut Map
Gallery of the untouched visuals

They were all exported as PDF, enhanced and combined in Adobe Illustrator.

R Script

Below contains the R script that were used to generate the visuals used in this poster. Please note that this script uses the fiftystater package by Will Murphy and requires RTools to compile the package.

The R Markdown file can be obtained from Github here.

Set Up

The script detach all imported packages before importing packages. This was done to ensure that the environment is “fresh”. If a package is missing, the script would download it from CRAN (or Github).

knitr::opts_chunk$set(
    warning = F,
    echo = F
)

remove(list = ls(all.names = TRUE))

detachAllPackages <- function() {
  basic.packages.blank <-  c("stats", 
                             "graphics", 
                             "grDevices",
                             "utils", 
                             "datasets", 
                             "methods", 
                             "base")
  basic.packages <- paste("package:", basic.packages.blank, sep = "")
  
  package.list <- search()[ifelse(unlist(gregexpr("package:", search())) == 1, 
                                  TRUE, 
                                  FALSE)]
  
  package.list <- setdiff(package.list, basic.packages)
  
  if (length(package.list) > 0)  for (package in package.list) {
    detach(package, character.only = TRUE)
    print(paste("package ", package, " detached", sep = ""))
  }
}

detachAllPackages()

graphics.off() # clear the plots

if (!require(fiftystater, quietly=TRUE)) {
  if (!require(devtools)) {
    install.packages('devtools')
    library(devtools)
  }
  install_github('wmurphyrd/fiftystater')
}

pkgs <- c(
  'plyr',
  'tidyverse',
  'lubridate',
  'maps',
  'png',
  'grid',
  'mapproj',
  'stringr',
  'magrittr',
  'zoo',
  'here'
  )

if (length(pkgs) > 0) {
  for (pkg in pkgs) {
    if (!require(pkg,character.only=TRUE,quietly=TRUE)) {
      install.packages(pkg)
      library(pkg,character.only=TRUE)
    }
  }
}

Creating the folder paths used for this script.

dir.create(here('output'), showWarnings = FALSE)
dir.create(here('data'), showWarnings = FALSE)

data.path <- here('data')
output.path <- here('output')

Download Data

The data needed from NOAA is split into states. A simple for loop was created to download them all.


us_abb <- c(state.abb, 'US')

noaa_files <- paste0('time-series-', us_abb, '.csv')
noaa_files <- c(noaa_files, 
                'state-freq-data.csv', 
                'state-cost-data.csv', 
                'events-US-1980-2020.csv')

for (noaa_file in noaa_files) {
  f <- file.path(data.path, noaa_file)
  if (!file.exists(f)) {
    download.file(paste0('https://www.ncdc.noaa.gov/billions/', noaa_file),
                  f, quiet = TRUE)
  }
}

Data Processing

The code block below processes the different data files and compile them into a single DataFrame.


read_noaa_cost_and_frequency <- function() {
  df.freq <- read.csv(file.path(data.path, 'state-freq-data.csv'), skip = 1)
  df.cost <- read.csv(file.path(data.path, 'state-cost-data.csv'), skip = 1)
  
  colnames(df.freq) <- str_to_title(gsub('\\.', ' ', colnames(df.freq)))
  colnames(df.freq) <- paste0(gsub(' ', '_', colnames(df.freq)), '_Count')
  
  colnames(df.cost) <- str_to_title(gsub('\\.', ' ', colnames(df.cost)))
  colnames(df.cost) <- paste0(gsub(' ', '_', colnames(df.cost)), '_Cost')
  
  df.freq %<>%
    select(-Year_Count) %>%
    rename(State = State_Count) %>%
    group_by(State) %>%
    summarize_all(sum) %>%
    mutate(Year = 'All')
    
  df.cost %<>%
    rename(State = State_Cost) %>%
    group_by(State) %>%
    summarize_all(sum)
  
  return(merge(df.freq, df.cost, 'State'))
}

read_noaa_time_series <- function(state_abb) {
  
  f <- file.path(data.path, paste0('time-series-', state_abb, '.csv'))
  
  df <- read.csv(f,
                 skip=2)
  
  colnames(df) <- gsub("\\.", "_", colnames(df))
      
  if (state_abb == 'US') {
    df %<>%
      mutate(State = state_abb) %>%
      select(c('State', 'Year',
               'Drought_Count', 'Drought_Cost', 'Drought_Lower_95', 'Drought_Upper_95',
               'Flooding_Count', 'Flooding_Cost', 'Flooding_Lower_95', 'Flooding_Upper_95',
               'Freeze_Count', 'Freeze_Cost', 'Freeze_Lower_95', 'Freeze_Upper_95',
               'Severe_Storm_Count', 'Severe_Storm_Cost', 'Severe_Storm_Lower_95', 'Severe_Storm_Upper_95',
               'Tropical_Cyclone_Count', 'Tropical_Cyclone_Cost', 'Tropical_Cyclone_Lower_95', 'Tropical_Cyclone_Upper_95',
               'Wildfire_Count', 'Wildfire_Cost','Wildfire_Lower_95', 'Wildfire_Upper_95', 
               'Winter_Storm_Count', 'Winter_Storm_Cost', 'Winter_Storm_Lower_95', 'Winter_Storm_Upper_95'
               )) %>%
      mutate(State = state_abb) %>%
      rename_all(list(~ str_replace(., '_95', '_Cost')))
  } else {
    df %<>%
      mutate(State = state_abb) %>%
      select(c('State', 'Year',
               'Drought_Count', 'Drought_Cost_Range',
               'Flooding_Count', 'Flooding_Cost_Range',
               'Freeze_Count', 'Freeze_Cost_Range',
               'Severe_Storm_Count', 'Severe_Storm_Cost_Range',
               'Tropical_Cyclone_Count', 'Tropical_Cyclone_Cost_Range',
               'Wildfire_Count', 'Wildfire_Cost_Range',
               'Winter_Storm_Count', 'Winter_Storm_Cost_Range'
               )) %>%
      separate(Drought_Cost_Range, into = c('Drought_Lower_Cost', 'Drought_Upper_Cost'), sep = '-') %>%
      separate(Flooding_Cost_Range, into = c('Flooding_Lower_Cost', 'Flooding_Upper_Cost'), sep = '-') %>%
      separate(Freeze_Cost_Range, into = c('Freeze_Lower_Cost', 'Freeze_Upper_Cost'), sep = '-') %>%
      separate(Severe_Storm_Cost_Range, into = c('Severe_Storm_Lower_Cost', 'Severe_Storm_Upper_Cost'), sep = '-') %>%
      separate(Tropical_Cyclone_Cost_Range, into = c('Tropical_Cyclone_Lower_Cost', 'Tropical_Cyclone_Upper_Cost'), sep = '-') %>%
      separate(Wildfire_Cost_Range, into = c('Wildfire_Lower_Cost', 'Wildfire_Upper_Cost'), sep = '-') %>%
      separate(Winter_Storm_Cost_Range, into = c('Winter_Storm_Lower_Cost', 'Winter_Storm_Upper_Cost'), sep = '-') %>%
      mutate_at(vars(ends_with('Cost')), list(as.integer)) %>%
      mutate(Drought_Cost = (Drought_Lower_Cost + Drought_Upper_Cost)/2,
             Flooding_Cost = (Flooding_Lower_Cost + Flooding_Upper_Cost)/2,
             Freeze_Cost = (Freeze_Lower_Cost + Freeze_Upper_Cost)/2,
             Severe_Storm_Cost = (Severe_Storm_Lower_Cost + Severe_Storm_Upper_Cost)/2,
             Tropical_Cyclone_Cost = (Tropical_Cyclone_Lower_Cost + Tropical_Cyclone_Upper_Cost)/2,
             Wildfire_Cost = (Wildfire_Lower_Cost + Wildfire_Upper_Cost)/2,
             Winter_Storm_Cost = (Winter_Storm_Lower_Cost + Winter_Storm_Upper_Cost)/2) %>%
      mutate_at(vars(ends_with('Cost')), function (x) x/1000) %>%
      select(c('State', 'Year',
               'Drought_Count', 'Drought_Cost', 'Drought_Lower_Cost', 'Drought_Upper_Cost',
               'Flooding_Count', 'Flooding_Cost', 'Flooding_Lower_Cost', 'Flooding_Upper_Cost',
               'Freeze_Count', 'Freeze_Cost', 'Freeze_Lower_Cost', 'Freeze_Upper_Cost',
               'Severe_Storm_Count', 'Severe_Storm_Cost', 'Severe_Storm_Lower_Cost', 'Severe_Storm_Upper_Cost',
               'Tropical_Cyclone_Count', 'Tropical_Cyclone_Cost', 'Tropical_Cyclone_Lower_Cost', 'Tropical_Cyclone_Upper_Cost',
               'Wildfire_Count', 'Wildfire_Cost','Wildfire_Lower_Cost', 'Wildfire_Upper_Cost',
               'Winter_Storm_Count', 'Winter_Storm_Cost', 'Winter_Storm_Lower_Cost', 'Winter_Storm_Upper_Cost'
               ))
  }

  return(df)
}

read_us_events <- function() {
  df <- read.csv(file.path(data.path, 'events-US-1980-2020.csv'), skip = 1)
  
  df %<>%
    mutate(Begin.Date = strptime(Begin.Date, format = '%Y%m%d'),
           End.Date = strptime(End.Date, format = '%Y%m%d'),
           Year = year(Begin.Date))

  df %<>%
    group_by(Year, Disaster) %>%
    summarize(Cost = sum(Total.CPI.Adjusted.Cost..Millions.of.Dollars.)/1000,
              Count = n(),
              Death = sum(Deaths)) %>%
    mutate(State = 'US',
           Region = 'United States')
  
  return(df)
}

clean_data_set_further <- function(df) {
  df.freq <- df %>%
    select('State', 'Year' | ends_with("Count")) %>%
    pivot_longer(-c('State', 'Year'), names_to = 'Disaster', values_to = 'Count') %>%
    mutate(Disaster = gsub('_Count', '', Disaster),
           Disaster = gsub('_', ' ', Disaster))
  
  df.cost <- df %>%
    select('State', 'Year' | ends_with("Cost")) %>%
    pivot_longer(-c('State', 'Year'), names_to = 'Disaster', values_to = 'Cost') %>%
    mutate(Disaster = gsub('_Cost', '', Disaster),
           Disaster = gsub('_', ' ', Disaster))
  
  df <- merge(df.freq, df.cost, 
              sort = FALSE,
              by = c('State', 'Year', 'Disaster'))
  
  states_name_to_abb <- tibble(Region = state.name) %>%
    bind_cols(tibble(State = state.abb))
  
  df <- merge(states_name_to_abb, df,
              sort = FALSE,
              by = 'State')
  
  return(df)
}

df.raw <- plyr::ldply(.data = us_abb,
                    .fun = read_noaa_time_series)

df.raw %<>% select(c('State', 'Year',
                 'Drought_Count', 'Drought_Cost',
                 'Flooding_Count', 'Flooding_Cost',
                 'Freeze_Count', 'Freeze_Cost',
                 'Severe_Storm_Count', 'Severe_Storm_Cost',
                 'Tropical_Cyclone_Count', 'Tropical_Cyclone_Cost',
                 'Wildfire_Count', 'Wildfire_Cost',
                 'Winter_Storm_Count', 'Winter_Storm_Cost'))

df.raw <- clean_data_set_further(rbind(df.raw, read_noaa_cost_and_frequency()))

df.raw$Death = 0

df.raw <- rbind(df.raw, read_us_events())

This determines the dimension of the DataFrame.

dim(df.raw)

Chart Creation

Selection of color

The hex code for the three main colors used in this poster.

primary_color = '#f4a261'
secondary_color = '#2a9d8f'
bg_color = '#264653'

Pictogram of United States’ Deaths by Climate Disaster Type


df.us.death_by_disaster <- df.raw %>%
  filter(State == 'US', 
         Year < 2020) %>%
  group_by(Disaster) %>%
  summarize(Ten_Death_per_Year = sum(Death)/400) %>%
  arrange(Ten_Death_per_Year)

fill_images <- function() {
  l <- list()
  
  for (r in 1:nrow(df.us.death_by_disaster)) {
    for (d in 1:ceiling(df.us.death_by_disaster$Ten_Death_per_Year[r])) {
      img <- readPNG(file.path(data.path, 'person.png'))
      g <- rasterGrob(img, interpolate = T)
      l <- c(l, annotation_custom(
        g,
        xmin = r-1/2, 
        xmax = r+1/2,
        ymin = d-1, 
        ymax = d))
    }
  }
  l
}

clip_images <- function() {
  l <- list()
  
  for (r in 1:nrow(df.us.death_by_disaster)) {
    l <- c(l, geom_rect(
      xmin = r-1/2, 
      xmax = r+1/2,
      ymin = df.us.death_by_disaster$Ten_Death_per_Year[r], 
      ymax = ceiling(df.us.death_by_disaster$Ten_Death_per_Year[r]),
      color = 'white', fill = 'white'))
  }
  l
}

df.us.death_by_disaster

ggplot(df.us.death_by_disaster, 
            aes(reorder(Disaster, Ten_Death_per_Year), Ten_Death_per_Year)) + 
  fill_images() + 
  clip_images() +
  scale_y_continuous(breaks=seq(0, 20, 2)) + 
  scale_x_discrete() + 
  theme_bw() + 
  theme(legend.position = 'none',
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.ticks = element_blank()) +
  coord_flip()

ggsave(file.path(output.path,'pictogram_death_by_disaster.pdf'),
       width = 10, height = 6, units = 'in')

Sunburst Chart of United States’ Frequency of Climate Disasters by Year

df.us.count_by_disaster <- df.raw %>%
  filter(State == 'US', 
         Year < 2020) %>%
  mutate(Year = as.integer(Year)) %>%
  group_by(Year) %>%
  summarize(
    Count = sum(Count)) %>%
  complete(Year = c(1980:2019)) %>%
  mutate(Count = coalesce(Count, 0),
         Decade = as.factor(paste0(as.character(Year - (Year %% 10)),'s')))

empty_bar <- 1
to_add <- data.frame(
  matrix(NA, 
         empty_bar*nlevels(df.us.count_by_disaster$Decade), 
         ncol(df.us.count_by_disaster)))
colnames(to_add) <- colnames(df.us.count_by_disaster)
to_add$Decade <- rep(levels(df.us.count_by_disaster$Decade), each=empty_bar)

df.us.count_by_disaster <- rbind(df.us.count_by_disaster, to_add)

empty_bar <- 13
to_add <- data.frame(
  matrix(NA, 
         empty_bar, 
         ncol(df.us.count_by_disaster)))
colnames(to_add) <- colnames(df.us.count_by_disaster)
to_add$Decade <- first(df.us.count_by_disaster$Decade)

df.us.count_by_disaster <- rbind(to_add, df.us.count_by_disaster)

df.us.count_by_disaster <- df.us.count_by_disaster %>% arrange(Decade)
df.us.count_by_disaster$id <- seq(1, nrow(df.us.count_by_disaster))

df.us.count_by_disaster <- df.us.count_by_disaster %>% arrange(-id)
df.us.count_by_disaster$id <- seq(1, nrow(df.us.count_by_disaster))
label_data <- df.us.count_by_disaster
number_of_bar <- nrow(label_data)
angle <- 90 - 360 * (label_data$id-0.5) /number_of_bar
label_data$hjust <- ifelse( angle < -90, 1, 0)
label_data$angle <- ifelse(angle < -90, angle+180, angle)

# Make the plot
ggplot(df.us.count_by_disaster, aes(x=as.factor(id), y=Count, fill=primary_color)) +
  geom_bar(aes(x=as.factor(id), y=Count, fill=primary_color), stat='identity') +
  ylim(-5,17) +
  theme_minimal() +
  theme(
    legend.position = "none",
    axis.text = element_blank(),
    axis.title = element_blank(),
    panel.grid = element_blank(),
    plot.margin = unit(rep(-1,4), 'cm')
  ) +
  coord_polar() +
  geom_text(data=label_data, aes(x=id, y=Count+0.5, label=as.character(Year), hjust=hjust), color='black', fontface='bold',alpha=1, size=12, angle= label_data$angle, inherit.aes = FALSE )

ggsave(file.path(output.path,'circular_pie_frequency_by_year.pdf'),
       width = 24, height = 24, units = 'in')

Bar Chart of United States’ Spend on Climate Disasters by Year

df.us.cost_per_year <- df.raw %>%
  mutate(Year = as.integer(Year)) %>%
  filter(State == 'US', 
         Year < 2020) %>%
  group_by(Year) %>%
  summarize(Cost = sum(Cost)) %>%
  complete(Year = c(1980:2019)) %>%
  mutate(Cost = coalesce(Cost, 0),
         Decade = as.factor(paste0(as.character(Year - (Year %% 10)),'s')))

df.us.cost_per_year %>% group_by(Decade) %>% summarize(Cost = mean(Cost))

ggplot(df.us.cost_per_year, aes(x = Year, y = Cost)) +
  geom_bar(stat = 'identity', fill = secondary_color) +
  geom_smooth(formula = y~x, method = 'loess', se = FALSE, color = primary_color) +
  theme_bw() +
  theme(legend.position = 'none',
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.ticks = element_blank())

ggsave(file.path(output.path,'bar_cost_per_year.pdf'),
       width = 7, height = 3, units = 'in')

Circular Bar Chart of Cost by Climate Disaster Type

The reason this was made into a circular bar chart instead of a regular bar chart is because of spacing.

max_percent = 62.5

df.us.cost_by_disaster <- df.raw %>%
  filter(State == 'US') %>%
  group_by(Disaster) %>%
  summarize(Cost = sum(Cost)) %>%
  mutate(Percent = Cost/1000*max_percent)


empty_bar <- 1
to_add <- data.frame(
  matrix(0,
         empty_bar,
         ncol(df.us.cost_by_disaster)))
colnames(to_add) <- colnames(df.us.cost_by_disaster)
to_add$Disaster <- letters[1:empty_bar]


df.us.cost_by_disaster <- rbind(df.us.cost_by_disaster, to_add)

ggplot(df.us.cost_by_disaster, aes(x = reorder(Disaster, Cost), y = Percent)) +
  geom_bar(width = 0.85, stat = 'identity', fill = secondary_color) +
  coord_polar(theta = 'y') +
  ylim(c(0,100)) +
  geom_text(data = df.us.cost_by_disaster, 
            hjust = 1, 
            size = 3, 
            aes(x = Disaster, 
                y = 0, 
                label = Disaster))+
  theme_bw() +
  theme(legend.position = 'none',
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.ticks = element_blank())

ggsave(file.path(output.path,'radial_bar_cost_by_disaster.pdf'),
       width = 5, height = 5, units = 'in')

Map of Frequency Change in Climate Disasters by State

This code block also creates a donut chart of the percentage of states that saw an increase in frequency of climate disasters.

tiles <- c(-1,0,1,2,3,12)

df.state.freq_by_decade <- df.raw %>%
  filter((!Year %in% c(2020, 'All')),
         (State != 'US')) %>%
  mutate(Year = as.integer(Year),
         Decade = Year - (Year %% 10)) %>%
  group_by(Decade, Region) %>%
  summarize(Count = sum(Count)) %>%
  filter(Decade %in% c(1980, 2010)) %>%
  spread(Decade, value = Count) %>%
  mutate(Difference = `2010` - `1980`,
         Percent_Increase = ifelse(`1980`>0,Difference/`1980`,`2010`),
         Group = ifelse(Difference > 0, 'Increase', ifelse(Difference == 0, 'No Change', 'Decrease')),
         freq_quantitle = cut(
           Percent_Increase,
           breaks = tiles,include.lowest = FALSE))

df.state.freq_change <- df.state.freq_by_decade %>%
  mutate(Group = as.factor(Group)) %>%
  group_by(Group) %>%
  summarize(Count = n()) %>%
  mutate(fraction = Count/sum(Count),
         ymax = cumsum(fraction),
         ymin = c(0, head(ymax, n=-1)),
         labelPosition = (ymax + ymin) / 2,
         label = paste0(Group, '\n', Count, ' States'))

ggplot(df.state.freq_change,
       aes(ymax=ymax,
           ymin=ymin,
           xmax=3.5,
           xmin=1.75,
           fill=as.factor(Group))) +
  geom_rect() +
  geom_text(x=5, aes(y=labelPosition, label=label, color=Group), size=6) +
  scale_fill_manual(values = c('Increase' = primary_color, 'No Change' = '#FFFFFF', 'Decrease' = secondary_color)) +
  scale_color_manual(values = c('Increase' = primary_color, 'No Change' = '#FFFFFF', 'Decrease' = secondary_color)) +
  coord_polar(theta="y") +
  xlim(c(-1, 5)) +
  theme_void() +
  theme(legend.position = "none")

ggsave(file.path(output.path,'donut_freq_change_by_state.pdf'),
       width = 8, height = 8, units = 'in')

ggplot(df.state.freq_by_decade, aes(map_id = str_to_lower(Region))) +
  geom_map(aes(fill = freq_quantitle), map = fifty_states) +
  expand_limits(x = fifty_states$long, y = fifty_states$lat) +
  scale_fill_manual(values = c('#FFFFFF', '#FDE9D9', '#FAD3B3', '#F7BC8D', primary_color)) + 
  coord_map('polyconic') +
  scale_x_continuous(breaks = NULL) +
  scale_y_continuous(breaks = NULL) +
  labs(x = '', y = '') +
  theme_minimal() +
  theme(
    axis.line = element_blank(),
    axis.text.x = element_blank(),
    axis.text.y = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    axis.ticks = element_blank(),
    legend.position = 'bottom'
  )

ggsave(file.path(output.path,'map_percent_frequency_increase_by_state.pdf'),
       width = 20, height = 10, units = 'in')