Link to code

Date: April 14, 2026


Authors/Creators/Team Members: Victoria D. Lynch, Jonathan A. Sullivan, Aaron B. Flores, Xicheng Xie, Sarika Aggarwal, Rachel C. Nethery, Marianthi-Anna Kioumourtzoglou, Anne E. Nigra, and Robbie M. Parks

Specific purpose of code: This code links National Center for Health Statistics (NCHS) mortality data with Global Flood Database (GFD) flood event data from 2001 to 2018 by US county. We used the NCHS data to identify monthly total and cause-specific deaths by age group, sex, and county, and used Global Human Settlement Layer (GHSL) population data to calculate county-level flood exposure and mortality rates. We used a Bayesian formulation of the conditional quasi-Poisson model to analyze the county-level association between the number of flood events per month and monthly death rates, accounting for overdispersion in the mortality data. The conditional approach examines differences within matched strata (here, county-months), as in a case-crossover study design, which removes confounding bias due to factors that vary across strata. Bayesian inference enables the ‘borrowing of information’ across county units and full distributional estimation of the parameters of interest.
All-cause and cause-specific mortality associated with flood events likely differs by flood cause and severity; therefore, we conducted separate analyses by mortality group (all-cause, cancers, cardiovascular diseases, infectious and parasitic diseases, injuries, neuropsychiatric conditions, and respiratory diseases), flood cause (all floods, heavy rain, tropical cyclone, snowmelt, and ice jam or dam break), and flood severity (mild, moderate, severe, and very severe). Because very severe flood events were most strongly associated with increased mortality across all flood causes and mortality groups, we further analyzed associations stratified by age group (0-64 and 65+ years) and sex (female and male) for very severe floods only.
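
To illustrate the within-stratum comparison described above, here is a minimal base R sketch. It is not the authors' Bayesian INLA implementation: it swaps in a frequentist quasi-Poisson GLM with one indicator term per county-month stratum, which yields the same exposure coefficient as a conditional Poisson fit. The data are simulated, and every name and value (`panel`, `floods`, the rate ratio of 1.10) is a hypothetical assumption made for illustration.

```r
# Frequentist stand-in for the conditional quasi-Poisson model.
# All data are simulated; all names and values are hypothetical.
set.seed(1)

# One row per county, year, and calendar month.
panel <- expand.grid(
  county = sprintf("c%02d", 1:20),
  year   = 2001:2018,
  month  = 1:12,
  stringsAsFactors = FALSE
)
n <- nrow(panel)

panel$population <- 50000                    # assumed constant for simplicity
panel$floods     <- rpois(n, lambda = 0.05)  # flood events per county-month

# Simulate deaths with a seasonal baseline and a true rate ratio of 1.10
# per additional flood event.
true_log_rr  <- log(1.10)
baseline     <- 8e-4 * (1 + 0.1 * cos(2 * pi * panel$month / 12))
panel$deaths <- rpois(
  n,
  panel$population * baseline * exp(true_log_rr * panel$floods)
)

# Matched strata: the same county in the same calendar month across years.
panel$stratum <- interaction(panel$county, panel$month, drop = TRUE)

# Stratum indicators absorb everything that differs between strata;
# the quasi-Poisson family estimates a dispersion parameter.
fit <- glm(
  deaths ~ floods + stratum + offset(log(population)),
  family = quasipoisson(link = "log"),
  data   = panel
)

rate_ratio <- exp(coef(fit)[["floods"]])  # estimated rate ratio per flood event
dispersion <- summary(fit)$dispersion     # near 1 here: simulated data are Poisson
```

Because the stratum terms absorb all between-stratum differences, the `floods` coefficient is identified only from month-to-month variation within the same county and calendar month across years.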

General Application: This code links county-level flood exposure, categorized by flood type, with county-level mortality rates for the six primary causes of death in the US: cancers, cardiovascular disease, infectious and parasitic diseases, injuries, neuropsychiatric conditions, and respiratory diseases. The code could be used with any county-level health outcome and, with modification, with other county-level environmental exposures. The code categorizes exposure by flood type and severity; that step is specific to floods and would not carry over to other exposures.
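
The linkage step could be sketched as follows in base R, assuming both tables share county, year, and month keys. The table and column names (`mortality`, `floods`, `county_fips`, `flood_events`) and all values are hypothetical and are not taken from the repository.

```r
# Hypothetical county-month mortality counts with population denominators.
mortality <- data.frame(
  county_fips = c("01001", "01001", "01003"),
  year        = c(2005, 2005, 2005),
  month       = c(8, 9, 8),
  deaths      = c(40, 52, 110),
  population  = c(50000, 50000, 160000),
  stringsAsFactors = FALSE
)

# Hypothetical flood exposure table: only county-months with a flood appear.
floods <- data.frame(
  county_fips  = "01001",
  year         = 2005,
  month        = 9,
  flood_events = 1,
  stringsAsFactors = FALSE
)

# Left join: keep every mortality row, attach flood counts where present.
linked <- merge(
  mortality, floods,
  by    = c("county_fips", "year", "month"),
  all.x = TRUE
)

# County-months absent from the flood table had no recorded flood.
linked$flood_events[is.na(linked$flood_events)] <- 0

# Monthly death rate per 100,000 population.
linked$death_rate_per_100k <- linked$deaths / linked$population * 1e5
```

Swapping `floods` for another county-month exposure table with the same keys would leave the join unchanged, which is the sense in which the approach generalizes.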

How does or could this code allow researchers to assess research questions related to aging or life course?: The code is written to assess the association between flood exposure and mortality by age category; in our paper, we specifically stratified by age category (0-64 years old and 65+ years old) to examine flood exposure-related mortality among older adults. The NCHS data include individual-level age at death and would enable analyses with any subset of age groups.
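
As a rough sketch of how individual-level age at death could be grouped and counted, the base R example below uses the 0-64 and 65+ cut points from the paper. The records and column names (`age_at_death`, `county_fips`) are invented for illustration and do not reflect the NCHS file layout.

```r
# Hypothetical individual-level death records.
deaths <- data.frame(
  county_fips  = c("01001", "01001", "01001", "01001", "01003", "01003"),
  month        = c(1, 1, 1, 2, 1, 1),
  age_at_death = c(34, 71, 80, 65, 64, 90),
  stringsAsFactors = FALSE
)

# Left-closed intervals: [0, 65) is "0-64" and [65, Inf) is "65+".
# Changing `breaks` and `labels` gives any other set of age groups.
deaths$age_group <- cut(
  deaths$age_at_death,
  breaks = c(0, 65, Inf),
  labels = c("0-64", "65+"),
  right  = FALSE
)

# Count deaths per county, month, and age group.
counts <- aggregate(
  n_deaths ~ county_fips + month + age_group,
  data = transform(deaths, n_deaths = 1L),
  FUN  = sum
)
```

The resulting counts can then be analyzed separately for each age group.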

Data sets used: 

  • Population, socioeconomic, or health data:
    National Center for Health Statistics (NCHS) mortality data; Global Human Settlement Layer (GHSL) population exposure.
  • Climate, weather, disaster or environment data:
    Global Flood Database (GFD) flood event data; Dartmouth Flood Observatory (DFO) flood classification data; Parameter-elevation Regression on Independent Slopes Model (PRISM) temperature data.

Are all the data publicly available or are some restricted-access? Data on flood exposure are available without restrictions for individual flooding events. Temperature data and population data are also publicly available.

NCHS mortality data are restricted. To access the NCHS mortality data, applicants must submit a project review form (https://www.cdc.gov/nchs/data/nvss/nchs-research-review-application.pdf) to nvssrestricteddata@cdc.gov and allow four to six weeks for processing.

Links to data:

  1. National Center for Health Statistics, Mortality Data
  2. Global Flood Database and Global Human Settlement Layer (downloaded together)
  3. Parameter-elevation Regression on Independent Slopes Model (PRISM)

Coding Language: R

Tools and Packages used:

R: acs, BiocManager, dlnm, dplyr, ecm, Epi, fiftystater, foreign, fst, ggpubr, ggplot2, graph, graticule, haven, here, janitor, lubridate, mapproj, maptools, mapview, MetBrewer, pipeR, raster, RColorBrewer, readxl, rgdal, rgeos, rnaturalearth, rnaturalearthdata, scales, sf, sp, sqldf, survival, splines, table1, tidycensus, tidyverse, totalcensus, usmap, zipcodeR, zoo, INLA, Rgraphviz, fmesher

Output(s): Exploratory data analysis of flood and mortality data (maps, figures, tables), and output of statistical analysis (figures, tables)

Spatial extent: United States

Temporal extent: 2001-2018

Published papers that use this code: Lynch, Victoria D., et al. “Large floods drive changes in cause-specific mortality in the United States.” Nature Medicine 31.2 (2025): 663-671. doi: https://doi.org/10.1038/s41591-024-03358-z