
Introducing the Temperature Extremes in Europe (TEE) Datasets


Link to code

Click here

Date: November 2025


Authors/Creators/ Team Members: Sara R. Ronnkvist, Zoe Haskell-Craig, Risto Conte Keivabu, Abbie Robinson, Mathew E. Hauer, Domenico Bovienzo, Emilio Zagheni

Specific purpose of code: This GitHub repository contains a collection of scripts that generate datasets quantifying extreme temperature exposure in Europe using a variety of metrics at two sub-national spatial scales (NUTS 2 and NUTS 3) and three temporal scales (daily, extreme temperature wave, and yearly) from 1980-2024. These datasets capture the breadth of temperature metrics used in epidemiology, demography and environmental literature with 67 different metrics: including regionally-unusual temperature events (defined as temperatures above/below the 95th/5th percentile of historical temperatures) and periods of sustained (consecutive day) exposure to extreme temperatures. Additionally, these scripts can be adapted to construct temperature extremes for other geographic regions or scales with a few minor revisions. 
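The percentile-based definition of extreme days and the idea of consecutive-day waves can be sketched in a few lines. This is a minimal illustration in plain Python, not the repository's R code: the function names, the linear-interpolation percentile, and the `min_days` run length are assumptions made for the example, and the temperature values are invented.

```python
from itertools import groupby


def percentile(values, q):
    """Linear-interpolation percentile (q between 0 and 100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def flag_extremes(daily, historical):
    """Label each day 'hot' (above the 95th percentile of the historical
    series), 'cold' (below the 5th percentile), or None."""
    hot_cut = percentile(historical, 95)
    cold_cut = percentile(historical, 5)
    return ["hot" if t > hot_cut else "cold" if t < cold_cut else None
            for t in daily]


def waves(flags, kind, min_days=2):
    """Return (start_index, length) for every run of `kind` lasting at least
    `min_days` consecutive days. `min_days` is a hypothetical parameter."""
    found = []
    start = 0
    for key, run in groupby(flags):
        length = len(list(run))
        if key == kind and length >= min_days:
            found.append((start, length))
        start += length
    return found


# Invented values: a historical reference series and nine observed days.
historical = list(range(0, 101))
daily = [50, 96, 97, 98, 50, 99, 2, 1, 3]
flags = flag_extremes(daily, historical)
hot_waves = waves(flags, "hot", min_days=2)
cold_waves = waves(flags, "cold", min_days=3)
```

Here the three consecutive hot days form one wave, while the isolated hot day does not. The actual thresholds and wave definitions used in the TEE datasets are documented in Ronnkvist et al. (2025).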

General Application: The TEE datasets can be linked to any data that contain NUTS identifiers (e.g., Eurostat) using a simple merge to study the impacts of extreme temperatures on populations.
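As a rough sketch of such a merge, the plain-Python left join below matches rows on a region identifier and a year. The column names (`nuts_id`, `year`, `hot_days`, `deaths`) and the values are hypothetical; check the headers of the actual TEE files and of your own data before adapting it. A `merge` in pandas or R on the same key columns would do the same job.

```python
def merge_on_keys(left, right, keys):
    """Left-join two lists of dict rows on the given key columns.
    Rows in `left` without a match are kept unchanged."""
    index = {tuple(row[k] for k in keys): row for row in right}
    merged = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys), {})
        # Values from the left row take precedence on any shared column.
        merged.append({**match, **row})
    return merged


# Invented example rows; "nuts_id" and "hot_days" are assumed column names.
population = [
    {"nuts_id": "DE21", "year": 2020, "deaths": 100},
    {"nuts_id": "FR10", "year": 2020, "deaths": 200},
]
tee_yearly = [
    {"nuts_id": "DE21", "year": 2020, "hot_days": 12},
]
linked = merge_on_keys(population, tee_yearly, keys=("nuts_id", "year"))
```

A left join keeps every row of the population data, so regions missing from the temperature file are easy to spot afterwards.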

How does or could this code allow researchers to assess research questions related  to aging or life course?: The TEE datasets provide temperature data in a user-friendly format which can easily be linked to EuroStat or other datasets with NUTS identifiers. Additionally, our code is reproducible and easily adaptable to other geographic regions and/or time frames. Researchers who wish to study other regions may adapt our code to construct extreme temperature measures.

Data sets used: 

Are all the data publicly available or are some restricted-access? Publicly available.

Links to data: The TEE datasets are available on FigShare: https://springernature.figshare.com/articles/dataset/Temperature_Extremes_in_Europe_TEE_/28063226

Coding Language: We use Python to download the extreme temperature data from Copernicus and aggregate the hourly data to daily temperature measures. We use R for everything else.
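The hourly-to-daily aggregation step can be illustrated with a small stand-alone sketch. It is not the repository's code, which works on gridded Copernicus files with xarray; the function name, the input layout (timestamp and temperature pairs for a single location), and the values are assumptions.

```python
from collections import defaultdict
from datetime import date, datetime


def hourly_to_daily(records):
    """Collapse (ISO timestamp, temperature) pairs into one summary per day.
    Returns {date: {"mean": ..., "min": ..., "max": ...}}."""
    by_day = defaultdict(list)
    for stamp, temp in records:
        by_day[datetime.fromisoformat(stamp).date()].append(temp)
    return {
        day: {"mean": sum(temps) / len(temps), "min": min(temps), "max": max(temps)}
        for day, temps in sorted(by_day.items())
    }


# Invented hourly readings for one location.
hourly = [
    ("2024-07-01T00:00", 18.0),
    ("2024-07-01T12:00", 30.0),
    ("2024-07-02T00:00", 20.0),
]
daily = hourly_to_daily(hourly)
```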

Tools and Packages used:

R: terra, exactextractr, sf, tidyverse, foreign, zoo 

Python: os, glob, shutil, zipfile, cdsapi,  xarray, numpy 

Output(s):

16 datasets: 

  • Annual measures of extreme temperature:
      • tee_yearly_nuts2.csv
      • tee_yearly_nuts3.csv
  • Extreme temperature waves (consecutive days of temperature extremes):
      • tee_wave_nuts2.csv
      • tee_wave_nuts3.csv
  • Daily temperature measure datasets, split into 10-year files to reduce file size:
      • tee_daily_nuts2_[START YEAR]_[END YEAR].csv
      • tee_daily_nuts3_[START YEAR]_[END YEAR].csv

Spatial extent: European Union countries

Temporal extent: 1980-2024 

Comments: Ronnkvist et al. (2025) contains detailed information on how the datasets were constructed and on the included metrics. We provide replication instructions in the TEE-dataset GitHub repository (https://github.com/haskellcraigz/TEE-dataset/tree/main).

Citation: 

Ronnkvist, S.R., Haskell-Craig, Z., Robinson, A., Conte Keivabu, R., Hauer, M.E., Bovienzo, D., & Zagheni, E. (2025). What’s the TEE: Metrics of Temperature Extremes in Europe NUTS Regions (1980-2024). Scientific Data. https://doi.org/10.1038/s41597-025-05352-7

Published papers that use this code:

Scientific Data publication associated with the TEE datasets: 

Ronnkvist, S.R., Haskell-Craig, Z., Robinson, A., Conte Keivabu, R., Hauer, M.E., Bovienzo, D., & Zagheni, E. (2025). What’s the TEE: Metrics of Temperature Extremes in Europe NUTS Regions (1980-2024). Scientific Data. https://doi.org/10.1038/s41597-025-05352-7


Mapping Flooding and Population Exposure: The Global Flood Database  


Link to data

Click here

Prepared by: Kathryn Foster, Cornell University 

Date: April 2026


Original Authors: B. Tellman, J.A Sullivan, C.S Doyle, C. Kuhn, A.J Kettner, G.R Brakenridge, T.A. Erikson, D.A. Slayback 

About: The data combine satellite flood and human settlement data to identify population exposure to flooding and flood risk from 2000 onward. The data were created using NASA satellite imagery to identify and map flood events recorded by the Dartmouth Flood Observatory (DFO), which were then intersected with global watershed data (HydroSHEDS) and daily precipitation estimates (PERSIANN-CDR). The maps were then overlaid with population data from the Global Human Settlement Layer (GHSL) to derive population exposure estimates.

Data are available on: Flood events, population exposure per flooding event, population exposed per country-event, and population displaced per event. The data include 913 flood events from 169 countries from 2000 to 2018. Utilizing satellite data, this database covers 2.23 million square kilometers of inundated land and helps track flood exposure, accounting for population change in flood-prone regions.

Spatial data (GeoTIFF files) by country and flooding event can be downloaded from the interactive maps section at the bottom of the Global Flood Database website. These maps outline the flooded areas and the duration of the flood for each event, as well as impacts on displacement and casualties. The DFO estimates both the cause of the flood event and the casualties. The casualty count is the number reported by the media or the government and could be much higher or lower than the number of people estimated to be exposed from satellite data.

Population exposure per flooding event is calculated by intersecting the observed inundated flood data with the population data. The population exposed per event is reported using the GHSL population estimates for 2000 and for 2015. Tabular data in CSV format can be found in the “About the data” link at the top of the Global Flood Database website.
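The intersection described above amounts to summing the population in every grid cell that was observed as flooded. The sketch below shows that idea on two tiny grids. It assumes the flood mask and the population grid are already aligned cell for cell, which real rasters generally are not without resampling; the function name and all values are invented.

```python
def exposed_population(flood_mask, population):
    """Sum the population of every cell flagged as flooded (1) in the mask.
    Both inputs are lists of rows and must have identical shapes."""
    if len(flood_mask) != len(population):
        raise ValueError("grids have different numbers of rows")
    total = 0
    for mask_row, pop_row in zip(flood_mask, population):
        if len(mask_row) != len(pop_row):
            raise ValueError("grids have different numbers of columns")
        total += sum(p for flooded, p in zip(mask_row, pop_row) if flooded)
    return total


# Invented 2 x 3 grids: one flood mask, two population years.
flood = [[0, 1, 1],
         [0, 0, 1]]
pop_2000 = [[10, 20, 30],
            [40, 50, 60]]
pop_2015 = [[12, 25, 33],
            [41, 55, 70]]
exposed_2000 = exposed_population(flood, pop_2000)
exposed_2015 = exposed_population(flood, pop_2015)
```

Running the same flood mask against both population years shows how exposure estimates change with population growth in flooded cells even when the flood footprint is fixed.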

Although the database does not separate population exposure by age for each flooding event, researchers can still study older adults using the data. For example, researchers may combine the Global Flood Data with Census or population data of interest to assess the relationship between flood exposure and older adults (example found here).  

Detailed GIS data descriptions and methodological notes can be found here: https://storage.googleapis.com/gfd_metadata/README_GFD.pdf 

Citation: Tellman, B. et al. Satellite imaging reveals increased proportion of population exposed to floods. Nature 596, 80–86 (2021).

 


Creating “bins” for extreme temperature data and adjusting for known bias


Link to code

Click here

Prepared by: Alex Mikulas, PhD, CACHE postdoctoral associate 

Date: April 2026


Original Authors:

Benjamin Jones, Northwestern University 

Jacob Moscona, Massachusetts Institute of Technology 

Benjamin A. Olken, Massachusetts Institute of Technology  

Cristine von Dessauer, Massachusetts Institute of Technology 

Specific purpose of code: This Stata code and program take temperature data at fine temporal and spatial resolution (i.e., tract-day) and transform them into an aggregated panel dataset with temperature binned to year-place or month-place specificity. The program calculates both the realized and the expected number of days in each temperature bin. The data are then ready for use in regression and similar analyses using standard temperature bin specifications. The data better capture the role of extreme temperatures by identifying extreme temperature days outside of what would be expected in a given area, and are useful for assessing the causal role of increasing extreme temperature exposure.

General Application: Extreme temperature exposure is often operationalized in research using a binning procedure, wherein a researcher aggregates the number of extreme temperature days into binned temperature ranges. A common and serious bias can occur when using binned temperature data over time if the outcome variable of interest is associated with the baseline temperature, producing what are often called “U-shaped” results. When studying the impacts of climate change and increasing extreme temperature exposure, common binning procedures neglect the baseline temperature of a given area.

Let’s say that over a 20-year span, the average temperature increase is uniform across space. A place like Phoenix will see a large increase in extreme heat days (say, days at or above 90 degrees), while a place like Boston will see a smaller increase. This is because the baseline temperature in Phoenix 20 years ago was much warmer than the baseline temperature in Boston, so more of its days already sat just below the threshold.
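A toy calculation makes the Phoenix and Boston contrast concrete. The daily temperatures below are invented for illustration, not observed data, and the 2-degree shift and 90-degree cutoff are arbitrary choices for the example.

```python
def days_at_or_above(temps, threshold):
    """Count the days whose temperature meets or exceeds the threshold."""
    return sum(1 for t in temps if t >= threshold)


def added_extreme_days(temps, warming, threshold=90):
    """Extra threshold days produced by shifting every day up by `warming`."""
    warmed = [t + warming for t in temps]
    return days_at_or_above(warmed, threshold) - days_at_or_above(temps, threshold)


# Invented daily highs (degrees Fahrenheit) for a warm and a cool city.
warm_city = [84, 86, 88, 89, 89, 91, 93, 95]
cool_city = [70, 74, 78, 80, 83, 85, 89, 91]

# The same uniform 2-degree warming applied to both places.
warm_gain = added_extreme_days(warm_city, warming=2)
cool_gain = added_extreme_days(cool_city, warming=2)
```

In this example the warm city gains three 90-degree days and the cool city gains one, even though both warmed by the same amount, because more of the warm city's days started out just under the cutoff.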

The bias arises when a given outcome is associated with both the number of extreme heat days and the baseline temperature of a given area. If the baseline temperature of an area is closer to the bin thresholds for “extreme heat”, the outcome may be associated with the baseline temperature as well as with the increase in extreme heat days, introducing statistical bias into an analysis.

This code addresses the issue by providing the “expected” number of days in each temperature bin, as well as the observed number of days in each temperature bin. These data can then be used to control for different baseline temperatures and trends in warming for different areas. The expected and observed temperature bins allow the researcher to avoid regressing “trends on trends”, which produces the biased U-shaped results. The estimated area-year and area-month temperature bin data can be used in a wide variety of extreme temperature exposure studies.
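For readers new to binning, the observed side of this procedure can be sketched as a count of days per temperature bin for each place and year. The sketch below is in Python rather than Stata and covers only the observed counts; it does not reproduce how cftemp computes the expected counts. The bin edges, function names, and input rows are hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def bin_label(temp, edges):
    """Name the half-open bin [lower, upper) that contains `temp`.
    `edges` must be sorted in ascending order."""
    i = bisect_right(edges, temp)
    if i == 0:
        return f"<{edges[0]}"
    if i == len(edges):
        return f">={edges[-1]}"
    return f"{edges[i - 1]}-{edges[i]}"


def observed_bin_days(rows, edges):
    """Count days per (place, year, bin) from (place, year, temp) rows."""
    counts = Counter()
    for place, year, temp in rows:
        counts[(place, year, bin_label(temp, edges))] += 1
    return counts


# Invented place-day rows and hypothetical bin edges.
edges = [50, 70, 90]
rows = [("A", 2020, 95), ("A", 2020, 90), ("A", 2020, 72), ("B", 2020, 49)]
counts = observed_bin_days(rows, edges)
```

Each bin count becomes one column of the place-year panel; the expected counts produced by the Stata command would sit alongside them as matching columns.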

How does or could this code allow researchers to assess research questions related  to aging or life course?: This code can be applied to spatially and temporally specific temperature datasets to bin temperature exposure data while also accounting for varying baseline temperatures and varying trends in extreme temperature over time. Health and aging scholars can then integrate these data as a weather exposure variable and more accurately predict the impact of extreme temperature on age and health related outcomes.

Data sets used: 

  • Population, socioeconomic, or health data: code generates binned temperature data that can be integrated with any data that has spatial and temporal specificity (such as lat/long, geographic administrative identifiers, panel data, or data with observation dates).
  • Climate, weather, disaster or environment data: code is applicable to any temperature data with detailed place/day resolution.

Are all the data publicly available or are some restricted-access? NA

Links to data: NA

Coding Language: Stata 

Tools and Packages used: cftemp, a custom Stata command in an .ado file.

Output(s): Dataset of observed and counterfactual data of binned counts of extreme temperature days at the place/year or place/month level. Output comes from any user supplied, hyper-specific daily temperature datasets. 

Spatial extent: Flexible. Depends on user supplied data.

Temporal extent: Flexible. Depends on user supplied data.

Published papers that use this code: Benjamin Jones, Jacob Moscona, Benjamin A. Olken, and Cristine von Dessauer, “With or Without U? Binning Bias and the Causal Effects of Temperature Extremes,” NBER Working Paper 34671 (2026), https://doi.org/10.3386/w34671. 


Code linking NCHS mortality data with GFD flood event data


Link to code

Click here

Date: April 2026


Authors/Creators/ Team Members: Victoria D. Lynch, Jonathan A. Sullivan, Aaron B. Flores, Xicheng Xie, Sarika Aggarwal, Rachel C. Nethery, Marianthi-Anna Kioumourtzoglou, Anne E. Nigra, and Robbie M. Parks

Specific purpose of code: This code links National Center for Health Statistics (NCHS) mortality data with Global Flood Database (GFD) flood event data from 2001 – 2018 by US county. We used the NCHS data to identify monthly total and cause-specific deaths by age group, sex, and county and used Global Human Settlement Layer (GHSL) population data to calculate county-level flood exposure and mortality rates. We used a Bayesian formulation of the conditional quasi-Poisson model to analyze the county-level association between the number of flood events per month and monthly death rates, accounting for overdispersion in the mortality data. The conditional approach examines differences within matched strata (here, county-months) like a case-crossover study design, which removes confounding bias due to factors that vary across strata. Bayesian inference enables the ‘borrowing of information’ across county units and for the full distributional estimation of the parameters of interest.
All-cause and cause-specific mortality associated with flood events is likely differential by flood cause and severity; therefore, we conducted analyses separately by all-cause and cause-specific mortality (cancers, cardiovascular diseases, infectious and parasitic diseases, injuries, neuropsychiatric conditions, and respiratory diseases), flood cause (all floods, heavy rain, tropical cyclone, snowmelt, and ice jam or dam break), and flood severity (mild, moderate, severe, and very severe). Because very severe flood events were most strongly associated with increased mortality across all flood causes and mortality groups, we further analyzed associations stratified by age group (0-64 and 65+ years) and sex (female and male) for very severe floods only.
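The linkage step that precedes the modeling can be pictured as joining death counts, population denominators, and flood-event counts on a county-month key. The sketch below illustrates only that bookkeeping and a crude rate per 100,000; it does not implement the Bayesian conditional quasi-Poisson model, and it is Python rather than the repository's R. The key layout, the fixed population per county, and all values are assumptions.

```python
def monthly_rates(deaths, population, floods):
    """Build one row per county-month with a death rate per 100,000 and the
    number of flood events in that county-month (0 when none is recorded).

    deaths:     {(fips, year, month): death count}
    population: {fips: persons}   (held constant here for simplicity)
    floods:     {(fips, year, month): number of flood events}
    """
    rows = []
    for (fips, year, month), count in sorted(deaths.items()):
        rows.append({
            "fips": fips,
            "year": year,
            "month": month,
            "rate_per_100k": count * 100_000 / population[fips],
            "flood_events": floods.get((fips, year, month), 0),
        })
    return rows


# Invented inputs for a single hypothetical county.
deaths = {("01001", 2005, 8): 50, ("01001", 2005, 9): 60}
population = {"01001": 50_000}
floods = {("01001", 2005, 9): 1}
linked = monthly_rates(deaths, population, floods)
```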

General Application: This code links county-level flood exposure, categorized by flood type, with county-level mortality rates for the six primary causes of death in the US: cancers, cardiovascular disease, infectious and parasitic diseases, injuries, neuropsychiatric conditions, and respiratory diseases. The code could be used with any county-level health outcome and, with modification, with other county-level environmental exposures. The code specifically categorized exposure by flood type and severity, which would not apply to other exposures.

How does or could this code allow researchers to assess research questions related  to aging or life course?: The code is written to assess the association between flood exposure and mortality by age category; in our paper, we specifically stratified by age category (0-64 years old and 65+ years old) to examine flood exposure-related mortality among older adults. The NCHS data include individual-level age at death and would enable analyses with any subset of age groups.

Data sets used: 

  • Population, socioeconomic, or health data:
    National Center for Health Statistics (NCHS) mortality data; Global Human Settlement Layer (GHSL) population exposure.
  • Climate, weather, disaster or environment data:
    Global Flood Database (GFD) flood event data; Dartmouth Flood Observatory (DFO) flood classification data; Parameter-elevation Regression on Independent Slopes Model (PRISM) temperature data.

Are all the data publicly available or are some restricted-access? Data on flood exposure are available without restrictions for individual flooding events. Temperature data and population data are also publicly available.

NCHS mortality data are restricted. To access the NCHS mortality data, applicants must submit a project review form (https://www.cdc.gov/nchs/data/nvss/nchs-research-review-application.pdf) to nvssrestricteddata@cdc.gov and allow four to six weeks for processing.

Links to data:

  1. National Center for Health Statistics, Mortality Data
  2. Global Flood Database and Global Human Settlement Layer (downloaded together)
  3. Parameter-elevation Regression on Independent Slopes Model

Coding Language:  R 

Tools and Packages used:

R: acs, BiocManager, dlnm, dplyr, ecm, Epi, fiftystater, foreign, fst, ggpubr, ggplot2, graph, graticule, haven, here, janitor, lubridate, mapproj, maptools, mapview, MetBrewer, pipeR, raster, RColorBrewer, readxl, rgdal, rgeos, rnaturalearth, rnaturalearthdata, scales, sf, sp, sqldf, survival, splines, table1, tidycensus, tidyverse, totalcensus, usmap, zipcodeR, zoo, INLA, Rgraphviz, fmesher

Output(s): Exploratory data analysis of flood and mortality data (maps, figures, tables), and output of statistical analysis (figures, tables)

Spatial extent: United States

Temporal extent: 2001-2018

Published papers that use this code: Lynch, Victoria D., et al. “Large floods drive changes in cause-specific mortality in the United States.” Nature Medicine 31.2 (2025): 663-671. doi: https://doi.org/10.1038/s41591-024-03358-z


Processing NDVI and VIIRS vegetation data for use in population health research


Link to code

Click here

Prepared by: Alex Mikulas, PhD, CACHE postdoctoral associate 

Date: March 2026 


Original Authors: 

Finn Roberts, IPUMS Senior Data Analyst 

Rebecca Luttinen, IPUMS Global Health Data Analyst 

Devon Kristiansen, IPUMS Global Health Research Manager 

Jude Mikal, Senior Research Fellow, University of Minnesota College of Pharmacy 

Specific purpose of code: The code resources below offer a comprehensive outline for downloading, processing, aggregating, and integrating global vegetation coverage data for use in demographic and health research. Vegetation data come from the Normalized Difference Vegetation Index (NDVI), the Visible Infrared Imaging Radiometer Suite (VIIRS), and the Moderate Resolution Imaging Spectroradiometer (MODIS).

Ultimately, these resources allow users to aggregate environmental data into spatially relevant scales and integrate it into a variety of social and health data sources to better measure environmental context or exposure.  
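To make the aggregation idea concrete, the sketch below computes NDVI from near-infrared and red reflectance with the standard formula, (NIR - Red) / (NIR + Red), and then averages it within zones. It is a plain-Python illustration rather than the hub's R workflow; the zone labels and reflectance values are invented, and real rasters would be handled with a spatial library.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one cell.
    Returns None when both bands are zero and the ratio is undefined."""
    denom = nir + red
    return (nir - red) / denom if denom else None


def zonal_mean(values, zones):
    """Average the non-missing values within each zone label."""
    sums, counts = {}, {}
    for value, zone in zip(values, zones):
        if value is None:
            continue
        sums[zone] = sums.get(zone, 0.0) + value
        counts[zone] = counts.get(zone, 0) + 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Invented reflectance values for four cells in two hypothetical zones.
nir = [0.5, 0.6, 0.4, 0.0]
red = [0.1, 0.2, 0.4, 0.0]
zones = ["A", "A", "B", "B"]

cell_ndvi = [ndvi(n, r) for n, r in zip(nir, red)]
zone_ndvi = zonal_mean(cell_ndvi, zones)
```

The zone-level means are what would then be merged into survey or administrative data by a shared geographic identifier.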

The IPUMS DHS Spatial Analysis and Health Research Hub has numerous resources on using environmental data in health research. While many resources in the hub are used with DHS data integration, the data, code, and analysis resources can be altered for data integration into any spatially identified aging and health dataset.  


General Application: This code and associated resources allow researchers to build a vegetation coverage dataset that can be integrated into any individual or aggregate dataset that has temporal and spatial specificity. The data extend from 1981 to current, with 10 to 20-day increments and up to 20-meter raster resolution. 

How does or could this code allow researchers to assess research questions related to aging or life course?: This code can be used to create environmental context and exposure to greenspace and vegetation variables that can be used cross-sectionally or longitudinally, and at spatially detailed scales. It can be integrated into health surveys to provide environmental context, aggregated data to identify locations with high concentrations of aging adults and changing vegetation or greenspace, etc. In longitudinal datasets, researchers could chart an individual’s longitudinal exposure to vegetation and other relevant environmental features over the life course.  

Data sets used:  

Publicly available climate and weather data.  

Links to data (also repeated in code examples):  

Coding Language: R 

Tools and Packages used: terra, sf, dplyr, ggplot2, ggspatial, patchwork, lubridate (likely others)

Output(s): datasets, maps 

Spatial extent: Global dataset, raster data at 20 – 250-meter resolution 

Temporal extent: 1981 to current; 10-20 day increments 

Published papers that use this code:  

Moisa, M., Roba, Z., Purohit, S., Deribew, K., & Gemeda, D. (2025). Evaluating the impact of land use and land cover change on soil moisture variability using GIS and remote sensing technology in southwestern Ethiopia. Environmental Monitoring and Assessment, 197. https://doi.org/10.1007/s10661-025-14301-1 

Grace, K., Kristiansen, D., Boyle, E. H., & Luetke, M. (2023). Investigating Seasonal Agriculture, Contraceptive Use, and Pregnancy in Burkina Faso. The Professional Geographer. https://www.tandfonline.com/doi/full/10.1080/00330124.2023.2199316 

 


Resources and Data from the IPUMS DHS Spatial Analysis and Health Research Hub 


Link to data

Click here

Prepared by: Alex Mikulas, PhD, CACHE postdoctoral associate 

Date: March 2026 


About: The IPUMS DHS Spatial Analysis and Health Research Hub is designed to be a resource for researchers who are familiar with IPUMS DHS population health survey data but new to weather, environment, and disaster research that uses spatial data sources. Such resources include conceptual frameworks for environment/health research, introductions to datasets, spatial data processing, and analysis code. The code and data resources in the hub use R scripting to demonstrate basic spatial data processing techniques for integrating numerous environmental and weather-related data with social and health data in the DHS.

IPUMS Demographic and Health Surveys (IPUMS-DHS) is a database of thousands of consistently coded variables on the health and well-being of men, women, children, and births of randomly selected households in 42 African countries and 9 Asian countries. Data include records of all household members, effectively capturing social and demographic data across the life course and age groups for low- and middle-income countries.   

The guides and resources in the IPUMS-DHS Spatial Analysis and Health Research Hub are oriented toward data integration with IPUMS-DHS data. However, many scripts can be applied to any other social, health, and aging datasets that have geographic data identifiers. This includes datasets that have variables for administrative geographies (unique identifiers or spatial data polygons), respondent address or lat/long variables, or other gridded and raster datasets examining aging and health. 

To support such research, the IPUMS Global Health team received a 2023 supplemental grant from the Eunice Kennedy Shriver National Institute of Child Health and Human Development, or NICHD (3R01HD069471-12S1). 

Data are available on: Hub resources include data integration walkthroughs, dataset explanations, spatial data processes, and more. Datasets referenced include CHIRPS, CHIRTS, NDVI, VIIRS, and more. See a sample of the numerous resources below: 

Citation: IPUMS. (2026, March 16). Supporting Research on Extreme Weather and Health. IPUMS DHS Spatial Analysis and Health Research Hub. https://tech.popdata.org/dhs-research-hub/about.html 


HRS Workshop

Workshops

Thank you to everyone who participated in the October 2025 Health and Retirement Study (HRS) Workshop! 

About the workshop: Climate change is influencing human health and is particularly challenging for older adults. HRS holds tremendous potential to facilitate important research on aging-health-environment. This 1.5-day workshop introduced the HRS and reviewed examples of environmental data that can be integrated for this research.

View the full HRS Workshop agenda, with links to slide decks for each day’s overview and presentations. You can also view and download our speakers’ insightful presentations, listed alphabetically below.

Sara Adar, University of Michigan: EPOCH and the Gateway to Global Aging

Jennifer Ailshire, University of Southern California
Contextual & Environmental Data: Resources for HRS and Other Aging Surveys

Deborah Balk, Mara Sheftel, Jennifer Brite, and Na Yin, City University New York
Scorching Circumstances: The Role of Extreme Heat in Disability Among Older Workers in Heat Sensitive Jobs

Zhirui Chen, Boston College
Connections among individual- and community-level housing characteristics and disaster preparedness in a national sample of low income U.S. adults

Eun Young Choi, University of Southern California
Aging under Climate Stress: How Extreme Temperatures Shape Multi-System Biological Aging

Yanjun Dong, University at Albany
Aging, Climate, and the Social Determinants of Health: Disaster Preparedness and Inequities Among Older Adults

Jessica Finlay, University of Colorado Boulder
Contexts of Cognitive Health in the HRS

Melanie Gall, Arizona State University: Spatial Hazard Event & Loss Database for the US (SHELDUS)

Carina Gronlund, University of Michigan: Weather Resources for HRS in the Gateway to Global Aging and NaNDA

Frank W. Heiland, City University New York
Retirement and Family Demography in the Wake of Disasters

Hannah Malak, UC Santa Barbara
Heat Exposure among Older Adults by Race/ethnicity: a multi-scale investigation of thermal inequity

Xi Pan, Texas State
Environment and Cognitive Aging

Fernando Riosmena, University of Texas – San Antonio
Cumulative Disadvantage and the Aging of Mexican Immigrants in the United States

Hugh Roland, Alabama Birmingham
Climate Disaster Health Vulnerability Implications of Gulf Coast Demographic Dynamics

Amanda Sonnega, University of Michigan
HRS Overview

Jenna Tipaldo, City University New York
Mortality among disaster-exposed older adults in the US Health and Retirement Study

Roger Wong, State University of New York Upstate Medical University
Age Differences in Climate Event Exposures in a National U.S. Sample


Code analyzing population pyramids for counties exposed to Low Elevation Coastal Zones (LECZs) in Puerto Rico


Link to code (Quarto markdown version)

Click here

Link to code (Github Pages Version)

Click here

Date: December 2025


Authors/Creators/ Team Members:  Deborah Balk, Kytt MacManus, Hieu Tran, Camilla Greene, Shemontee Chowdhury, Juan F. Martinez 

Specific purpose of code: Integration of Python programming with ArcGIS API to access NASA SEDAC Low Elevation Coastal Zone (LECZ) data, IPUMS API to access U.S. Census Decennial Census data of Age and Sex groups at the Block Group and County levels, create interactive maps, find insights about the changes in population pyramid structures, and compare these changes between areas inside and outside of the Low Elevation Coastal Zone (LECZ) in Puerto Rico. 

General Application: This lesson demonstrates how to link U.S. Census data with the LECZ Merit-DEM dataset to analyze population and housing changes. It explores regional and local trends (at the county and block group levels) to highlight shifts in age groups within and outside of Low Elevation Coastal Zones (LECZ). The accompanying code enables users to explore census data at multiple geographic scales and integrate spatial environmental data to identify cohorts vulnerable to coastal flooding and observe how populations are changing in these areas. 

How does or could this code allow researchers to assess research questions related  to aging or life course?: This code could be used with the Decennial data to assess any 5-year age groups from under 5 to 85+ years of age and generate population pyramid charts for 2010 and 2020 to assess changes in age groups over time and space. 
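The comparison of pyramids across the two census years can be sketched as a change in each age-sex group's share of the total population. The example below is a simplified stand-in for the lesson's code: it uses only two age groups, invented counts, and hypothetical function names, whereas the lesson works with the full set of 5-year groups from the Decennial Census.

```python
def pyramid_shares(counts):
    """Convert {(age_group, sex): persons} into percent of total population."""
    total = sum(counts.values())
    return {key: 100 * persons / total for key, persons in counts.items()}


def share_change(counts_start, counts_end):
    """Percentage-point change in each group's share between two years."""
    start = pyramid_shares(counts_start)
    end = pyramid_shares(counts_end)
    groups = sorted(set(start) | set(end))
    return {key: end.get(key, 0.0) - start.get(key, 0.0) for key in groups}


# Invented counts for one area; real pyramids use all 5-year age groups.
counts_2010 = {("0-4", "F"): 100, ("0-4", "M"): 100,
               ("85+", "F"): 50, ("85+", "M"): 50}
counts_2020 = {("0-4", "F"): 50, ("0-4", "M"): 50,
               ("85+", "F"): 50, ("85+", "M"): 50}
change = share_change(counts_2010, counts_2020)
```

Working in shares rather than raw counts separates a shift in age structure from overall population gain or loss, which matters when comparing areas inside and outside the LECZ.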

Data sets used: 

  • Population, socioeconomic, or health data: Decennial Census Data on Age/Sex, Occupancy Status (Vacancy), Social Vulnerabilities in Community Resilience Estimates (CRE)
  • Climate, weather, disaster or environment data: Low Elevation Coastal Zone (LECZ)

Are all the data publicly available or are some restricted-access? The Community Resilience Estimates (CRE) are subject to access restrictions. The authors spoke with personnel at the U.S. Census Bureau regarding the restrictions and were advised to refer users to the first question of the Community Resilience Estimates Frequently Asked Questions. Researchers can access the data with an approved project through the Federal Statistical Research Data Centers. Those who would like to go that route can reach out to sehsd.cre@census.gov or refer to the Federal Statistical Research Data Centers.

Links to data: Community Resilience Estimates, Decennial Census of Population and Housing Data, Low Elevation Coastal Zones derived from MERIT-DEM – Overview

Coding Language: Python 

Tools and Packages used: Quarto Markdown, GitHub, pandas, NumPy, Matplotlib, ipumspy, arcgis, folium.

Output(s): Maps, Scatterplot matrix, population pyramids, summary tables 

Spatial extent: Puerto Rico 

Temporal extent: 2010-2020 


Triply robust approach to evaluate the health impacts of extreme weather events


Link to code

Click here

Date: October 2025


Authors/Creators/ Team Members:  Lingzhi Chu, Kai Chen

Specific purpose of code: This code is designed to evaluate the relationships between extreme weather events and health outcomes.

General Application: The code was first designed to evaluate the mortality risk associated with floods in the contiguous United States (https://doi.org/10.1038/s41467-025-58236-0). The code could be used with other “pulse” events (e.g., extreme weather events) or other health outcomes (e.g., hospital visits).

How does or could this code allow researchers to assess research questions related  to aging or life course?: This code could be used for any specific age group or subsets by age.

Data sets used: 

  • Population, socioeconomic, or health data: Mortality data from CDC National Center for Health Statistics.
  • Climate, weather, disaster or environment data: NOAA Storm Events Database.

Are all the data publicly available or are some restricted-access? The NOAA Storm Events Database is publicly available. The monthly county-level cause-specific mortality data are protected and are not publicly available due to data privacy laws, but can be requested from the National Center for Health Statistics (https://www.cdc.gov/nchs/index.htm).

Links to data: https://github.com/CHENlab-Yale/Flood_mortality_US

Coding Language:  R 

Tools and Packages used:  N/A

Output(s): https://doi.org/10.1038/s41467-025-58236-0

Spatial extent: No restriction

Temporal extent: No restriction

Published papers that use this code:

Chu, Lingzhi, Joshua L. Warren, Erica S. Spatz, Sarah Lowe, Yuan Lu, Xiaomei Ma, Joseph S. Ross, Harlan M. Krumholz, and Kai Chen. “Floods and cause-specific mortality in the United States applying a triply robust approach.” Nature Communications 16, no. 1 (2025): 2853.

DOI: https://doi.org/10.1038/s41467-025-58236-0


Aging under Climate Stress: How Extreme Temperatures Shape Multi-System Biological Aging


Investigators:

Eun Young Choi, Jennifer A. Ailshire, and Eileen M. Crimmins

Funding:

NIA R61AG086854 (CACHE)

Data sources:

  • Health and Retirement Study (HRS). This project utilizes sensitive and restricted data from the HRS. Sensitive data include biomarker measures used to calculate our primary outcome, biological age, drawn from the 2016 Venous Blood Study and the 2014 and 2016 Biomarker data files. Restricted data include respondents’ geographic identifiers and contextual data from the HRS Contextual Data Resource products. Researchers must apply for access to these datasets through the HRS website [https://hrs.isr.umich.edu].
  • GridMet. Daily meteorological variables (e.g., temperature, humidity, wind speed) for the contiguous US [https://climatologylab.org/gridmet.html].
  • Environmental Protection Agency. Daily concentrations of PM₂.₅ and ozone (O₃) across the contiguous US [https://epa.gov/hesc/rsig-related-downloadable-data-files].
  • Centers for Disease Control and Prevention. Social Vulnerability Index that ranks US census tracts based on 15 social factors that may adversely affect communities that encounter disasters [https://atsdr.cdc.gov/place-health/php/svi/index.html].
  • National Neighborhood Data Archive (NaNDA). US census tract-level cooling/warming amenities (e.g., public buildings and private low-cost businesses) and the National Land Cover Database [https://nanda.isr.umich.edu].

Measures:

  • Health Measures: Biological age is estimated using the “Expanded Biological Age” measure, based on 22 clinically relevant blood-based biomarkers. This measure captures functioning across physiological systems (e.g., cardiovascular, metabolic, renal, immune). Biological age is further regressed on chronological age, and the residual is used as an indicator of biological age acceleration (value > 0) or deceleration (< 0), expressed in years.
  • Aging Measures: The analysis includes US adults aged 56 years and older. We include chronological age as a covariate in the model.
  • Climate Measures: Number of extreme heat and cold days; more details below.
  • Source of Susceptibility Measures: We draw on multiple datasets to capture potential socioeconomic factors contributing to susceptibility. At the community level, variables include the Social Vulnerability Index, neighborhood social capital, availability of cooling and warming amenities, and land cover characteristics. At the familial or individual level, measures include household income and wealth, educational attainment, personal social networks, health-related behaviors, and housing type and physical conditions.
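The residual-based definition of biological age acceleration in the Health Measures item can be illustrated with a minimal sketch. It assumes a simple linear regression of biological age on chronological age; the function name `age_acceleration` and the sample values are hypothetical, and the project's actual model specification may differ.

```python
import numpy as np


def age_acceleration(bio_age, chrono_age):
    """Return residuals (in years) of biological age regressed on chronological age.

    Positive values suggest acceleration, negative values deceleration.
    Assumes a simple linear fit; this is an illustration, not the project's model.
    """
    bio_age = np.asarray(bio_age, dtype=float)
    chrono_age = np.asarray(chrono_age, dtype=float)
    # Ordinary least squares fit: bio_age ~ intercept + slope * chrono_age
    slope, intercept = np.polyfit(chrono_age, bio_age, deg=1)
    predicted = intercept + slope * chrono_age
    return bio_age - predicted


# Hypothetical values for five respondents (not real HRS data)
chrono = np.array([58.0, 63.0, 70.0, 77.0, 85.0])
bio = np.array([60.5, 61.0, 73.0, 75.5, 88.0])
accel = age_acceleration(bio, chrono)
```

By construction, the residuals average to zero across the sample and are uncorrelated with chronological age, so the measure reflects how much older or younger a person is biologically relative to same-aged peers.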

Project Summary:

Extreme heat and cold are increasingly associated with morbidity and mortality in aging populations. However, little is known about how these exposures affect biological aging, an important process that precedes the onset of chronic diseases and functional decline. The physiological burden of temperature extremes may not manifest immediately as clinical conditions but instead silently accelerate biological deterioration. Thus, examining biological aging and system-specific damage at an intervenable stage holds substantial public health significance for mitigating long-term health risks from climate stress. Animal studies provide strong evidence that heat and cold stress induce biological decline linked to aging. However, it remains unclear whether these well-characterized biological impacts of heat and cold in model systems translate to human populations. Existing human studies are often restricted to small, regionally selective samples, lacking sociodemographic and geographic representation.

This project is among the first to examine how outdoor extreme temperatures are associated with biological age acceleration measured with 22 biomarkers. Leveraging data from the nationally representative sample of US community-dwelling older adults, we examine whether older adults living in areas with more extreme heat or cold days have greater biological age acceleration and identify which physiological systems are most affected. We assess short-term exposures over the 7 days prior to blood collection for acute biological responses and long-term exposures over a 10-year period for the effects of chronic climatic conditions on biological age acceleration. To disentangle system-specific effects, we also test associations across each biomarker reflecting components (e.g., cardiovascular, immune) of biological age. Further, we identify high-risk subgroups through a novel neural network–based model (regression-guided neural networks; ReGNN [https://doi.org/10.48550/arXiv.2409.13205]) developed by our team. ReGNN complements traditional regression by capturing complex interactions across multiple sources of heterogeneity and generates individual-level susceptibility scores.

On the creation of the weather variables:

Outdoor temperature exposure is measured using the Heat Index and Wind Chill Index, which combine air temperature with humidity or wind speed, respectively, to reflect temperature as experienced by the human body. Because no single epidemiologic threshold is universally accepted, we adopt a multi-pronged approach to defining “extreme heat” and “extreme cold.” First, we apply absolute thresholds from the National Weather Service: for heat, ≥80°F (Caution), ≥90°F (Extreme Caution), ≥103°F (Danger); for cold, ≤0°F (Mild), ≤-10°F (Moderate), ≤-25°F (Severe). Second, we apply relative thresholds to account for physiological adaptations and regional acclimatization. Extreme days will be defined as those exceeding the 90th, 95th, or 99th percentile (heat) or falling below the 10th, 5th, or 1st percentile (cold) of tract-specific historical index values (1979-1999), constrained to biologically relevant thresholds (Heat Index ≥79°F; Wind Chill ≤20°F). For each participant, we calculate the annual mean number of extreme days for two periods: (1) short-term exposure, from the day of blood collection (BC-day) back through the prior 7 days; (2) long-term exposure, from BC-day back through the prior 10 years. Residential mobility will be accounted for using respondents’ cross-wave census tract identifiers verified from biennial HRS surveys and self-reported moving month/year data.
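As a rough illustration of the two threshold types for heat, the sketch below counts extreme heat days in an exposure window using (a) the absolute National Weather Service cutoffs and (b) a percentile of a tract's historical Heat Index values combined with the 79°F floor. The function names, the synthetic baseline, and the 7-day window are hypothetical, and reading "constrained to" as "a day must exceed the percentile and also meet the floor" is an assumption rather than the project's documented rule.

```python
import numpy as np

# Absolute Heat Index cutoffs (°F) as listed in the text
NWS_HEAT_F = {"caution": 80.0, "extreme_caution": 90.0, "danger": 103.0}
# Floor applied to relative heat thresholds (°F), from the text
HEAT_FLOOR_F = 79.0


def count_absolute_heat_days(heat_index_f, threshold_f):
    """Count days with Heat Index at or above an absolute threshold."""
    hi = np.asarray(heat_index_f, dtype=float)
    return int(np.sum(hi >= threshold_f))


def count_relative_heat_days(heat_index_f, baseline_f, pct):
    """Count days above the pct-th percentile of the tract's baseline values.

    Assumption: a day also has to reach HEAT_FLOOR_F to count as extreme.
    """
    cutoff = np.percentile(np.asarray(baseline_f, dtype=float), pct)
    hi = np.asarray(heat_index_f, dtype=float)
    return int(np.sum((hi > cutoff) & (hi >= HEAT_FLOOR_F)))


# Synthetic stand-in for a tract's 1979-1999 daily Heat Index values
baseline = np.arange(40.0, 101.0)
# Hypothetical daily Heat Index values for one short-term exposure window
window = np.array([78.0, 85.0, 91.0, 96.0, 99.0, 104.0, 70.0])

absolute_counts = {
    name: count_absolute_heat_days(window, cutoff)
    for name, cutoff in NWS_HEAT_F.items()
}
relative_p95 = count_relative_heat_days(window, baseline, 95)
```

Cold days would follow the same pattern with the inequalities reversed, using the Wind Chill Index, the lower percentiles, and the 20°F ceiling.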

Outputs:

Conference presentations, peer-reviewed publications, grant proposals, and documented code to integrate HRS datasets relevant to this work.
