Introducing the Temperature Extremes in Europe (TEE) Datasets

Link to code: https://github.com/haskellcraigz/TEE-dataset/tree/main

Date: November 2025


Authors/Creators/Team Members: Sara R. Ronnkvist, Zoe Haskell-Craig, Risto Conte Keivabu, Abbie Robinson, Mathew E. Hauer, Domenico Bovienzo, Emilio Zagheni

Specific purpose of code: This GitHub repository contains a collection of scripts that generate datasets quantifying extreme temperature exposure in Europe, using a variety of metrics at two sub-national spatial scales (NUTS 2 and NUTS 3) and three temporal scales (daily, extreme temperature wave, and yearly) from 1980 to 2024. The datasets capture the breadth of temperature metrics used in the epidemiology, demography, and environmental literature with 67 different metrics, including regionally unusual temperature events (defined as temperatures above the 95th or below the 5th percentile of historical temperatures) and periods of sustained (consecutive-day) exposure to extreme temperatures. Additionally, these scripts can be adapted to construct temperature extremes for other geographic regions or scales with a few minor revisions.
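To make the two kinds of metrics described above concrete, here is a minimal Python sketch of a percentile-based threshold and a consecutive-day "wave" search. It is not taken from the TEE repository: the function names, the sample temperatures, and the two-day minimum wave length are assumptions chosen for illustration, and the paper should be consulted for the definitions actually used.

```python
from itertools import groupby


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def find_heat_waves(daily_temps, threshold, min_length=2):
    """Return (start_index, length) for each run of consecutive days above threshold."""
    waves = []
    index = 0
    for is_hot, run in groupby(daily_temps, key=lambda t: t > threshold):
        length = len(list(run))
        if is_hot and length >= min_length:
            waves.append((index, length))
        index += length
    return waves


# Hypothetical historical reference temperatures (deg C) for one region.
historical = list(range(19, 40))
hot_threshold = percentile(historical, 95)

# Hypothetical daily series classified against the regional threshold.
# min_length=2 is an assumed definition of "sustained" exposure.
daily = [30, 39, 40, 41, 29, 39, 28, 40, 40]
waves = find_heat_waves(daily, hot_threshold)
```

The same two functions would cover cold extremes by computing the 5th percentile and flipping the comparison.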

General Application: The TEE datasets can be linked to any data that contain NUTS identifiers (e.g., Eurostat) using a simple merge to study the impacts of extreme temperatures on populations.
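As a rough illustration of that merge, the sketch below joins a temperature extract to an outcome table on a shared NUTS identifier and year using only the Python standard library. The column names (`nuts_id`, `year`, `hot_days`, `deaths`) and all values are invented for the example; the real TEE files should be checked for their actual column names.

```python
import csv
import io

# Hypothetical extracts; column names and values are invented for illustration.
tee_csv = """nuts_id,year,hot_days
DE21,2020,12
DE21,2021,9
FR10,2020,15
"""
outcome_csv = """nuts_id,year,deaths
DE21,2020,1040
FR10,2020,2210
FR10,2021,2180
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


# Index the temperature rows by the shared key (NUTS identifier, year).
tee_by_key = {(r["nuts_id"], r["year"]): r for r in read_rows(tee_csv)}

# Left join: keep every outcome row and attach the temperature metric when present.
merged = []
for row in read_rows(outcome_csv):
    match = tee_by_key.get((row["nuts_id"], row["year"]), {})
    merged.append({**row, "hot_days": match.get("hot_days")})
```

Rows without a temperature match keep `hot_days` as `None`, which makes unmatched region-years easy to spot before analysis. In a data-frame library the same step is typically a single merge on the two key columns.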

How does or could this code allow researchers to assess research questions related to aging or life course?: The TEE datasets provide temperature data in a user-friendly format that can easily be linked to Eurostat or other datasets with NUTS identifiers. Additionally, our code is reproducible and easily adaptable to other geographic regions and/or time frames, so researchers who wish to study other regions may adapt it to construct extreme temperature measures.

Data sets used: 

Are all the data publicly available or are some restricted-access? Publicly available.

Links to data: The TEE datasets are available on Figshare: https://springernature.figshare.com/articles/dataset/Temperature_Extremes_in_Europe_TEE_/28063226

Coding Language: We use Python to download the extreme temperature data from Copernicus and aggregate the hourly data to daily temperature measures. We use R for everything else.
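The hourly-to-daily aggregation step can be pictured with the short sketch below. It is not the repository's code (which works with xarray on Copernicus files); it uses plain Python and invented hourly readings to show the idea of grouping by calendar date and taking the minimum, mean, and maximum. How days are delimited (UTC versus local time) is left as an assumption here.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical hourly records: (ISO timestamp, temperature in deg C).
hourly = [
    ("2024-07-01T00:00", 18.0),
    ("2024-07-01T12:00", 30.0),
    ("2024-07-01T18:00", 24.0),
    ("2024-07-02T00:00", 20.0),
    ("2024-07-02T12:00", 34.0),
]

# Group readings by calendar date (timestamps are taken at face value).
by_day = defaultdict(list)
for stamp, temp in hourly:
    by_day[datetime.fromisoformat(stamp).date()].append(temp)

# Collapse each day's readings into daily summary measures.
daily = {
    day.isoformat(): {"tmin": min(temps), "tmean": mean(temps), "tmax": max(temps)}
    for day, temps in sorted(by_day.items())
}
```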

Tools and Packages used:

R: terra, exactextractr, sf, tidyverse, foreign, zoo 

Python: os, glob, shutil, zipfile, cdsapi, xarray, numpy

Output(s):

16 datasets: 

  • Annual measures of extreme temperature:
    tee_yearly_nuts2.csv
    tee_yearly_nuts3.csv
  • Extreme temperature waves (consecutive days of temperature extremes):
    tee_wave_nuts2.csv
    tee_wave_nuts3.csv
  • Daily temperature measures, split into 10-year files to reduce file size:
    tee_daily_nuts2_[START YEAR]_[END YEAR].csv
    tee_daily_nuts3_[START YEAR]_[END YEAR].csv

Spatial extent: European Union countries

Temporal extent: 1980-2024 

Comments: Ronnkvist et al. (2025) contains detailed information on how the datasets were constructed and on the included metrics. We provide replication instructions in the TEE-dataset GitHub repository (https://github.com/haskellcraigz/TEE-dataset/tree/main).

Citation: 

Ronnkvist, S.R., Haskell-Craig, Z., Robinson, A., Conte Keivabu, R., Hauer, M.E., Bovienzo, D., & Zagheni, E. (2025). What’s the TEE: Metrics of Temperature Extremes in Europe NUTS Regions (1980-2024). Scientific Data. https://doi.org/10.1038/s41597-025-05352-7

Published papers that use this code:

Scientific Data publication associated with the TEE datasets: 

Ronnkvist, S.R., Haskell-Craig, Z., Robinson, A., Conte Keivabu, R., Hauer, M.E., Bovienzo, D., & Zagheni, E. (2025). What’s the TEE: Metrics of Temperature Extremes in Europe NUTS Regions (1980-2024). Scientific Data. https://doi.org/10.1038/s41597-025-05352-7

What am I reading? Aligning the Exposome and Health Equity with Brain Aging amidst a Changing Climate  

Link to article

If you’re interested in contributing a short What Am I Reading post, we’d love to hear from you! Email us at cache@colorado.edu

Written by Kelly Perry and Jenna Merenstein

When people think about brain health and aging, they might first think about the role of genetics or lifestyle factors. Both factors are important, though a growing body of research shows something else is also playing a major role in shaping how our brains age: the physical, social, and structural exposures we experience across our lives—from before birth through older age. Scientists refer to this as the exposome, and understanding it is key to addressing cognitive decline in a changing climate (Li et al 2025). In our recent Alzheimer’s & Dementia Perspective, we argued that understanding both healthy aging and Alzheimer’s disease and related dementias (ADRDs) requires an exposome- and equity-centered lens that considers the environmental and social conditions (including systemic inequities and systems-level resilience strategies) that shape brain health. We include the framework introduced in our paper in Figure 1. 

 

Figure 1. An equity- and justice-centered framework for linking the exposome and neurocognitive health across the life course (from Perry KE & Merenstein J, 2026) 

A recent Nature Medicine paper by Legaz et al., titled “The Exposome of Brain Aging across 34 Countries,” provides a compelling example of how this framework can be applied in practice. The authors examined 73 physical and social exposomal factors (e.g., air pollution, temperature, greenspace, gender equality, democracy, and access to drinking water) and their relation to neuroimaging measures of brain structure and brain function for 18,701 participants who varied in cognitive status (cognitively normal, mild cognitive impairment, Alzheimer’s disease, and frontotemporal lobar degeneration). Participants were recruited from 34 countries across Latin America, North America, Europe, Asia, Africa, and Oceania, thereby helping address the systemic underrepresentation of low- and middle-income countries (LMICs) in neuroimaging research.

Legaz et al. demonstrated that aggregated exposome models explained significantly more variance in brain aging than any single exposure alone—up to 15.5 times more than individual factors. Furthermore, their results highlight the impact that structural inequities embedded within neighborhoods, institutions, and political systems have on brain aging: physical exposures (e.g., higher air pollution, reduced greenspace, and extreme temperatures) were strongly associated with accelerated structural brain aging. In contrast, social exposures (e.g., poverty, weaker rule of law, and decreased civic participation) were associated with accelerated functional brain aging. The latter was shown to be especially true for female participants in the study, where reduced rights-related factors and poor soil and water quality predicted accelerated aging in females more than males. Their findings underscore a central argument of our Perspective: that environmental neuroscience needs to overcome the siloed approaches that are traditional hallmarks of the discipline, e.g., focusing on a single exposure or a single high-income country cohort. 

Tools such as the Area Deprivation Index (ADI) and the Neighborhood Atlas are helpful in elucidating the structural, systemic inequities that adversely impact brain aging and increase the risk of ADRDs. While Legaz et al. used country-level indicators, neighborhood-level metrics such as ADI can provide finer resolution for studying how disadvantage shapes brain health within countries and across communities. ADI captures social dimensions such as income, education, employment, and housing quality, allowing researchers to better characterize cumulative disadvantage at the local scale. Similarly, the Neighborhood Atlas (which hosts the ADI) provides standardized geospatial measures of neighborhood deprivation that can be linked to neuroimaging, cognition, and dementia outcomes (Hunt et al., 2020; Kim et al., 2024). These tools help operationalize the “social exposome” by connecting place-based inequities to measurable differences in brain structure and function.

Building better data in this field also means improving access to neuroimaging itself. As we discussed in our Perspective, one major barrier to exposome- and equity-informed neuroscience is that advanced MRI remains inaccessible in many under-resourced and rural settings. Emerging low-field MRI initiatives, including recent work at Cardiff University, are helping address this gap by developing lower-cost, portable systems capable of whole-brain imaging. These efforts are particularly important for LMICs, where the burden of environmental exposures is often highest but access to imaging infrastructure is lowest. Legaz et al.’s multi-country study underscores why this matters: without broader imaging access, we risk building evidence only from the least exposed populations.  

Protecting brain health equitably requires shifting toward prevention and systems change. Policies that promote cleaner air and water, safer housing, and equitable urban design can reduce harmful exposures across the life course. Investments in expanding access to green spaces, pollution control, and community infrastructure benefit physical health and cognitive resilience. Legaz et al.’s study provides strong empirical support for the notion that environmental and social conditions fundamentally shape brain aging outcomes. As climate change intensifies these exposures globally, building inclusive datasets (e.g., The Neighborhood Atlas), improving access to neuroimaging (e.g., investing in low-field scanners), and centering health equity in brain health research and policy will be essential for ensuring healthy cognitive aging for communities worldwide. 

Mapping Flooding and Population Exposure: The Global Flood Database  

Link to data

Prepared by: Kathryn Foster, Cornell University 

Date: April 2026


Original Authors: B. Tellman, J.A. Sullivan, C.S. Doyle, C. Kuhn, A.J. Kettner, G.R. Brakenridge, T.A. Erickson, D.A. Slayback

About: The data combine satellite flood and human settlement data to identify population exposure to flooding and flood risk from 2000 onward. The data were created using NASA satellite imagery to identify and map flood events recorded by the Dartmouth Flood Observatory (DFO), which were then intersected with global watershed data (HydroSHEDS) and daily precipitation estimates (PERSIANN-CDR). The maps were then overlaid with population data from the Global Human Settlement Layer (GHSL) to derive population exposure estimates.

Data are available on: Flood events, population exposure per flooding event, population exposed per country-event, and population displaced per event. The data include 913 flood events from 169 countries from 2000 to 2018. Utilizing satellite data, this database covers 2.23 million square kilometers of inundated land and helps track flood exposure, accounting for population change in flood-prone regions.

Spatial data (GeoTIFF files) by country and flooding event can be downloaded from the interactive maps section at the bottom of the Global Flood Database website. These maps outline the flooded areas and the duration of the flood for each event, as well as impacts on displacement and casualties. The DFO reports both the cause of each flood event and the casualties. The casualty figure is the number reported by the media or the government and could be much higher or lower than the number of people exposed as estimated from satellite data.

Population exposure per flooding event is calculated by intersecting the observed inundation extent with the population data. The population exposed per event is reported using the GHSL population estimates for 2000 and for 2015. Tabular data in CSV format can be found via the “About the data” link at the top of the Global Flood Database website.
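The intersection described above amounts to summing the population of every grid cell that the flood map marks as inundated. The sketch below shows that arithmetic on two tiny invented grids; it assumes the flood mask and the population grid are already aligned cell for cell, which in practice requires reprojection and resampling of the GeoTIFF files.

```python
# Hypothetical 3x4 grids on the same cell layout.
# flooded: 1 where the satellite-derived map shows inundation, else 0.
flooded = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
# population: estimated residents per cell (invented values).
population = [
    [10, 250, 40, 0],
    [5, 300, 120, 15],
    [0, 60, 80, 20],
]


def exposed_population(flood_mask, pop_grid):
    """Sum the population of every cell flagged as inundated."""
    return sum(
        pop
        for mask_row, pop_row in zip(flood_mask, pop_grid)
        for is_wet, pop in zip(mask_row, pop_row)
        if is_wet
    )


exposed = exposed_population(flooded, population)
share = exposed / sum(map(sum, population))  # fraction of all residents exposed
```

Running the same function with a 2000 and a 2015 population grid would give the two exposure figures reported per event.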

Although the database does not separate population exposure by age for each flooding event, researchers can still study older adults using the data. For example, researchers may combine the Global Flood Database with census or other population data of interest to assess the relationship between flood exposure and older adults (example found here).

Detailed GIS data descriptions and methodological notes can be found here: https://storage.googleapis.com/gfd_metadata/README_GFD.pdf 

Citation: Tellman, B. et al. Satellite imaging reveals increased proportion of population exposed to floods. Nature 596, 80–86 (2021).

 

Creating “bins” for extreme temperature data and adjusting for known bias

Link to code

Prepared by: Alex Mikulas, PhD, CACHE postdoctoral associate 

Date: April 2026


Original Authors:

Benjamin Jones, Northwestern University 

Jacob Moscona, Massachusetts Institute of Technology 

Benjamin A. Olken, Massachusetts Institute of Technology  

Cristine von Dessauer, Massachusetts Institute of Technology 

Specific purpose of code: This Stata code and program take temperature data at fine temporal and spatial resolution (e.g., tract-day) and transform them into an aggregated panel dataset with temperature binned at the place-year or place-month level. The program calculates both the realized and the expected number of days in each temperature bin. The data are then ready for use in regression and similar analyses using standard temperature bin specifications. The data better capture the role of extreme temperatures by identifying extreme temperature days beyond what would be expected in a given area, and they are useful for assessing the causal role of increasing extreme temperature exposure.

General Application: Extreme temperature exposure is often operationalized in research using a binning procedure, wherein a researcher aggregates the number of extreme temperature days into binned temperature ranges. A common and serious bias can occur when using binned temperature data over time if the outcome variable of interest is associated with the baseline temperature, producing what are often called “U-shaped” results. When studying the impacts of climate change and increasing extreme temperature exposure, common binning procedures neglect the baseline temperature of a given area.

Let’s say that over a 20-year span, the average temperature increase is uniform across space. A place like Phoenix will see a large increase in extreme heat days (say, 90-degree+ days), while a place like Boston will see a smaller increase. This is because Phoenix’s baseline temperature 20 years ago was much warmer than Boston’s, so many more of its days already sat just below the threshold.
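A small worked example can make this concrete. The sketch below treats daily temperatures in each city as normally distributed and applies the same 2-degree shift to both. The means and standard deviations are illustrative placeholders, not values fitted to Phoenix or Boston station data, and the normal shape is itself a simplifying assumption.

```python
import math


def share_above(threshold, mean, sd):
    """P(T > threshold) for normally distributed daily temperatures."""
    z = (threshold - mean) / (sd * math.sqrt(2))
    return 0.5 * (1 - math.erf(z))


def added_hot_days(mean, sd, warming, threshold=90.0, days=365):
    """Extra days per year above the threshold after a uniform shift."""
    before = share_above(threshold, mean, sd)
    after = share_above(threshold, mean + warming, sd)
    return days * (after - before)


# Illustrative parameters only (deg F), not fitted to real station data.
warm_city = added_hot_days(mean=87.0, sd=15.0, warming=2.0)
cool_city = added_hot_days(mean=59.0, sd=18.0, warming=2.0)
```

Under these assumed parameters, the warm city gains roughly 19 days above 90 degrees per year while the cool city gains about 4, even though both warmed by exactly the same amount.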

The bias arises when a given outcome is associated with both the number of extreme heat days and the baseline temperature of an area. If an area’s baseline temperature is closer to the bin thresholds for “extreme heat”, the outcome may be associated with the baseline temperature as well as with the increase in extreme heat days, introducing statistical bias into an analysis.

This code addresses the issue by providing the “expected” number of days in each temperature bin alongside the observed number of days in each bin. These data can then be used to control for different baseline temperatures and warming trends across areas. Using expected and observed bins together allows the researcher to avoid regressing “trends on trends”, which produces the biased U-shaped results. The estimated area-year and area-month temperature bin data can be used in a wide variety of extreme temperature exposure studies.
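The difference between observed and expected bin counts can be sketched in a few lines of Python. This is not the cftemp algorithm, whose construction of the counterfactual is documented in the working paper. Here, purely as a stand-in, the "expected" counts come from shifting every baseline day by the place's average warming and re-binning. The bin edges and temperatures are invented.

```python
from bisect import bisect_right
from collections import Counter

BIN_EDGES = [60, 70, 80, 90]  # deg F; assumed cut points
BIN_LABELS = ["<60", "60-69", "70-79", "80-89", "90+"]


def bin_counts(temps):
    """Count the days falling in each temperature bin."""
    counts = Counter(BIN_LABELS[bisect_right(BIN_EDGES, t)] for t in temps)
    return {label: counts.get(label, 0) for label in BIN_LABELS}


def expected_counts(baseline_temps, warming):
    """Stand-in counterfactual: shift each baseline day by the same warming."""
    return bin_counts([t + warming for t in baseline_temps])


# Hypothetical daily temperatures for one place in a baseline and a later year.
baseline_year = [55, 62, 68, 74, 79, 85, 88, 89, 91]
later_year = [58, 66, 71, 78, 84, 90, 92, 93, 95]

warming = (sum(later_year) - sum(baseline_year)) / len(baseline_year)
observed = bin_counts(later_year)
expected = expected_counts(baseline_year, warming)
# Days in each bin beyond (or short of) what the uniform shift alone implies.
surprise = {label: observed[label] - expected[label] for label in BIN_LABELS}
```

In a regression setting, the observed counts would be the exposure of interest and the expected counts the control, so that the estimate reflects days beyond what the place's baseline and trend imply.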

How does or could this code allow researchers to assess research questions related to aging or life course?: This code can be applied to spatially and temporally specific temperature datasets to bin temperature exposure data while also accounting for varying baseline temperatures and varying trends in extreme temperature over time. Health and aging scholars can then integrate these data as a weather exposure variable and more accurately estimate the impact of extreme temperature on age- and health-related outcomes.

Data sets used: 

  • Population, socioeconomic, or health data: code generates binned temperature data that can be integrated with any data that has spatial and temporal specificity (such as lat/long, geographic administrative identifiers, panel data, or data with observation dates).
  • Climate, weather, disaster or environment data: code is applicable to any temperature data with detailed place/day resolution.

Are all the data publicly available or are some restricted-access? NA

Links to data: NA

Coding Language: Stata 

Tools and Packages used: cftemp, a custom Stata command in an .ado file.

Output(s): Dataset of observed and counterfactual binned counts of extreme temperature days at the place/year or place/month level. Output is generated from any user-supplied daily temperature dataset at fine spatial resolution.

Spatial extent: Flexible. Depends on user supplied data.

Temporal extent: Flexible. Depends on user supplied data.

Published papers that use this code: Benjamin Jones, Jacob Moscona, Benjamin A. Olken, and Cristine von Dessauer, “With or Without U? Binning Bias and the Causal Effects of Temperature Extremes,” NBER Working Paper 34671 (2026), https://doi.org/10.3386/w34671. 

Code linking NCHS mortality data with GFD flood event data

Link to code

Date: April 2026


Authors/Creators/ Team Members: Victoria D. Lynch, Jonathan A. Sullivan, Aaron B. Flores, Xicheng Xie, Sarika Aggarwal, Rachel C. Nethery, Marianthi-Anna Kioumourtzoglou, Anne E. Nigra, and Robbie M. Parks

Specific purpose of code: This code links National Center for Health Statistics (NCHS) mortality data with Global Flood Database (GFD) flood event data from 2001-2018 by US county. We used the NCHS data to identify monthly total and cause-specific deaths by age group, sex, and county, and used Global Human Settlement Layer (GHSL) population data to calculate county-level flood exposure and mortality rates. We used a Bayesian formulation of the conditional quasi-Poisson model to analyze the county-level association between the number of flood events per month and monthly death rates, accounting for overdispersion in the mortality data. The conditional approach examines differences within matched strata (here, county-months), like a case-crossover study design, which removes confounding bias due to factors that vary across strata. Bayesian inference enables the ‘borrowing of information’ across county units and full distributional estimation of the parameters of interest.
All-cause and cause-specific mortality associated with flood events is likely differential by flood cause and severity; therefore, we conducted analyses separately by all-cause and cause-specific mortality (cancers, cardiovascular diseases, infectious and parasitic diseases, injuries, neuropsychiatric conditions, and respiratory diseases), flood cause (all floods, heavy rain, tropical cyclone, snowmelt, and ice jam or dam break), and flood severity (mild, moderate, severe, and very severe). Because very severe flood events were most strongly associated with increased mortality across all flood causes and mortality groups, we further analyzed associations stratified by age group (0-64 and 65+ years) and sex (female and male) for very severe floods only.

General Application: This code links county-level flood exposure, categorized by flood type, with county-level mortality rates for the six primary causes of death in the US: cancers, cardiovascular disease, infectious and parasitic diseases, injuries, neuropsychiatric conditions, and respiratory diseases. The code could be used with any county-level health outcome and, with modification, with other county-level environmental exposures. Note that the code categorizes exposure by flood type and severity, a step that would not apply to other exposures.
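The linkage step itself (before any modeling) can be pictured as a county-month join followed by a rate calculation. The repository does this in R. The Python sketch below is only an illustration with invented counts and populations, and it does not attempt the Bayesian conditional quasi-Poisson model.

```python
# Hypothetical county-month records keyed by (FIPS-style code, year-month).
# All counts and populations are invented for illustration.
floods = {("12086", "2017-09"): 2, ("22071", "2016-08"): 1}  # flood events
deaths = {
    ("12086", "2017-09"): 1850,
    ("12086", "2017-10"): 1700,
    ("22071", "2016-08"): 310,
}
population = {"12086": 2_700_000, "22071": 390_000}


def link_county_months(deaths, floods, population):
    """Attach flood counts and a death rate per 100,000 to each county-month."""
    linked = []
    for (fips, month), n_deaths in sorted(deaths.items()):
        linked.append({
            "fips": fips,
            "month": month,
            # A county-month with no flood record is treated as zero events.
            "flood_events": floods.get((fips, month), 0),
            "death_rate": 100_000 * n_deaths / population[fips],
        })
    return linked


panel = link_county_months(deaths, floods, population)
```

The resulting panel has one row per county-month, which is the shape a stratified model of monthly death rates on flood counts would need.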

How does or could this code allow researchers to assess research questions related to aging or life course?: The code is written to assess the association between flood exposure and mortality by age category; in our paper, we specifically stratified by age category (0-64 years old and 65+ years old) to examine flood exposure-related mortality among older adults. The NCHS data include individual-level age at death and would enable analyses with any subset of age groups.

Data sets used: 

  • Population, socioeconomic, or health data:
    National Center for Health Statistics (NCHS) mortality data; Global Human Settlement Layer (GHSL) population exposure.
  • Climate, weather, disaster or environment data:
    Global Flood Database (GFD) flood event data; Dartmouth Flood Observatory (DFO) flood classification data; Parameter-elevation Regression on Independent Slopes Model (PRISM) temperature data.

Are all the data publicly available or are some restricted-access? Data on flood exposure are available without restrictions for individual flooding events. Temperature data and population data are also publicly available.

NCHS mortality data are restricted. To access the NCHS mortality data, applicants must submit a project review form (https://www.cdc.gov/nchs/data/nvss/nchs-research-review-application.pdf) to nvssrestricteddata@cdc.gov and allow four to six weeks for processing.

Links to data:

  1. National Center for Health Statistics, Mortality Data
  2. Global Flood Database and Global Human Settlement Layer (downloaded together)
  3. Parameter-elevation Regression on Independent Slopes Model (PRISM)

Coding Language:  R 

Tools and Packages used:

R: acs, BiocManager, dlnm, dplyr, ecm, Epi, fiftystater, foreign, fst, ggpubr, ggplot2, graph, graticule, haven, here, janitor, lubridate, mapproj, maptools, mapview, MetBrewer, pipeR, raster, RColorBrewer, readxl, rgdal, rgeos, rnaturalearth, rnaturalearthdata, scales, sf, sp, sqldf, survival, splines, table1, tidycensus, tidyverse, totalcensus, usmap, zipcodeR, zoo, INLA, Rgraphviz, fmesher

Output(s): Exploratory data analysis of flood and mortality data (maps, figures, tables), and output of statistical analysis (figures, tables)

Spatial extent: United States

Temporal extent: 2001-2018

Published papers that use this code: Lynch, Victoria D., et al. “Large floods drive changes in cause-specific mortality in the United States.” Nature Medicine 31.2 (2025): 663-671. doi: https://doi.org/10.1038/s41591-024-03358-z

What am I Watching? CAFE University: Considerations for temperature in public health studies

Link to webinar

Written by Jenna Tipaldo and Deborah Balk, CUNY Institute for Demographic Research

Researchers interested in studying the impacts of temperature on health face decisions on how to operationalize and measure exposures. This webinar “CAFE University: Considerations for temperature in public health studies” by Lauren Mock and Shreya Nalluri explores the myriad choices required when using temperature data to study health outcomes, from choice of metric and dataset to the study design.  

The webinar includes an overview of various temperature metrics, including land surface temperature (LST), air temperature (AT), heat index, wet bulb globe temperature (WBGT), and the Universal Thermal Climate Index (UTCI). A fuller discussion of the choice of metrics can be found in this resource from the World Resources Institute (Engel et al. 2025), also referenced in the webinar. (It is useful to note that these are all measures of outdoor temperature. Measurement of indoor vs. outdoor exposure is explored in another post on air pollution.)

Each metric is better suited to some purposes and contexts than to others, and each has drawbacks. For example, air temperature is widely available but doesn’t include factors that influence how humans perceive temperature, such as humidity. Both WBGT and UTCI do consider humidity.
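To see how much humidity can change a heat metric, the sketch below computes the heat index with the Rothfusz regression published by the US National Weather Service. This formula is not taken from the webinar; it is included only as a familiar example of combining air temperature with relative humidity. The base regression is intended for hot, humid conditions (roughly 80 °F and above), and the NWS adjustments for very dry or very humid air are omitted here.

```python
def heat_index_f(temp_f, rel_humidity):
    """Heat index (deg F) from air temperature (deg F) and relative humidity (%).

    Base Rothfusz regression only; NWS low/high humidity adjustments are omitted.
    """
    t, rh = temp_f, rel_humidity
    return (
        -42.379
        + 2.04901523 * t
        + 10.14333127 * rh
        - 0.22475541 * t * rh
        - 6.83783e-3 * t * t
        - 5.481717e-2 * rh * rh
        + 1.22874e-3 * t * t * rh
        + 8.5282e-4 * t * rh * rh
        - 1.99e-6 * t * t * rh * rh
    )


# Same air temperature, two humidity levels.
humid = heat_index_f(90, 70)
dry = heat_index_f(90, 40)
```

At the same 90 °F air temperature, the humid case comes out roughly 15 degrees higher than the dry case, which is the kind of difference an air-temperature-only exposure measure would miss.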

Other dataset considerations include the availability of multiple metrics such as daily maximums, minimums, or averages. (All of the metrics discussed in the webinar provide daily temperature resolution, but some sensors have more fine-grain temporal data, such as hourly. See Table below.)  

Additionally, researchers may choose to use raw temperature values versus standardized values such as percentiles, which can account for acclimatization.  

Source: CAFE University 

Another aspect of operationalizing exposure is considering the timing of exposure relative to an outcome of interest – the webinar gives an example of defining the exposure period as the day of, day prior to, or three days prior to an event.  
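These exposure windows are straightforward to build once daily temperatures are keyed by date. The sketch below uses invented values, and it reads "three days prior" as the mean over the three preceding days, which is one possible interpretation rather than the webinar's stated definition.

```python
from datetime import date, timedelta

# Hypothetical daily maximum temperatures (deg C) for one location.
daily_tmax = {
    date(2025, 7, 28): 31.0,
    date(2025, 7, 29): 33.0,
    date(2025, 7, 30): 36.0,
    date(2025, 7, 31): 35.0,
}


def exposure_windows(event_day, temps):
    """Temperature on the event day, the prior day, and the mean of the three prior days."""
    prior = [temps[event_day - timedelta(days=k)] for k in (1, 2, 3)]
    return {
        "lag0": temps[event_day],
        "lag1": prior[0],
        "mean_lag1_3": sum(prior) / len(prior),
    }


windows = exposure_windows(date(2025, 7, 31), daily_tmax)
```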

Even so, a recent paper (Cruz et al., 2025) proposes a framing of heat as a chronic phenomenon and argues that chronic exposures pose different risks that are not captured by existing research that evaluates the health impacts of acute exposures to extreme heat. 

The webinar also explores the use of several gridded temperature datasets, including ERA5, gridMET, and PRISM, and how to link these with health data (e.g., calculating zonal statistics and population-weighting).  
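The difference between a plain zonal mean and a population-weighted one can be shown with three invented grid cells overlapping a single county. The cell values, overlap fractions, and resident counts below are hypothetical, and the population weight assumes residents are spread evenly within each cell.

```python
# Hypothetical grid cells overlapping one county:
# (temperature in deg C, fraction of the cell inside the county, residents in the cell).
cells = [
    (34.0, 1.0, 9000),
    (31.0, 0.5, 2000),
    (29.0, 0.25, 400),
]


def area_weighted_mean(cells):
    """Zonal mean weighting each cell by its overlap with the unit."""
    total = sum(frac for _, frac, _ in cells)
    return sum(temp * frac for temp, frac, _ in cells) / total


def population_weighted_mean(cells):
    """Zonal mean weighting each cell by the residents assumed to live inside the unit."""
    weights = [frac * pop for _, frac, pop in cells]
    return sum(temp * w for (temp, _, _), w in zip(cells, weights)) / sum(weights)


area_mean = area_weighted_mean(cells)
pop_mean = population_weighted_mean(cells)
```

Because most residents in this example live in the hottest cell, the population-weighted mean is more than a degree above the area-weighted one, which is why the weighting choice matters for exposure assignment.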

The webinar provides strengths, limitations, and use cases for each dataset. The datasets vary in their spatial and temporal resolution, as well as in the measurements that can be calculated from them. For example, gridMET and PRISM have high-resolution daily averages but lack the additional measurement components needed to calculate UTCI or WBGT. ERA5 has finer temporal resolution (hourly data) and global coverage, and it includes the components needed to calculate more metrics, including UTCI and WBGT, but it is coarser spatially. However, the ERA5-Land data are not well suited for coastal studies.

In general, gridded data are also limited by the data underlying them, a point that was touched on but not explored in depth in the webinar.  It was noted that “gridded products inherit uncertainty from models and stations,” and these limitations are important to keep in mind when choosing a dataset and measures.   

Regardless of which dataset is used, another consideration is how the resolution of the temperature data compares to the resolution of the health or socioeconomic data or outcome of interest. Using spatial resolution as an example, when a researcher wants to link temperature data to an administrative unit (say, county or municipality) associated with health records or survey respondents’ residence, they may care whether the temperature grid is more or less coarse than those administrative units. At mismatched spatial scales, averaging temperature into the administrative units may introduce errors in measurement or attribution. As an example, the figure below shows gridMET and ERA5 data for the same day (July 31, 2025) in Philadelphia, PA and the surrounding NJ area. Averaging across finer-resolution data would likely be more suitable for finer-scale (smaller) units, such as the administrative units representing urban areas seen in the center of the figure below.

Finer-scale data may also be more suitable when within-unit variation in temperature is high, or when the analyst wants to capture that variation. For coarser administrative units, if additional information such as where population is concentrated is not available, coarser temperature data may be as suitable as finer-grained data, noting of course that any sub-unit variation over these administrative areas is simply unobserved (Porter & Howell, 2016) or was reallocated with ancillary data (Zoraghein & Leyk, 2018; Uhl et al., 2020).

In the context of gridded population data, Leyk et al. (2019) provide a review of the products available and a guide for understanding the underlying data used in each gridded product, including the population data (spatial coarseness and how change over time is handled), the use of ancillary data (such as settlements, roads, or land cover), and the methods used to allocate population within a defined grid “cell size”. All of these affect a product’s suitability for particular applications. Fitness-for-use guidelines for these global gridded population datasets may depend on spatial and temporal resolution, scale and setting (rural vs. urban), and mechanism of interest, as noted in Leyk et al. (2019), and the same principles can act as a lens when evaluating which climate dataset is most suitable for a given analysis.

Materials from a CACHE demonstration project, “Heat, Disability in older adults and Care,” from El Colegio de Mexico provide code and guidance for calculating UTCI using ERA5 data, producing municipality-level estimates of the number of severe heat days: https://agingclimatehealth.org/severe-heat-days-using-the-universal-thermal-comfort-index/

 

References 

Cruz, M., Mach, K. J., Turek-Hankins, L. L., Ashad-Bishop, K. C., Bailey, Z. D., Evans, S. D., Fanning, A., Fernandez-Burgos, M., Gilbert, J., Howard, B., Mahabir, M., Marturano, J., Murphy, L. N., Muse, N., Pérodin, J., & Clement, A. C. (2025). Where heat does not come in waves: A framework for understanding and managing chronic heat. Environmental Research: Climate, 4(2), 023002. https://doi.org/10.1088/2752-5295/adc827

 

Engel, R., Mackres, E., Palmieri, M., & Anzilotti, E. (2025). Beyond the Thermometer: 5 Heat Metrics That Drive Better Decision-Making. World Resources Institute Insights, March 17, 2025. https://www.wri.org/insights/beyond-thermometer-measuring-heat

 

Leyk, S., Gaughan, A. E., Adamo, S. B., De Sherbinin, A., Balk, D., Freire, S., Rose, A., Stevens, F. R., Blankespoor, B., Frye, C., Comenetz, J., Sorichetta, A., MacManus, K., Pistolesi, L., Levy, M., Tatem, A. J., & Pesaresi, M. (2019). The spatial allocation of population: A review of large-scale gridded population data products and their fitness for use. Earth System Science Data, 11(3), 1385–1409. https://doi.org/10.5194/essd-11-1385-2019 

 

Porter, J. R., & Howell, F. M. (2016). A spatial decomposition of county population growth in the United States: Population redistribution in the rural-to-urban continuum, 1980–2010. In Recapturing space: New middle-range theory in spatial demography (pp. 175-198). Cham: Springer International Publishing. 

Uhl, J. H., Zoraghein, H., Leyk, S., Balk, D., Corbane, C., Syrris, V., & Florczyk, A. J. (2020). Exposing the urban continuum: implications and cross-comparison from an interdisciplinary perspective. International Journal of Digital Earth, 13(1), 22–44. https://doi.org/10.1080/17538947.2018.1550120

 

Zoraghein, H., & Leyk, S. (2018). Enhancing areal interpolation frameworks through dasymetric refinement to create consistent population estimates across censuses. International Journal of Geographical Information Science, 32(10), 1948–1976. https://doi.org/10.1080/13658816.2018.1472267

 

What am I reading: Sunny-day Floods and their Health Risks  

Link to article

Written by Kathryn Foster, Cornell University 

As sea-level rise worsens the impacts of hurricanes and storm surges, it also leads to more frequent tidal and seasonal floods in coastal areas, commonly referred to as sunny-day, blue sky, or nuisance flooding. The annual number of days with sunny-day flooding has more than doubled since 2000, with projections that it will triple by 2050, averaging 45-85 days a year nationwide (NOAA, 2025). Current research, such as Mueller et al. (2024), is beginning to document the many impacts of sunny-day flooding and other types of flooding on residential health.  

Mortality: Research shows that in Florida, a 20-mm (0.79 in.) increase in tidal flooding depth raises mortality rates by 0.46% to 0.60% among adults 65 and older. Sea-level rise could contribute to an additional 130 elderly deaths annually in Florida relative to 2019 (Mueller et al., 2024).

Similarly, longer-lasting floods have greater impacts on mortality than other types of floods, such as flash floods (Lynch et al., 2025). An increase in the frequency and severity of seasonal tidal flooding could increase the risk of mortality by blocking roads to medical services, such as regular doctor’s appointments, pharmacies, and hospitals. Other research supports these conclusions, finding that with anticipated intensified flooding, elderly populations will become more vulnerable to morbidity and mortality risks, largely due to mobility constraints among aging adults (Paavola, 2017; Hu et al., 2018; Sheahan et al., 2025). The mortality effects of flooding primarily affect those who require at least 8.85 minutes to reach the nearest hospital (Mueller et al., 2024).

Infectious Diseases: Seasonal tidal flooding may also increase the incidence of waterborne and infectious diseases in the United States. As flooding increases, communities are exposed to standing water around their homes for longer periods. Seasonal floods are strongly associated with increased hospitalizations for Legionnaires’ disease, a type of pneumonia caused by Legionella bacteria, which grow in warm, moist environments (Lynch and Shaman, 2022). Enteric infections may also arise from drinking water contamination and sewerage disruption (Carr et al., 2024; Wright et al., 2018), and mosquito-borne diseases and infections may occur from exposure to polluted flood waters (Wright et al., 2018).

Previous research on coastal storms finds that these events are associated with the spread of certain infectious diseases, such as E. coli infections, Legionnaires’ disease, cryptosporidiosis, paratyphoid fever, and dengue in certain areas (Lynch and Shaman, 2023; Zheng et al., 2017). These studies similarly posit that the spread of infectious disease will increase alongside predicted increases in coastal storms due to climate change. Although the link has not yet been explicitly studied, researchers hypothesize that the infectious diseases associated with coastal storms will also be linked to seasonal tidal flooding (Lynch and Shaman, 2022). This provides grounds for future research to examine whether increasing tidal floods are associated with rises in infectious diseases, as coastal storms are.

Other Impacts: Other hazards, such as snakebites and wound infections, may also occur with tidal flooding and prolonged inundation of residential areas (Wright et al., 2018). Flooding increases residents’ exposure to venomous snakes, as snakes enter homes in search of dry land or water snakes appear in inundated streets, creating a health hazard if residents are bitten (Ochoa et al., 2018). In addition, residents report having to walk through flooded streets during seasonal tidal flooding, where existing wounds may become infected by the floodwater or unseen hazards in the water may cause new wounds (Wright et al., 2018). While these hazards are beginning to be documented, little research has examined their relationship to tidal flooding; future research could fill this gap by examining the association between tidal flooding and reported snakebites and wound infections.

From Research to Policy: These studies highlight that the costs of climate change go beyond heat impacts, emphasizing that the increased frequency and severity of seasonal tidal flooding may directly affect residents’ health. Mueller and colleagues (2024) suggest that communities design programs to improve transportation options for the elderly during the sunny-day flooding season. Other researchers suggest that public health initiatives should inform clinicians and residents alike about the health risks of seasonal flooding (Lynch and Shaman, 2022). Alongside public health initiatives, more effective and equitable preparations for flood risk should be put in place to better mitigate seasonal flooding-related health risks, especially for elderly adults in these communities (Lynch et al., 2025). Such initiatives could include increasing flood-risk awareness by hosting community classes on flood preparation, stockpiling essential medical supplies (Yodsuban and Nuntaboot, 2021), coordinating resource and information sharing with government agencies, healthcare providers, and community groups for elderly residents (Madani Hosseini et al., 2024), and constructing seawalls and elevating roads to prevent road closures (Mueller et al., 2024).

References:

Hu, P., Zhang, Q., Shi, P., Chen, B., & Fang, J. (2018). Flood-induced mortality across the globe: Spatiotemporal pattern and influencing factors. Science of the Total Environment, 643, 171-182. 

Lynch, V. D., & Shaman, J. (2022). The effect of seasonal and extreme floods on hospitalizations for Legionnaires’ disease in the United States, 2000–2011. BMC Infectious Diseases, 22(1), 550. 

Lynch, V. D., & Shaman, J. (2023). Waterborne infectious diseases associated with exposure to tropical cyclonic storms, United States, 1996–2018. Emerging Infectious Diseases, 29(8), 1548.

Lynch, V. D., Sullivan, J. A., Flores, A. B., Xie, X., Aggarwal, S., Nethery, R. C., … & Parks, R. M. (2025). Large floods drive changes in cause-specific mortality in the United States. Nature Medicine, 31(2), 663-671.  

Madani Hosseini, M., Zargoush, M., & Ghazalbash, S. (2024). Climate crisis risks to elderly health: strategies for effective promotion and response. Health Promotion International, 39(2), daae031. 

Mahmoudi, S., Moftakhari, H., Muñoz, D. F., Radfar, S., Sweet, W., & Moradkhani, H. (2025). Escalating high tide flooding along the Atlantic and Gulf Coast of the United States due to sea level rise. Earth’s Future, 13(9), e2024EF005328. 

Mueller, V., Hauer, M., & Sheriff, G. (2024). Sunny-day flooding and mortality risk in coastal Florida. Demography, 61(1), 209-230. 

NOAA Office for Coastal Management. (n.d.). High tide flooding. https://coast.noaa.gov/states/fast-facts/recurrent-tidal-flooding.html

Paavola, J. (2017). Health impacts of climate change and health and social inequalities in the UK. Environmental Health, 16 (Suppl 1), 113. 

Wright, L. D., D’Elia, C. F., & Nichols, C. R. (2018). Impacts of coastal waters and flooding on human health. In Tomorrow’s Coasts: Complex and Impermanent (pp. 151-166). Cham: Springer International Publishing. 

Yodsuban, P., & Nuntaboot, K. (2021). Community-based flood disaster management for older adults in southern of Thailand: A qualitative study. International Journal of Nursing Sciences, 8(4), 409-417.

Zheng, J., Han, W., Jiang, B., Ma, W., & Zhang, Y. (2017). Infectious diseases and tropical cyclones in Southeast China. International Journal of Environmental Research and Public Health, 14(5), 494.


Processing NDVI and VIIRS vegetation data for use in population health research


Link to code

Click here

Prepared by: Alex Mikulas, PhD, CACHE postdoctoral associate 

Date: March 2026 


Original Authors: 

Finn Roberts, IPUMS Senior Data Analyst 

Rebecca Luttinen, IPUMS Global Health Data Analyst 

Devon Kristiansen, IPUMS Global Health Research Manager 

Jude Mikal, Senior Research Fellow, University of Minnesota College of Pharmacy 

Specific purpose of code: The code resources below offer a comprehensive outline for downloading, processing, aggregating, and integrating global vegetation coverage data for use in demographic and health research. Vegetation data come from the Normalized Difference Vegetation Index (NDVI), the Visible Infrared Imaging Radiometer Suite (VIIRS), and the Moderate Resolution Imaging Spectroradiometer (MODIS).

Ultimately, these resources allow users to aggregate environmental data into spatially relevant scales and integrate it into a variety of social and health data sources to better measure environmental context or exposure.  
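As a rough illustration of the aggregation step described above, the sketch below computes NDVI per pixel with the standard formula (NIR - Red) / (NIR + Red) and then averages it within regions. It is written in plain Python rather than the R used by the hub's scripts, and the grids, region labels, and function names are hypothetical, not taken from the linked code.

```python
from collections import defaultdict


def ndvi(nir, red):
    """Return NDVI for one pixel, or None when both bands are zero."""
    total = nir + red
    if total == 0:
        return None  # undefined; treat as missing
    return (nir - red) / total


def zonal_mean_ndvi(nir_grid, red_grid, zone_grid):
    """Average pixel-level NDVI within each zone, skipping missing pixels."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for nir_row, red_row, zone_row in zip(nir_grid, red_grid, zone_grid):
        for nir, red, zone in zip(nir_row, red_row, zone_row):
            value = ndvi(nir, red)
            if value is None:
                continue
            sums[zone] += value
            counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Hypothetical 2x3 grids of near-infrared and red reflectance.
nir_grid = [[0.50, 0.60, 0.40],
            [0.30, 0.00, 0.45]]
red_grid = [[0.10, 0.20, 0.40],
            [0.10, 0.00, 0.15]]
# Hypothetical region label for each pixel (e.g., an administrative unit).
zone_grid = [["A", "A", "B"],
             ["A", "B", "B"]]

zone_means = zonal_mean_ndvi(nir_grid, red_grid, zone_grid)
print(zone_means)
```

In practice the region labels would come from overlaying administrative boundaries on the raster; a simple unweighted mean is only one of several possible aggregation choices.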

The IPUMS DHS Spatial Analysis and Health Research Hub has numerous resources on using environmental data in health research. While many resources in the hub are used with DHS data integration, the data, code, and analysis resources can be altered for data integration into any spatially identified aging and health dataset.  

Link to code:  

General Application: This code and associated resources allow researchers to build a vegetation coverage dataset that can be integrated into any individual or aggregate dataset that has temporal and spatial specificity. The data extend from 1981 to the present, in 10- to 20-day increments and at raster resolutions as fine as 20 meters.
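To make the idea of temporal and spatial specificity concrete, the following sketch attaches a region-level vegetation value to each survey record by matching on region and on the composite period that contains the interview date. It is a minimal Python illustration, not the hub's R code; the record fields, region codes, dates, and values are all hypothetical.

```python
from datetime import date, timedelta

# Hypothetical region-level composites:
# (region, period start, period length in days, mean NDVI)
composites = [
    ("R1", date(2020, 1, 1), 10, 0.42),
    ("R1", date(2020, 1, 11), 10, 0.47),
    ("R2", date(2020, 1, 1), 10, 0.31),
]

# Hypothetical survey records with a region identifier and interview date.
survey = [
    {"id": 1, "region": "R1", "interview_date": date(2020, 1, 14)},
    {"id": 2, "region": "R2", "interview_date": date(2020, 1, 3)},
    {"id": 3, "region": "R2", "interview_date": date(2020, 2, 1)},
]


def attach_ndvi(records, composites):
    """Add the NDVI of the matching region and period to each record."""
    linked = []
    for record in records:
        match = None
        for region, start, days, value in composites:
            end = start + timedelta(days=days)  # exclusive end of the period
            if region == record["region"] and start <= record["interview_date"] < end:
                match = value
                break
        # Records with no covering composite keep ndvi=None.
        linked.append({**record, "ndvi": match})
    return linked


linked = attach_ndvi(survey, composites)
for row in linked:
    print(row["id"], row["ndvi"])
```

Keeping unmatched records with a missing value, rather than dropping them, makes gaps in coverage visible before analysis.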

How does or could this code allow researchers to assess research questions related to aging or life course?: This code can be used to create variables capturing environmental context and exposure to greenspace and vegetation, for cross-sectional or longitudinal use at spatially detailed scales. The variables can be integrated into health surveys to provide environmental context, or combined with aggregated data to identify locations with high concentrations of aging adults and changing vegetation or greenspace. In longitudinal datasets, researchers could chart an individual’s exposure to vegetation and other relevant environmental features over the life course.
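One way to picture a life-course exposure measure is to walk through a person's residence history and accumulate the vegetation values of the places they lived. The Python sketch below does this with a running mean; it is an assumption-laden illustration rather than the hub's method, and the residence history, region codes, and yearly values are hypothetical.

```python
# Hypothetical yearly mean NDVI by (region, year).
yearly_ndvi = {
    ("R1", 2000): 0.40,
    ("R1", 2001): 0.44,
    ("R2", 2002): 0.30,
    ("R2", 2003): 0.26,
}

# Hypothetical residence history: (region, first year, last year), inclusive.
residence_history = [("R1", 2000, 2001), ("R2", 2002, 2003)]


def exposure_trajectory(history, yearly_ndvi):
    """Return (year, NDVI, cumulative mean NDVI) for each year with data."""
    trajectory = []
    running_total = 0.0
    for region, first_year, last_year in history:
        for year in range(first_year, last_year + 1):
            value = yearly_ndvi.get((region, year))
            if value is None:
                continue  # no data for this region-year
            running_total += value
            cumulative_mean = running_total / (len(trajectory) + 1)
            trajectory.append((year, value, cumulative_mean))
    return trajectory


trajectory = exposure_trajectory(residence_history, yearly_ndvi)
for year, value, cumulative_mean in trajectory:
    print(year, value, round(cumulative_mean, 3))
```

A cumulative mean is only one possible summary; depending on the research question, a sensitive-period or recent-exposure measure might be more appropriate.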

Data sets used:  

Publicly available climate and weather data.  

Links to data (also repeated in code examples):  

Coding Language: R 

Tools and Packages used: terra, sf, dplyr, ggplot2, ggspatial, patchwork, lubridate (likely others)

Output(s): datasets, maps 

Spatial extent: Global dataset, raster data at 20- to 250-meter resolution

Temporal extent: 1981 to present; 10- to 20-day increments

Published papers that use this code:  

Moisa, M., Roba, Z., Purohit, S., Deribew, K., & Gemeda, D. (2025). Evaluating the impact of land use and land cover change on soil moisture variability using GIS and remote sensing technology in southwestern Ethiopia. Environmental Monitoring and Assessment, 197. https://doi.org/10.1007/s10661-025-14301-1 

Grace, K., Kristiansen, D., Boyle, E. H., & Luetke, M. (2023). Investigating Seasonal Agriculture, Contraceptive Use, and Pregnancy in Burkina Faso. The Professional Geographer. https://www.tandfonline.com/doi/full/10.1080/00330124.2023.2199316 

 


Resources and Data from the IPUMS DHS Spatial Analysis and Health Research Hub 


Link to data

Click here

Prepared by: Alex Mikulas, PhD, CACHE postdoctoral associate 

Date: March 2026 


Author: The IPUMS DHS Spatial Analysis and Health Research Hub is designed to be a resource for researchers who are familiar with IPUMS DHS population health survey data but new to weather, environment, and disaster research that uses spatial data sources. Such resources include conceptual frameworks for environment/health research, introductions to datasets, spatial data processing, and analysis code. The code and data resources in the hub use R scripting to demonstrate basic spatial data processing techniques for integrating numerous environmental and weather-related data with social and health data in the DHS.  

IPUMS Demographic and Health Surveys (IPUMS-DHS) is a database of thousands of consistently coded variables on the health and well-being of men, women, children, and births of randomly selected households in 42 African countries and 9 Asian countries. Data include records of all household members, effectively capturing social and demographic data across the life course and age groups for low- and middle-income countries.   

The guides and resources in the IPUMS-DHS Spatial Analysis and Health Research Hub are oriented toward data integration with IPUMS-DHS data. However, many scripts can be applied to any other social, health, and aging datasets that have geographic data identifiers. This includes datasets that have variables for administrative geographies (unique identifiers or spatial data polygons), respondent address or lat/long variables, or other gridded and raster datasets examining aging and health. 
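For datasets that carry respondent lat/long variables, linking to gridded data amounts to finding the grid cell that contains each point. The Python sketch below shows that lookup for a regular, north-up grid in geographic coordinates; it is an illustration only (the hub's scripts are in R), and the grid origin, cell size, values, and function names are hypothetical.

```python
import math


def cell_index(lat, lon, top, left, cell_size, n_rows, n_cols):
    """Return (row, col) of the cell containing the point, or None if outside."""
    row = math.floor((top - lat) / cell_size)   # rows count down from the top edge
    col = math.floor((lon - left) / cell_size)  # columns count right from the left edge
    if 0 <= row < n_rows and 0 <= col < n_cols:
        return row, col
    return None


# Hypothetical 2x2 NDVI grid whose top-left corner is at lat 10.0, lon 20.0,
# with 0.5-degree cells.
ndvi_grid = [[0.20, 0.35],
             [0.50, 0.65]]
TOP, LEFT, CELL = 10.0, 20.0, 0.5


def ndvi_at(lat, lon):
    """Look up the grid value at a point; None when the point is off the grid."""
    index = cell_index(lat, lon, TOP, LEFT, CELL, len(ndvi_grid), len(ndvi_grid[0]))
    if index is None:
        return None
    row, col = index
    return ndvi_grid[row][col]


inside = ndvi_at(9.3, 20.6)
outside = ndvi_at(11.0, 20.6)
print(inside, outside)
```

Real rasters also require checking that points and grid share the same coordinate reference system, which this sketch assumes.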

To support such research, the IPUMS Global Health team received a 2023 supplemental grant from the Eunice Kennedy Shriver National Institute of Child Health and Human Development, or NICHD (3R01HD069471-12S1). 

Data are available on: Hub resources include data integration walkthroughs, dataset explanations, spatial data processes, and more. Datasets referenced include CHIRPS, CHIRTS, NDVI, VIIRS, and more. See a sample of the numerous resources below: 

Citation: IPUMS. (2026, March 16). Supporting Research on Extreme Weather and Health. IPUMS DHS Spatial Analysis and Health Research Hub. https://tech.popdata.org/dhs-research-hub/about.html 


Uncovering the Exposome: A Pilot Study of Aging and Environmental Exposure in Malawi 


Investigators:

Helene Purcell 

Funding:

CACHE Seed Funding 

Data sources:

  • Long-term individual-level panel data from the Malawi Longitudinal Study of Families and Health (MLSFH), with repeated cognitive and health measures spanning over 15 years.
  • Geocoded administrative and infrastructure data from the National Statistics Office (NSO) of Malawi.
  • Environmental and other Geospatial Information Systems (GIS) datasets from NOAA and other environmental data sources.

Measures:

  • Climate measures: historic rainfall and drought data
  • Physical environment: road access, sanitation and water sources
  • Social environment: family structure and social network data
  • Policy environment: exposure to fertilizer subsidies, access to antiretroviral therapy (ART) for HIV/AIDS
  • Community services: proximity to health facilities and schools
  • Life experiences: economic shocks, migration history, adverse childhood experiences (ACEs)
  • Longitudinal MLSFH socioeconomic and health data: cognitive assessments, epigenetic clock measurements, other mental and physical health/aging measurements (frailty, activities of daily living (ADLs), blood pressure, etc.)

Project Summary:

In recent years, the exposome, encompassing the totality of environmental, socioeconomic, and health-related exposures throughout one’s life [1], has emerged as a pivotal yet understudied dimension in the understanding of aging trajectories, longevity, and Alzheimer’s Disease and Alzheimer’s Disease-Related Dementias (AD/ADRD) risk, resilience, and disparities. This project addresses a critical gap in global aging research by extending exposome science to a low-income Sub-Saharan African population that faces high climate vulnerability, socioeconomic change, and health system constraints. The overarching goal is to build foundational infrastructure and analytic methods to study how cumulative, multi-domain exposures shape cognitive aging and AD/ADRD risk in this context.

This includes developing a modular, geospatially-coded exposome database in Malawi, which will link historical census, environmental/climate data, and other administrative data to longitudinal household data from the Malawi Longitudinal Study of Families and Health (MLSFH) that are precisely geo-coded for integration. By launching the development of this database, we can begin to evaluate the association between exposomal factors and the aging process, with a particular focus on cognitive decline and AD/ADRD, using the Harmonized Cognitive Assessment Protocol (HCAP) survey measures in the MLSFH, forthcoming epigenetic data, and other ADRD risk factors.

Outputs:

Phase 1: Data cleaning, geocoding, aggregation of NSO and MLSFH data; construct first version of exposome database with documentation

Phase 2: Link exposures to longitudinal cognition data and implement analytic strategies to evaluate cumulative and life course effects

Phase 3: Finalize modeling of cognitive decline and indicators for ADRD risk, draft manuscript, and prepare data products for dissemination

References:

[1] Christopher Paul Wild. The exposome: from concept to utility. International Journal of Epidemiology, 41(1):24–32, 2012. doi: 10.1093/ije/dyr236.
