The association between different timeframes of air pollution exposure and COVID-19 incidence, morbidity and mortality in German counties in 2020 | Environmental Health
The aim of this study was to analyze the effects of both long- and short-term exposure to NO2 and PM with a diameter of 2.5 μm or less (PM2.5) on COVID-19 disease burden at the county level (German: Kreise) for Germany during the first outbreak of the pandemic in spring 2020.
We conducted an observational county-based study built on the methods and data sources utilized in a study by Koch et al. (2022) [13]. Here we provide an analysis of the effects of long-term exposure over 10-year and 2-year periods (2010–19 and 2018–19), time spans over which exposure to air pollutants might evoke chronic diseases. Additionally, we estimate the effects of short-term exposure (28 days and 7 days), a typical period within which inflammatory responses triggered by air pollutants might occur [10]. Finally, we included the time window during which the SARS-CoV-2 virus might survive while attached to particulate matter, which is 48 h [12]. We focused on COVID-19 incidence and mortality, but also leveraged data from German hospitals to include admission to intensive care units and the need for mechanical ventilation of COVID-19 patients in the analysis.
Ethical approval was obtained from the ethical commission of the Charité (EA2/038/21; head: Prof. Dr. Kaschina). Patient consent was waived, because no individual patient data were collected and data analysis was performed anonymously.
Setting and design
The units of analysis are German counties, which correspond to level 3 of the Nomenclature of Territorial Units for Statistics (NUTS-3). Most large cities and some smaller towns constitute their own counties. The first confirmed COVID-19 case in Germany was reported on January 27th, 2020. Germany’s Robert-Koch-Institute (RKI) counted 1,916,000 laboratory-confirmed COVID-19 cases and 33,000 deaths in 2020 [14]. To avoid bias in our dataset from COVID-19 spreading events, we limited our analysis to the first COVID-19 outbreak period (March 4th to May 16th), during which social distancing rules were implemented by the federal government and were therefore consistent over the whole country.
On March 15th, schools and national borders in Germany closed, followed by restaurants, shops and churches. Federal states started imposing social distancing rules from March 22nd onwards, limiting meetings between different households to two persons. Some states also restricted residents’ movement outside their homes. By April 15th, these rules started to be lifted. Schools reopened on May 4th and borders started to be reopened from May 15th [15]. Regions were affected unevenly by the first wave, with high incidence in the large southern states of Bavaria and Baden-Württemberg and in large cities, as well as cluster events during the February carnival festivities in the Rhine region. Many counties in the north and east were comparatively less affected during the first wave.
Data sources
COVID-19 data
The German Interdisciplinary Association for Intensive Care and Emergency Medicine (DIVI) register tracks intensive care capacities and COVID-19 patient numbers in German hospitals [16]. Daily reporting to the register became mandatory for all hospitals on April 16, 2020. Data on COVID-19 patient-days on intensive care units and on mechanical ventilation were extracted for the period between April 16 and May 16, 2020. Using demographic data, we calculated the rate of patient-days per 100,000 residents. The Robert-Koch-Institute (RKI), Germany’s national public health institute, provides a public-access database of COVID-19 cases and deaths reported for each county by local public health offices [14]. All 401 counties in Germany reported cases and deaths from January onwards. However, only 396 counties reported to the DIVI-register and consistent data is only available from April 16th onwards [16]. The primary analysis of all outcomes is therefore limited to the DIVI reporting counties and the period from April 16th to May 16th, when most restrictions on social distancing, shops and schools began to be lifted. However, considering the entire first wave starting on March 4th, the day social restrictions were imposed in most of the country, only 18% of cases and deaths occurred in the shorter period starting on April 16th. Therefore, this longer period, which aligns with the RKI’s definition of the first wave, was used for secondary analysis.
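The rate calculation described above can be sketched as follows. This is a minimal illustration in Python, not the R code used for the study; the function name `rate_per_100k` and the county figures are hypothetical.

```python
def rate_per_100k(count, population):
    """Express an outcome count (cases, deaths, patient-days) per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Hypothetical county: 120 ICU patient-days among 250,000 residents.
icu_rate = rate_per_100k(120, 250_000)
print(icu_rate)
```

Expressing every outcome on the same per-capita scale makes counties of very different sizes comparable.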
Air pollution data
As in Koch et al. (2022), the APExpose dataset (version 2.0) was used to analyze the association between long-term exposure to air pollution and COVID-19 outcomes [17]. The data combine observed data from the European Environment Agency’s AirBase database with modelled global reanalysis data from the Copernicus Atmosphere Monitoring Service (CAMS) to create a complete dataset for all German counties for the period 2010–2019. The data include parameters for nitrogen dioxide (NO2), nitrogen monoxide (NO), ozone (O3), and particulate matter with an aerodynamic diameter smaller than 2.5 μm and 10 μm (PM2.5 and PM10), as well as three different scenarios (urban, rural, average). The parameters for NO2, NO, PM10 and PM2.5 are given as annual means, while O3 is provided as the annual average of the daily maximum 8-h mean. To analyze the effects of long-term exposure to air pollution, we calculated the means of each pollutant in each county over the ten-year period (January 2010 – December 2019) and the two-year period (2018 – 2019) prior to the COVID-19 outbreak. NO and PM10 are highly correlated with NO2 and PM2.5, respectively, so no separate models for them were included in the main analysis.
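As a rough illustration of the long-term exposure averages, the sketch below computes the ten-year and two-year means from annual values for a single county. It is written in Python rather than the study's R, and the function name `period_mean`, the dictionary layout and the concentration values (units assumed to be µg/m³) are all hypothetical.

```python
from statistics import mean


def period_mean(annual_means, first_year, last_year):
    """Average the annual means of one pollutant over an inclusive range of years."""
    values = [
        value
        for year, value in annual_means.items()
        if first_year <= year <= last_year
    ]
    if not values:
        raise ValueError("no annual means available in the requested period")
    return mean(values)


# Hypothetical annual mean NO2 concentrations for a single county.
no2_annual = {
    2010: 24.0, 2011: 23.0, 2012: 22.0, 2013: 22.0, 2014: 21.0,
    2015: 21.0, 2016: 20.0, 2017: 20.0, 2018: 19.0, 2019: 18.0,
}

ten_year_no2 = period_mean(no2_annual, 2010, 2019)  # long-term window 2010-2019
two_year_no2 = period_mean(no2_annual, 2018, 2019)  # long-term window 2018-2019
print(ten_year_no2, two_year_no2)
```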
To analyze the association of short-term air pollution exposure and COVID-19 outcomes, a new dataset was created, based on the same sources and methodology as APExpose but at daily time resolution. The data contain daily observations for the period from March 4th to May 16th, 2020, with values for NO2, O3 and PM2.5 averaged over the preceding 48-h, 7-day, and 4-week time periods of interest.
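The trailing averages described above could be computed along the following lines. This Python sketch rests on two assumptions that the text does not confirm: the 48-h window is treated as the two preceding daily values, and the day being evaluated is excluded from its own window. The function name `trailing_mean` and the data are hypothetical.

```python
def trailing_mean(daily_values, index, window_days):
    """Mean of the `window_days` daily values that precede position `index`.

    Assumption: the day at `index` itself is not part of its own window.
    Returns None when fewer than `window_days` earlier values exist.
    """
    start = index - window_days
    if start < 0:
        return None
    window = daily_values[start:index]
    return sum(window) / window_days


# Hypothetical daily PM2.5 series for one county, oldest value first.
pm25_daily = [float(day) for day in range(1, 31)]

target = 28  # position of the date whose exposure is being computed
exposure_48h = trailing_mean(pm25_daily, target, 2)
exposure_7d = trailing_mean(pm25_daily, target, 7)
exposure_4w = trailing_mean(pm25_daily, target, 28)
print(exposure_48h, exposure_7d, exposure_4w)
```

Returning `None` for incomplete windows keeps dates without enough history from silently receiving an average over fewer days.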
Temperature time series for the German counties, averaged at the same time resolutions as those used for the air pollution data, were obtained from the CAMS reanalysis.
Demographic data and German index of social deprivation
The Federal Statistical Office of Germany provides data for each county on population size, area, and population distribution by age group and sex. Data from 2019 was used to calculate population density and the share of the population aged over 64 years, as well as the fraction of the population that is female. Population density is assumed to increase risk of transmission, and male sex and old age have been linked to increased risk of severe outcomes and death from COVID-19 [18, 19]. The German Index of Social Deprivation (GISD), developed by the RKI, is a measure of relative regional socio-economic disadvantage. The GISD indicators are selected to align with the concept of individual socio-economic status (SES) in social epidemiology, which combines education, occupation, and income dimensions. The index score is on a scale from 0 to 1. A higher score indicates more deprivation [20]. For each county, we calculated the mean GISD score between 2010 and 2019. Several ecological studies in Germany and other OECD countries have shown an association between income/social status and COVID-19 incidence. In the first wave of the pandemic, regions with higher income and education experienced higher incidence, possibly due to more international business and leisure travel [21, 22]. Studies found increased risk of mortality for socially deprived regions in Germany starting from the second wave of the pandemic, though findings for the first wave are less conclusive [23, 24]. Studies in the USA and UK have found increased risks for hospitalization and death for patients and regions with greater social deprivation [25,26,27,28].
Statistics
The analysis has four outcome variables: new cases (incidence), new deaths (mortality), patient-days on ICUs, and patient-days on mechanical ventilation. All outcomes were calculated as rates per 100,000 residents.
For the two long-term exposures, from 2010 to 2019 (ten years) and 2018 to 2019 (two years), air pollution and COVID-19 disease parameters were calculated as means per county. For short-term exposures, air pollution was calculated as averages over the preceding 2, 7 and 28 days for each date in a given county. The main analysis is limited to dates and counties for which data on patient-days on ICUs and mechanical ventilation were available through the DIVI-register, between April 16th and May 16th, 2020.
Separate models for mean annual NO2 and mean annual PM2.5 were fitted for the ten- and two-year exposure periods and for the 48 h, 7 days and 4 weeks preceding each date. All models were adjusted for the following confounders: the proportion of the population aged over 65, the proportion of the population that was female, the number of days between the first reported COVID-19 case and March 1st, population density, and the social deprivation index score (Supplement Material Figure S1). Several sensitivity analyses were conducted. Tri-pollutant models were fitted with NO2, PM2.5 and O3 as combined exposures. Short-term models were additionally adjusted for temperature (daily mean dry temperature) and were re-run for weekdays only (excluding Saturdays, Sundays and Mondays, since reporting of COVID-19 data from weekends could be delayed until Monday). For the outcomes incidence and mortality, the complete time period between March 4th and May 16th, 2020 was also evaluated. Separate models were fitted for each outcome, pollutant and exposure time window.
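The weekday restriction used in the sensitivity analysis can be illustrated with a short Python sketch. The function name `keep_reporting_days` and the one-week example are hypothetical; only the rule itself (dropping Saturdays, Sundays and Mondays) comes from the text.

```python
from datetime import date, timedelta

# date.weekday() numbers Monday as 0 and Sunday as 6.
EXCLUDED_WEEKDAYS = {0, 5, 6}  # Monday, Saturday, Sunday


def keep_reporting_days(dates):
    """Drop dates whose reported counts may be distorted by weekend reporting delays."""
    return [d for d in dates if d.weekday() not in EXCLUDED_WEEKDAYS]


# First seven days of the main analysis period (April 16th to 22nd, 2020).
first_day = date(2020, 4, 16)
study_week = [first_day + timedelta(days=offset) for offset in range(7)]

kept_days = keep_reporting_days(study_week)
print([d.isoformat() for d in kept_days])
```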
Because the annual mean O3 pollution (based on measurements of 8-h daily maximums) exceeded WHO-recommended thresholds in all counties between 2010 and 2019 but remained low during the first outbreak of the pandemic, it was excluded from the main analysis.
Negative binomial distributions were chosen due to overdispersion of the outcome variables. Because the control variables in the model operate at different scales (e.g. the fraction of the population that is female was measured on a scale from 0 to 1, whereas population density had a larger range of values), the “scale” function in R was used to standardize them through centering and scaling, in order to improve comparability between variables. Since many counties experienced some days without new cases, deaths or patients in intensive care, the data used to model short-term exposure contained a high proportion of zeros (27–95%, varying by outcome). Therefore, zero-inflation was applied. This was not necessary for the long-term exposure models, as only a few counties reported zeros for outcomes aggregated over the entire study period (0–14%). To account for the repeated measurements at county level and unmeasured factors affecting outcomes, random intercepts were fitted for each county. Statistical analysis was conducted in R Statistical Software (version 4.3.1). Data processing was conducted with the tidyverse package and models were fitted with the MASS and glmmTMB packages [29,30,31,32].
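Two of the preprocessing steps described above, standardizing covariates and checking the share of zeros in an outcome, can be sketched in Python. This is an illustration, not the study's R code: `standardize` centers on the mean and divides by the sample standard deviation, which is how R's `scale` behaves with its default arguments, and the function names and values are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Center values on their mean and divide by the sample standard deviation."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation (n - 1 denominator)
    if spread == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(value - center) / spread for value in values]


def zero_share(counts):
    """Fraction of observations that are exactly zero."""
    if not counts:
        raise ValueError("counts must not be empty")
    return sum(1 for count in counts if count == 0) / len(counts)


# Hypothetical population densities (residents per km²) for three counties.
density_z = standardize([100.0, 200.0, 300.0])

# Hypothetical daily death counts for one county.
share_of_zeros = zero_share([0, 0, 3, 0])
print(density_z, share_of_zeros)
```

After standardization, each coefficient refers to a change of one standard deviation in its covariate, which is what makes variables on very different scales comparable.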