Supporting the Case for Wastewater and Environmental Surveillance in Africa with Global Health Estimates data

Author

Carley Truyens

Published

December 11, 2025

Introduction

Wastewater and environmental surveillance (WES) is a powerful public health tool that helps us gain insight into the health of communities. By systematically collecting, testing, and analysing wastewater, faecal sludge, and contaminated surface water samples, we can better understand disease trends in the community that contributes to those samples. WES is more efficient than clinical disease surveillance because a single sample can represent hundreds to thousands of people. It is not biased by access to healthcare, healthcare-seeking behavior, or symptoms. For these reasons, WES is particularly well-suited to low-resource settings, where access to healthcare and financial resources are limited. The goal of this analysis is to better understand the leading causes of disease burden in Africa that can be monitored using WES.

Methods

The data for this analysis was obtained from the World Health Organization (WHO) Global Health Estimates (GHE) for 2021 (World Health Organization 2024). First, the Africa region top-20 summary table was saved as a .csv file and used to identify which WES-identifiable causes of disease to include in the analysis. Of the top 20, the top 10 causes of DALYs (disability-adjusted life years) for Africa were evaluated for suitability, based on a literature search.

Next, the GHE 2021 Summary Tables by country for all ages were cleaned up in Excel to remove all countries and causes that were not relevant to this project. The single-sex data was also removed. I then transposed columns to rows in Excel to get country-level disease burdens for each of the selected causes. The cells in the GHE 2021 Summary Tables spreadsheet are color-coded to denote data quality and completeness, so I created a new variable called data_quality and assigned a quality level to each country’s data according to its color. Lastly, I saved the resulting data as a .csv file. (Note: I completed some of these steps in Excel prior to learning how to do them in R.)
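The column-to-row transposition described above could also be done in R. The following is a minimal sketch, assuming a cause-by-country layout; the object names, country names, and values are hypothetical and are not taken from the GHE spreadsheet.

```r
library(tidyr)
library(tibble)

# Hypothetical cause-by-country layout: one row per cause, one column per country.
# All values are made up for illustration.
by_cause <- tribble(
  ~cause,    ~CountryA, ~CountryB,
  "malaria",       120,        45,
  "tb",             60,        30
)

# Transpose: stack the country columns into rows, then spread the causes
# into columns, giving one row per country.
by_country <- by_cause |>
  pivot_longer(cols = -cause, names_to = "country", values_to = "DALYs") |>
  pivot_wider(names_from = cause, values_from = DALYs)

by_country
```

Doing this step in a script rather than in Excel keeps a record of how the raw tables were reshaped.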

Load packages and import data

I first loaded the packages needed for the analysis, then imported the datasets from .csv files.

Code
library(tidyverse)
library(ggthemes)
library(here)
library(gt)
Code
ghe_afr <- read_csv(here::here("data/raw/ghe2021_daly_bycountry_2021_afr.csv"))
ghe_top20_afr <- read_csv(here::here("data/raw/ghe_top20_DALYs_AFR_2021.csv"))
#FacilityTypes_2024 <- read_csv(here::here("data/raw/JMP_AFR_san_FacilityTypes_2024.csv"))
#ServiceLevels_2024 <- read_csv(here::here("data/raw/JMP_AFR_san_ServiceLevels_2024.csv"))
#FacilityTypes_2021 <- read_csv(here::here("data/raw/JMP_AFR_san_FacilityTypes_2021.csv"))
#ServiceLevels_2021 <- read_csv(here::here("data/raw/JMP_AFR_san_ServiceLevels_2021.csv"))

Data Tidying

Once the data was imported, I pivoted the country-level disease burden data to ensure that each observation contained only one “cause” and one measurement of disease burden “DALYs”.

Code
ghe_afr_long <- ghe_afr |>
  pivot_longer(cols = all_cause:covid19,
               names_to = "cause",
               values_to = "DALYs")

Results

To select WES-identifiable causes of disease to include in the analysis, the top 10 DALYs by cause for Africa, shown in Table 1 below, were considered. A literature search was conducted to identify which of the top 10 causes could be tracked, at least partly, using WES.

Code
ghe_top20_afr |>
  slice_head(n = 11) |> # the "All Causes" row plus the top 10 causes
  gt() |>
  tab_footnote(footnote = md("Source: Global Health Estimates 2021: Disease burden by Cause, Age, Sex, by Country and by Region, 2000-2021. Geneva, World Health Organization; 2024.")) |>
  tab_style(
    style = cell_text(color = "red"), # highlight the six selected causes in red
    locations = cells_body(rows = c(2, 3, 5, 6, 8, 9)))
Table 1: Top 10 causes of DALYs in Africa, 2021

| Rank | Cause | DALYs (000s) | % DALY | DALYs per 100,000 |
|---|---|---|---|---|
| 0 | All Causes | 599504 | 100.0 | 50871.2 |
| 1 | Lower respiratory infections | 51910 | 8.7 | 4404.8 |
| 2 | Malaria | 50101 | 8.4 | 4251.3 |
| 3 | Preterm birth complications | 36729 | 6.1 | 3116.6 |
| 4 | Diarrhoeal diseases | 36503 | 6.1 | 3097.5 |
| 5 | HIV/AIDS | 26945 | 4.5 | 2286.4 |
| 6 | Birth asphyxia and birth trauma | 26762 | 4.5 | 2270.9 |
| 7 | Tuberculosis | 25901 | 4.3 | 2197.9 |
| 8 | COVID-19 | 17566 | 2.9 | 1490.6 |
| 9 | Stroke | 15207 | 2.5 | 1290.4 |
| 10 | Road injury | 13915 | 2.3 | 1180.8 |

Source: Global Health Estimates 2021: Disease burden by Cause, Age, Sex, by Country and by Region, 2000-2021. Geneva, World Health Organization; 2024.

Of the top 10 causes, the following were selected (also shown in red in Table 1), based on evidence from the literature that the pathogens at least partly responsible for these diseases can be detected in wastewater, faecal sludge, and/or surface water samples:

  • Lower respiratory infections
  • Malaria
  • Diarrhoeal diseases
  • HIV/AIDS
  • Tuberculosis
  • COVID-19

Together, these 6 causes were responsible for nearly 35% of all DALYs in Africa in 2021. See Table 2 and Figure 1 for a summary of DALYs from these 6 causes in African countries.
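The "nearly 35%" figure can be checked directly from Table 1. The sketch below copies the % DALY and DALYs (000s) values for the six causes highlighted in red; the vector names are only illustrative labels.

```r
# % DALY values for the six selected causes, copied from Table 1
pct_daly <- c(lri = 8.7, malaria = 8.4, dd = 6.1,
              hiv_aids = 4.5, tb = 4.3, covid19 = 2.9)
pct_total <- sum(pct_daly)
pct_total  # 34.9

# The same share computed from the DALY counts (000s) in Table 1
dalys <- c(lri = 51910, malaria = 50101, dd = 36503,
           hiv_aids = 26945, tb = 25901, covid19 = 17566)
all_causes <- 599504
share <- round(100 * sum(dalys) / all_causes, 1)
share  # 34.8
```

The small difference between the two results comes from rounding in the published % DALY column.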

Code
#removed the all_cause cause from summary statistics
ghe_afr_long_filtered <- ghe_afr_long |>
  filter(cause != "all_cause") 

ghe_afr_long_filtered |>
  group_by(cause) |>
  summarise(min = min(DALYs),
            max = max(DALYs),
            mean = mean(DALYs),
            median = median(DALYs),
            sd = sd(DALYs)) |>
  gt() |>
  fmt_number(decimals = 2) |>
  cols_label(
    cause = "Cause",
    min = "Minimum",
    max = "Maximum",
    mean = "Mean",
    median = "Median",
    sd = "Std Deviation")
Table 2: Summary statistics for DALYs by cause in Africa, 2021

| Cause | Minimum | Maximum | Mean | Median | Std Deviation |
|---|---|---|---|---|---|
| covid19 | 1.97 | 4,758.44 | 470.60 | 211.03 | 904.91 |
| dd | 0.19 | 12,834.97 | 716.73 | 273.21 | 1,795.12 |
| hiv_aids | 0.01 | 3,697.89 | 503.94 | 169.13 | 798.87 |
| lri | 1.31 | 15,724.62 | 1,045.22 | 473.36 | 2,253.53 |
| malaria | 0.00 | 16,901.82 | 941.62 | 253.74 | 2,457.95 |
| tb | 0.06 | 6,857.44 | 498.38 | 162.28 | 1,052.48 |

Code
ggplot(data = ghe_afr_long_filtered,
       mapping = aes(x = cause,
                     y = DALYs,
                     fill = cause)) +
  geom_boxplot() +
  theme_minimal() +
  theme(legend.position = "none") +
  coord_cartesian(ylim = c(0, 1500)) + 
  labs(x = "Cause of DALYs",
       y = "DALYs (000s)") +
  theme(axis.title.x = element_text(vjust = -1))
Figure 1: DALYs by cause in Africa, 2021
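
The plotting code for Figure 1 zooms the y-axis with coord_cartesian(), which matters because the maxima in Table 2 are far above 1,500. The sketch below uses a small made-up data frame (not the GHE data) to illustrate the difference between zooming and setting scale limits with ylim(), which drops out-of-range values before the boxplot statistics are computed.

```r
library(ggplot2)

# Made-up values with one large outlier, for illustration only
df <- data.frame(cause = "x", DALYs = c(1, 2, 3, 4, 100))

p <- ggplot(df, aes(x = cause, y = DALYs)) +
  geom_boxplot()

# Zoom only: the median is still computed from all five values
zoomed <- layer_data(p + coord_cartesian(ylim = c(0, 10)))$middle

# Scale limits: the value 100 is removed before the statistics are computed
clipped <- suppressWarnings(layer_data(p + ylim(0, 10)))$middle

c(zoomed = zoomed, clipped = clipped)
```

With coord_cartesian(), the boxes in Figure 1 still summarise every country, even though the most extreme values fall outside the visible range.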

Data transformation

I added variables to the ghe_afr dataset to better understand what percentage of all DALYs in a country could be attributed to causes that could be at least partially monitored using WES. This data is plotted in a bar graph in Figure 2, shown below.

Code
ghe_afr_expanded <- ghe_afr |>
  mutate(causes_WES = tb + hiv_aids + dd + malaria + lri + covid19)

ghe_afr_expanded2 <- ghe_afr_expanded |>
  mutate(causes_WES_percent = 100 * causes_WES / all_cause)
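
As a worked illustration of this calculation, the sketch below applies the same formula to a made-up two-country table. The column names match those used above, but the values are hypothetical, and the wes_causes vector with rowSums() is an alternative way to write the sum, not the code used in this analysis.

```r
library(dplyr)
library(tibble)

# Hypothetical extract with the same column names as ghe_afr; values are illustrative
toy <- tribble(
  ~country,   ~all_cause, ~tb, ~hiv_aids, ~dd, ~malaria, ~lri, ~covid19,
  "CountryA",       1000,  50,        40,  60,      100,   80,       20,
  "CountryB",        500,  10,         5,  15,        0,   30,       40
)

# Keeping the selected causes in one vector means the list is written only once
wes_causes <- c("tb", "hiv_aids", "dd", "malaria", "lri", "covid19")

toy_pct <- toy |>
  mutate(causes_WES = rowSums(across(all_of(wes_causes))),
         causes_WES_percent = 100 * causes_WES / all_cause)

toy_pct |>
  select(country, causes_WES, causes_WES_percent)
```

In this toy table, CountryA has 350 of 1,000 DALYs from the selected causes (35%) and CountryB has 100 of 500 (20%).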
Code
ggplot(data = ghe_afr_expanded2,
       aes(x = reorder(country, causes_WES_percent),
           y = causes_WES_percent,
           fill = country)) +
  geom_col(position = position_dodge(width = 0.8)) +
  theme_minimal() +
  theme(legend.position = "none") +
  theme(axis.text.x = element_text(angle = 90, vjust = .5, hjust = 1)) +
  labs(x = "Country", y = "Percentage of DALYs")
Figure 2: Percentage of DALYs attributed to causes that could be monitored using WES, by country

Data quality

The GHE data is rated by WHO for quality and completeness. Of the 54 countries in Africa, 49 have GHE data rated as poor in quality, as shown in Figure 3 below. Using WES data to supplement the GHE data could give countries more information on which to base health policy.

Code
qual_levels <- c("very low",
                 "low",
                 "medium",
                 "high")

data_quality_levels <- ghe_afr |>
  mutate(data_quality = factor(data_quality, levels = qual_levels))
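
One thing to watch when setting factor levels by hand is that any value not spelled exactly like one of the levels silently becomes NA. The sketch below shows this with a made-up vector; the misspelled "High" is deliberate and is not taken from the GHE data.

```r
qual_levels <- c("very low", "low", "medium", "high")

# Made-up input; "High" deliberately does not match the level "high"
x <- c("low", "High", "very low", "medium")
f <- factor(x, levels = qual_levels)

levels(f)      # order follows qual_levels, not the alphabet
sum(is.na(f))  # number of values that matched no level
```

Running sum(is.na(...)) on the real data_quality column after conversion would be a quick way to confirm that every country was assigned a level.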
Code
ggplot(data = data_quality_levels,
       mapping = aes(x = data_quality,
                     fill = data_quality)) +
  geom_bar() +
  theme_minimal() +
  theme(legend.position = "none") +
  geom_text(
    stat = "count",
    aes(label = after_stat(count)), # label each bar with its country count
    vjust = -0.5) +
  labs(x = "Data Quality", y = "# of countries")
Figure 3: Quality of the GHE data, as rated by WHO

Conclusions

  • WES has the potential to help identify communities where infectious diseases are present or increasing (as relevant), so that ministries of health can more efficiently target those communities with public health interventions.

  • Countries with a higher percentage of DALYs attributed to causes that could be monitored using WES may achieve more impact from WES data.

  • Using WES data to supplement the GHE data, which is of poor quality in the vast majority of African countries, could give ministries of health more information on which to base health policy.

References

Alshehri, Balghsim, Olivia N. Birch, and Justin C. Greaves. 2025. “Monitoring Multiple Sexually Transmitted Pathogens Through Wastewater Surveillance.” Pathogens 14 (6): 562. https://doi.org/10.3390/pathogens14060562.
Barnes, Kayla G., Joshua I. Levy, Jillian Gauld, Jonathan Rigby, Oscar Kanjerwa, Christopher B. Uzzell, Chisomo Chilupsya, et al. 2023. “Utilizing River and Wastewater as a SARS-CoV-2 Surveillance Tool in Settings with Limited Formal Sewage Systems.” Nature Communications 14 (1): 7883. https://doi.org/10.1038/s41467-023-43047-y.
Diamond, Megan B., Elizabeth Yee, Manisha Bhinge, and Samuel V. Scarpino. 2023. “Wastewater Surveillance Facilitates Climate Change–Resilient Pathogen Monitoring.” Science Translational Medicine 15 (718): eadi7831. https://doi.org/10.1126/scitranslmed.adi7831.
Huang, Yue, Nan Zhou, Shihan Zhang, Youqin Yi, Ying Han, Minqi Liu, Yue Han, et al. 2022. “Norovirus Detection in Wastewater and Its Correlation with Human Gastroenteritis: A Systematic Review and Meta-Analysis.” Environmental Science and Pollution Research 29 (16): 22829–42. https://doi.org/10.1007/s11356-021-18202-x.
Jensen, K. Erik. 1954. “Presence and Destruction of Tubercle Bacilli in Sewage.” Bulletin of the World Health Organization 10 (2): 171–79.
Maposa, Sibonginkosi, Mukhlid Yousif, Chenoa Sankar, Victor Vusi Mabasa, Nosihle Msomi, Emmanuel Phalane, Sipho Gwala, et al. 2025. “Establishment of a Wastewater-Based Surveillance Network to Support Infectious Disease Surveillance in South Africa.” International Journal of Infectious Diseases, Abstracts from the International Congress on Infectious Diseases 2024, 152 (March): 107381. https://doi.org/10.1016/j.ijid.2024.107381.
Medema, Gertjan, Leo Heijnen, Goffe Elsinga, Ronald Italiaander, and Anke Brouwer. 2020. “Presence of SARS-Coronavirus-2 RNA in Sewage and Correlation with Reported COVID-19 Prevalence in the Early Stage of the Epidemic in The Netherlands.” Environmental Science & Technology Letters 7 (7): 511–16. https://doi.org/10.1021/acs.estlett.0c00357.
Mtetwa, Hlengiwe N., Isaac D. Amoah, Sheena Kumari, Faizal Bux, and Poovendhree Reddy. 2022. “Molecular Surveillance of Tuberculosis-Causing Mycobacteria in Wastewater.” Heliyon 8 (2). https://doi.org/10.1016/j.heliyon.2022.e08910.
———. 2023. “Surveillance of Multidrug-Resistant Tuberculosis in Sub-Saharan Africa Through Wastewater-Based Epidemiology.” Heliyon 9 (8). https://doi.org/10.1016/j.heliyon.2023.e18302.
Reboud, Julien, Gaolian Xu, Alice Garrett, Moses Adriko, Zhugen Yang, Edridah M. Tukahebwa, Candia Rowell, and Jonathan M. Cooper. 2019. “Paper-Based Microfluidics for DNA Diagnostics of Malaria in Low Resource Underserved Rural Communities.” Proceedings of the National Academy of Sciences 116 (11): 4834–42. https://doi.org/10.1073/pnas.1812296116.
Saasa, Ngonda, Ethel M’kandawire, Joseph Ndebe, Mulenga Mwenda, Fred Chimpukutu, Andrew Nalishuwa Mukubesa, Fred Njobvu, et al. 2024. “Detection of Human Adenovirus and Rotavirus in Wastewater in Lusaka, Zambia: Demonstrating the Utility of Environmental Surveillance for the Community.” Pathogens 13 (6): 486. https://doi.org/10.3390/pathogens13060486.
Tiwari, Ananda, Taru Miller, Vito Baraka, Marc Christian Tahita, Vivi Maketa, Bérenger Kaboré, Paul Tunde Kingpriest, et al. 2025. “Strengthening Pathogen and Antimicrobial Resistance Surveillance Through Environmental Monitoring in Sub-Saharan Africa: Stakeholder Perspectives.” International Journal of Hygiene and Environmental Health 270 (September): 114651. https://doi.org/10.1016/j.ijheh.2025.114651.
Tubatsi, Gosaitse, and Lemme P. Kebaabetswe. 2022. “Detection of Enteric Viruses from Wastewater and River Water in Botswana.” Food and Environmental Virology 14 (2): 157–69. https://doi.org/10.1007/s12560-022-09513-4.
Wolfe, Marlene K., Meri R. J. Varkila, Alessandro Zulli, Julie Parsonnet, and Alexandria B. Boehm. 2024. “Detection and Quantification of Human Immunodeficiency Virus-1 (HIV-1) Total Nucleic Acids in Wastewater Settled Solids from Two California Communities.” Applied and Environmental Microbiology 90 (12): e01477–24. https://doi.org/10.1128/aem.01477-24.
World Health Organization. 2024. “Global Health Estimates 2021 Summary Tables: DALYs by Cause, Age and Sex, by WHO Region 2000-2021.” Geneva.
