Supporting the Case for Wastewater and Environmental Surveillance in Africa with Global Health Estimates data

Author

Carley Truyens

Published

December 11, 2025

Introduction

Wastewater and environmental surveillance (WES) is a powerful public health tool that helps us gain insight into the health of communities. By systematically collecting, testing, and analysing wastewater, faecal sludge, and contaminated surface water samples, we can better understand disease trends in the community that contributes to those samples. WES is more efficient than clinical disease surveillance, because a single sample can represent hundreds to thousands of people. It is not biased by access to healthcare, healthcare-seeking behavior, or symptoms. For these reasons, WES is particularly well-suited to low-resource settings, where access to healthcare and financial resources are limited. The goal of this analysis is to better understand the leading causes of disease burden in Africa that can be monitored using WES.

Methods

The data for this analysis was obtained from the World Health Organization (WHO) Global Health Estimates (GHE) for 2021 (World Health Organization 2024). First, the Africa region top-20 summary table was saved as a .csv file and used to identify which WES-identifiable causes of disease to include in the analysis. Of these 20, the top 10 causes of DALYs (disability-adjusted life years) in Africa were evaluated for suitability, based on a literature search.

Next, the GHE 2021 Summary Tables by country for all ages were cleaned in Excel to remove all countries and causes that were not relevant to this project. The single-sex data was also removed. I then transposed columns to rows in Excel to get country-level disease burdens for each of the selected causes. The cells in the GHE 2021 Summary Tables spreadsheet are color-coded to denote data quality and completeness, so I created a new variable called data_quality and assigned a quality level to each country’s data according to its color. Lastly, I saved the resulting data as a .csv file. (Note: I completed some of these steps in Excel prior to learning how to do them in R.)
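
The Excel steps described above could plausibly be reproduced in R. The following is only a minimal sketch under stated assumptions: the toy layout (one row per cause, one column per country), the country names, the values, and the colour-to-quality mapping are all hypothetical and are not taken from the GHE spreadsheet. It also assumes the cell colours were noted by hand in a separate lookup table.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical toy layout: one row per cause, one column per country.
# All values are made up for illustration; they are not GHE figures.
by_cause <- tribble(
  ~cause,    ~CountryA, ~CountryB,
  "malaria",       120,        45,
  "tb",             60,        30
)

# Transpose: move countries into rows, then spread causes into columns.
by_country <- by_cause |>
  pivot_longer(cols = -cause, names_to = "country", values_to = "DALYs") |>
  pivot_wider(names_from = cause, values_from = DALYs)

# Hypothetical lookup of the cell colour recorded for each country.
colour_key <- tribble(
  ~country,   ~cell_colour,
  "CountryA", "red",
  "CountryB", "green"
)

# Assumed mapping from colour to quality label (illustrative only).
by_country <- by_country |>
  left_join(colour_key, by = "country") |>
  mutate(data_quality = case_when(cell_colour == "red"   ~ "low",
                                  cell_colour == "green" ~ "high")) |>
  select(-cell_colour)

by_country
```

The result has one row per country, one column per cause, and a data_quality column, which is the shape the later code expects.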

Load packages and import data

I first loaded the packages needed for the analysis, then imported the datasets from .csv files.

Code

library(tidyverse)
library(ggthemes)
library(here)
library(gt)

Code

ghe_afr <- read_csv(here::here("data/raw/ghe2021_daly_bycountry_2021_afr.csv"))
ghe_top20_afr <- read_csv(here::here("data/raw/ghe_top20_DALYs_AFR_2021.csv"))
#FacilityTypes_2024 <- read_csv(here::here("data/raw/JMP_AFR_san_FacilityTypes_2024.csv"))
#ServiceLevels_2024 <- read_csv(here::here("data/raw/JMP_AFR_san_ServiceLevels_2024.csv"))
#FacilityTypes_2021 <- read_csv(here::here("data/raw/JMP_AFR_san_FacilityTypes_2021.csv"))
#ServiceLevels_2021 <- read_csv(here::here("data/raw/JMP_AFR_san_ServiceLevels_2021.csv"))

Data Tidying

Once the data was imported, I pivoted the country-level disease burden data to ensure that each observation contained only one “cause” and one measurement of disease burden “DALYs”.

Code

ghe_afr_long <- ghe_afr |>
  pivot_longer(cols = all_cause:covid19,
               names_to = "cause",
               values_to = "DALYs")
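
To illustrate what the pivot does, here is a small sketch on a toy table. The column names mirror those used above, but the countries and values are made up and the real ghe_afr file contains more causes.

```r
library(tidyr)
library(tibble)

# Toy wide table; values are made up for illustration.
toy_wide <- tribble(
  ~country,   ~all_cause, ~malaria, ~covid19,
  "CountryA",        500,      120,       40,
  "CountryB",        300,       45,       25
)

# Same call pattern as above: every column from all_cause to covid19
# becomes a (cause, DALYs) pair on its own row.
toy_long <- toy_wide |>
  pivot_longer(cols = all_cause:covid19,
               names_to = "cause",
               values_to = "DALYs")

# Two countries x three causes gives six rows, one per country-cause pair.
nrow(toy_long)
```

Each row of the long table now holds a single country, a single cause, and a single DALY value, which is the layout that group_by() and ggplot() work with most naturally.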

Results

To select WES-identifiable causes of disease to include in the analysis, the top 10 DALYs by cause for Africa, shown in Table 1 below, were considered. A literature search was conducted to identify which of the top 10 causes could be tracked, at least partly, using WES.

Code

ghe_top20_afr |>
  slice_head(n = 11) |> # Row 1 is the "All Causes" total, followed by the top 10 causes
  gt() |>
  tab_footnote(footnote = md("Source: Global Health Estimates 2021: Disease burden by Cause, Age, Sex, by Country and by Region, 2000-2021. Geneva, World Health Organization; 2024.")) |>
  tab_style(
    style = cell_text(color = "red"), # Highlight the six WES-identifiable causes in red
    locations = cells_body(rows = c(2, 3, 5, 6, 8, 9)))

Table 1: Top 10 causes of DALYs in Africa, 2021

Rank	Cause	DALYs (000s)	% DALY	DALYs per 100,000
0	All Causes	599504	100.0	50871.2
1	Lower respiratory infections	51910	8.7	4404.8
2	Malaria	50101	8.4	4251.3
3	Preterm birth complications	36729	6.1	3116.6
4	Diarrhoeal diseases	36503	6.1	3097.5
5	HIV/AIDS	26945	4.5	2286.4
6	Birth asphyxia and birth trauma	26762	4.5	2270.9
7	Tuberculosis	25901	4.3	2197.9
8	COVID-19	17566	2.9	1490.6
9	Stroke	15207	2.5	1290.4
10	Road injury	13915	2.3	1180.8
Source: Global Health Estimates 2021: Disease burden by Cause, Age, Sex, by Country and by Region, 2000-2021. Geneva, World Health Organization; 2024.

Of the top 10 causes, the following were selected (also shown in red in Table 1), based on evidence from the literature that the pathogens that are at least partly responsible for these diseases can be detected in wastewater, faecal sludge, and/or surface water samples:

Lower respiratory infections (Tiwari et al. 2025)
Malaria (Diamond et al. 2023; Reboud et al. 2019)
Diarrhoeal diseases (Huang et al. 2022; Tubatsi and Kebaabetswe 2022; Saasa et al. 2024)
HIV/AIDS (Wolfe et al. 2024; Alshehri, Birch, and Greaves 2025)
Tuberculosis (Jensen 1954; Mtetwa et al. 2022, 2023)
COVID-19 (Medema et al. 2020; Maposa et al. 2025; Barnes et al. 2023)

Together, these 6 causes were responsible for nearly 35% of all DALYs in Africa in 2021. See Table 2 and Figure 1 for a summary of DALYs from these 6 causes in African countries.
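
The combined share can be checked directly from the values in Table 1. This is a quick arithmetic sketch rather than part of the original analysis; the variable names are illustrative.

```r
# DALYs (000s) copied from Table 1 for the six selected causes.
wes_dalys <- c(lri = 51910, malaria = 50101, dd = 36503,
               hiv_aids = 26945, tb = 25901, covid19 = 17566)

# All-cause DALYs (000s) for Africa, also from Table 1.
all_cause <- 599504

# Share of all DALYs attributable to the six causes, in percent.
wes_total <- sum(wes_dalys)
wes_share <- 100 * wes_total / all_cause
round(wes_share, 1)
```

This evaluates to about 34.8%, in line with the "nearly 35%" quoted above; adding the rounded % DALY column of Table 1 instead gives 34.9%.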

Code

# Exclude the all_cause total so the summary statistics cover individual causes only
ghe_afr_long_filtered <- ghe_afr_long |>
  filter(cause != "all_cause")

ghe_afr_long_filtered |>
  group_by(cause) |>
  summarise(min = min(DALYs),
            max = max(DALYs),
            mean = mean(DALYs),
            median = median(DALYs),
            sd = sd(DALYs)) |>
  gt() |>
  fmt_number(decimals = 2) |>
  cols_label(
    cause = "Cause",
    min = "Minimum",
    max = "Maximum",
    mean = "Mean",
    median = "Median",
    sd = "Std Deviation")

Table 2: Summary statistics for DALYs by cause in Africa, 2021

Cause	Minimum	Maximum	Mean	Median	Std Deviation
covid19	1.97	4,758.44	470.60	211.03	904.91
dd	0.19	12,834.97	716.73	273.21	1,795.12
hiv_aids	0.01	3,697.89	503.94	169.13	798.87
lri	1.31	15,724.62	1,045.22	473.36	2,253.53
malaria	0.00	16,901.82	941.62	253.74	2,457.95
tb	0.06	6,857.44	498.38	162.28	1,052.48

Code

ggplot(data = ghe_afr_long_filtered,
       mapping = aes(x = cause,
                     y = DALYs,
                     fill = cause)) +
  geom_boxplot() +
  theme_minimal() +
  theme(legend.position = "none") +
  coord_cartesian(ylim = c(0, 1500)) + 
  labs(x = "Cause of DALYs",
       y = "DALYs (000s)") +
  theme(axis.title.x = element_text(vjust = -1))

Figure 1: DALYs by cause in Africa, 2021

Data transformation

I added variables to the ghe_afr dataset to better understand what percentage of all DALYs in a country could be attributed to causes that could be at least partially monitored using WES. This data is plotted in a bar graph in Figure 2, shown below.

Code

ghe_afr_expanded <- ghe_afr |>
  mutate(causes_WES = tb + hiv_aids + dd + malaria + lri + covid19)

ghe_afr_expanded2 <- ghe_afr_expanded |>
  mutate(causes_WES_percent = 100 * causes_WES / all_cause)

Code

ggplot(data = ghe_afr_expanded2,
       aes(x = reorder(country, causes_WES_percent),
           y = causes_WES_percent,
           fill = country)) +
  geom_col(position = position_dodge(width = 0.8)) +
  theme_minimal() +
  theme(legend.position = "none") +
  theme(axis.text.x = element_text(angle = 90, vjust = .5, hjust = 1)) +
  labs(x = "Country", y = "Percentage of DALYs")

Figure 2: Percentage of DALYs attributed to causes that could be monitored using WES, by country
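
As a worked illustration of the percentage calculation behind Figure 2, the sketch below applies the same formula to two made-up countries. The column names follow ghe_afr, but the values are invented, and summing with rowSums(across(...)) over a named vector of causes is an alternative I am assuming here, not the approach used above.

```r
library(dplyr)
library(tibble)

# Toy data with the same column names as ghe_afr; values are made up.
toy <- tribble(
  ~country,   ~all_cause, ~tb, ~hiv_aids, ~dd, ~malaria, ~lri, ~covid19,
  "CountryA",       1000,  50,        50, 100,      100,  100,      100,
  "CountryB",        500,  10,        10,  10,       10,   30,       30
)

# The six WES-identifiable causes, kept in one place so the list is easy to edit.
wes_causes <- c("tb", "hiv_aids", "dd", "malaria", "lri", "covid19")

toy_pct <- toy |>
  mutate(causes_WES = rowSums(across(all_of(wes_causes))),
         causes_WES_percent = 100 * causes_WES / all_cause)

toy_pct |>
  select(country, causes_WES, causes_WES_percent)
```

In this toy example the six causes account for 50% of DALYs in CountryA (500 of 1000) and 20% in CountryB (100 of 500).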

Data quality

The GHE data is rated by WHO for quality and completeness. Of the 54 countries in Africa, 49 have GHE data rated as poor, as shown in Figure 3 below. Using WES data to supplement the GHE data could give countries more information on which to base health policy.

Code

qual_levels <- c("very low",
                 "low",
                 "medium",
                 "high")

data_quality_levels <- ghe_afr |>
  mutate(data_quality = factor(data_quality, levels = qual_levels))

Code

ggplot(data = data_quality_levels,
       mapping = aes(x = data_quality,
                     fill = data_quality)) +
  geom_bar() +
  theme_minimal() +
  theme(legend.position = "none") +
  # Label each bar with its country count
  geom_text(
    stat = "count",
    aes(label = after_stat(count)),
    vjust = -0.5) +
  labs(x = "Data Quality", y = "# of countries")

Figure 3: Quality of the GHE data, as rated by WHO
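
The bar heights in Figure 3 can also be tabulated without plotting. The sketch below is illustrative only: the five countries and their ratings are made up, and only the level names come from the code above.

```r
library(dplyr)
library(tibble)

qual_levels <- c("very low", "low", "medium", "high")

# Toy ratings for five hypothetical countries; these are not the WHO ratings.
toy_quality <- tibble(
  country = paste0("Country", LETTERS[1:5]),
  data_quality = factor(c("low", "very low", "low", "high", "low"),
                        levels = qual_levels)
)

# .drop = FALSE keeps levels with no countries, so empty categories show as 0.
quality_counts <- toy_quality |>
  count(data_quality, .drop = FALSE)

quality_counts
```

Because data_quality is a factor with explicit levels, the rows come out in the intended order (very low, low, medium, high) rather than alphabetically, and a level with no countries still appears with a count of zero.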

Conclusions

WES has the potential to help identify communities where infectious diseases are present or increasing, so that ministries of health can target those communities with public health interventions more efficiently.
Countries with a higher percentage of DALYs attributed to causes that could be monitored using WES may achieve more impact from WES data.
Using WES data to supplement the GHE data, which is of poor quality in the vast majority of African countries, could give ministries of health more information on which to base health policy.

References

Alshehri, Balghsim, Olivia N. Birch, and Justin C. Greaves. 2025. “Monitoring Multiple Sexually Transmitted Pathogens Through Wastewater Surveillance.” Pathogens 14 (6): 562. https://doi.org/10.3390/pathogens14060562.

Barnes, Kayla G., Joshua I. Levy, Jillian Gauld, Jonathan Rigby, Oscar Kanjerwa, Christopher B. Uzzell, Chisomo Chilupsya, et al. 2023. “Utilizing River and Wastewater as a SARS-CoV-2 Surveillance Tool in Settings with Limited Formal Sewage Systems.” Nature Communications 14 (1): 7883. https://doi.org/10.1038/s41467-023-43047-y.

Diamond, Megan B., Elizabeth Yee, Manisha Bhinge, and Samuel V. Scarpino. 2023. “Wastewater Surveillance Facilitates Climate Change–Resilient Pathogen Monitoring.” Science Translational Medicine 15 (718): eadi7831. https://doi.org/10.1126/scitranslmed.adi7831.

Huang, Yue, Nan Zhou, Shihan Zhang, Youqin Yi, Ying Han, Minqi Liu, Yue Han, et al. 2022. “Norovirus Detection in Wastewater and Its Correlation with Human Gastroenteritis: A Systematic Review and Meta-Analysis.” Environmental Science and Pollution Research 29 (16): 22829–42. https://doi.org/10.1007/s11356-021-18202-x.

Jensen, K. Erik. 1954. “Presence and Destruction of Tubercle Bacilli in Sewage.” Bulletin of the World Health Organization 10 (2): 171–79.

Maposa, Sibonginkosi, Mukhlid Yousif, Chenoa Sankar, Victor Vusi Mabasa, Nosihle Msomi, Emmanuel Phalane, Sipho Gwala, et al. 2025. “Establishment of a Wastewater-Based Surveillance Network to Support Infectious Disease Surveillance in South Africa.” International Journal of Infectious Diseases, Abstracts from the International Congress on Infectious Diseases 2024, 152 (March): 107381. https://doi.org/10.1016/j.ijid.2024.107381.

Medema, Gertjan, Leo Heijnen, Goffe Elsinga, Ronald Italiaander, and Anke Brouwer. 2020. “Presence of SARS-Coronavirus-2 RNA in Sewage and Correlation with Reported COVID-19 Prevalence in the Early Stage of the Epidemic in The Netherlands.” Environmental Science & Technology Letters 7 (7): 511–16. https://doi.org/10.1021/acs.estlett.0c00357.

Mtetwa, Hlengiwe N., Isaac D. Amoah, Sheena Kumari, Faizal Bux, and Poovendhree Reddy. 2022. “Molecular Surveillance of Tuberculosis-Causing Mycobacteria in Wastewater.” Heliyon 8 (2). https://doi.org/10.1016/j.heliyon.2022.e08910.

———. 2023. “Surveillance of Multidrug-Resistant Tuberculosis in Sub-Saharan Africa Through Wastewater-Based Epidemiology.” Heliyon 9 (8). https://doi.org/10.1016/j.heliyon.2023.e18302.

Reboud, Julien, Gaolian Xu, Alice Garrett, Moses Adriko, Zhugen Yang, Edridah M. Tukahebwa, Candia Rowell, and Jonathan M. Cooper. 2019. “Paper-Based Microfluidics for DNA Diagnostics of Malaria in Low Resource Underserved Rural Communities.” Proceedings of the National Academy of Sciences 116 (11): 4834–42. https://doi.org/10.1073/pnas.1812296116.

Saasa, Ngonda, Ethel M’kandawire, Joseph Ndebe, Mulenga Mwenda, Fred Chimpukutu, Andrew Nalishuwa Mukubesa, Fred Njobvu, et al. 2024. “Detection of Human Adenovirus and Rotavirus in Wastewater in Lusaka, Zambia: Demonstrating the Utility of Environmental Surveillance for the Community.” Pathogens 13 (6): 486. https://doi.org/10.3390/pathogens13060486.

Tiwari, Ananda, Taru Miller, Vito Baraka, Marc Christian Tahita, Vivi Maketa, Bérenger Kaboré, Paul Tunde Kingpriest, et al. 2025. “Strengthening Pathogen and Antimicrobial Resistance Surveillance Through Environmental Monitoring in Sub-Saharan Africa: Stakeholder Perspectives.” International Journal of Hygiene and Environmental Health 270 (September): 114651. https://doi.org/10.1016/j.ijheh.2025.114651.

Tubatsi, Gosaitse, and Lemme P. Kebaabetswe. 2022. “Detection of Enteric Viruses from Wastewater and River Water in Botswana.” Food and Environmental Virology 14 (2): 157–69. https://doi.org/10.1007/s12560-022-09513-4.

Wolfe, Marlene K., Meri R. J. Varkila, Alessandro Zulli, Julie Parsonnet, and Alexandria B. Boehm. 2024. “Detection and Quantification of Human Immunodeficiency Virus-1 (HIV-1) Total Nucleic Acids in Wastewater Settled Solids from Two California Communities.” Applied and Environmental Microbiology 90 (12): e01477–24. https://doi.org/10.1128/aem.01477-24.

World Health Organization. 2024. “Global Health Estimates 2021 Summary Tables: DALYs by Cause, Age and Sex, by WHO Region 2000-2021.” Geneva.

Reuse

CC BY 4.0