# install packages (only if not already installed)
if (!requireNamespace("patchwork", quietly = TRUE)) install.packages("patchwork")
# load packages
library(tidyverse)
library(ggthemes)
library(here)
library(knitr)
library(gt)
library(patchwork)

Although water covers most of the Earth’s surface, only a small fraction is accessible freshwater suitable for human use. Ensuring safe and reliable drinking water has become a global priority as population growth, environmental pressures, and infrastructural inequalities intensify challenges to water security (Arora and Mishra 2022). To track progress toward universal access, the WHO/UNICEF Joint Monitoring Programme (JMP) compiles global estimates of drinking water services across countries and over time. This project uses these data to examine trends in drinking water access among member countries of the Community of Portuguese Language Countries (CPLP) and to compare inequalities between rural and urban populations.
The data used in this project were obtained from the WHO/UNICEF JMP (https://washdata.org/data). The dataset includes country-level drinking water indicators reported between 2000 and 2024, along with associated demographic and socioeconomic classifications such as SDG region, income group, and population. Data were downloaded in raw Excel format and processed to standardise variable names, reconstruct multi-row headers, and convert the wide-format service-level indicators into a tidy long format. For the purposes of this analysis, the dataset was restricted to CPLP countries and filtered to retain the service-level indicators (e.g., at least basic, limited, unimproved, safely managed) relevant to drinking water access across rural, urban, and total populations.
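The wide-to-long conversion described above can be illustrated with a minimal sketch. The toy data frame, its column names, and its values are assumptions made for illustration; they are not the actual JMP headers or the processing script used in this project.

```r
library(tibble)
library(tidyr)

# Hypothetical wide-format extract: one column per service-level indicator.
# Values are kept as text because the raw data can contain entries like ">99" or "-".
water_wide <- tribble(
  ~country,    ~year, ~residence, ~wat_bas, ~wat_lim,
  "Country A", 2004,  "Rural",    "35.2",   "12.1",
  "Country A", 2004,  "Urban",    ">99",    "-"
)

# Tidy long format: one row per country, year, residence, and indicator.
water_long <- water_wide |>
  pivot_longer(
    cols = c(wat_bas, wat_lim),
    names_to = "varname_short",
    values_to = "percent"
  )
```

In the long format each indicator becomes a value of `varname_short`, which is what allows the later chunks to filter on `varname_short %in% c("wat_bas", "wat_lim")`.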
# import processed data
water <- read_csv(here::here("data/processed/water_data_long.csv"))
# filter data for CPLP countries
water_cplp <- water |>
  filter(country %in% c("Portugal", "Brazil", "Angola", "Cabo Verde",
                        "Guinea-Bissau", "Mozambique",
                        "Sao Tome and Principe", "Timor-Leste"))

Figure 1 and Figure 2 illustrate access to “at least basic” and “limited” drinking water services among rural and urban populations in CPLP countries in 2004 and 2024, respectively.
# check the differences in accessing drinking water between rural and urban populations (2004)
water_cplp_access_2004 <- water_cplp |>
  filter(
    year == 2004,
    varname_short %in% c("wat_bas", "wat_lim"),
    residence != "Total") |>
  mutate(
    percent = parse_number(percent)) # handles values like >99%
ggplot(data = water_cplp_access_2004,
       mapping = aes(x = residence,
                     y = percent,
                     fill = varname_long)) +
  geom_col() +
  facet_wrap(~country) +
  scale_fill_calc(name = NULL) +
  geom_text(aes(label = round(percent, 1)),
            position = position_stack(vjust = 0.5),
            size = 3,
            colour = "white")
# check the differences in accessing drinking water between rural and urban populations (2024)
water_cplp_access_2024 <- water_cplp |>
  filter(
    year == 2024,
    varname_short %in% c("wat_bas", "wat_lim"),
    residence != "Total") |>
  mutate(
    percent = parse_number(percent)) # handles values like >99%
ggplot(data = water_cplp_access_2024,
       mapping = aes(x = residence,
                     y = percent,
                     fill = varname_long)) +
  geom_col() +
  facet_wrap(~country) +
  scale_fill_calc(name = NULL) +
  geom_text(aes(label = round(percent, 1)),
            position = position_stack(vjust = 0.5),
            size = 3,
            colour = "white")
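
The two figure chunks above differ only in the year, so the shared steps could be wrapped in small helper functions. The sketch below is one possible way to do that, not the code used to produce the figures: `prep_access()` and `plot_access()` are hypothetical names, the toy data are invented, and the colour scale and bar labels are left out for brevity. It also passes `"-"` to the `na` argument of `parse_number()` so that missing entries become `NA` explicitly.

```r
library(dplyr)
library(ggplot2)
library(readr)
library(tibble)

# Toy stand-in for water_cplp; values are made up for illustration.
toy <- tribble(
  ~country,    ~year, ~residence, ~varname_short, ~varname_long,    ~percent,
  "Country A", 2004,  "Rural",    "wat_bas",      "At least basic", "35.2",
  "Country A", 2004,  "Urban",    "wat_bas",      "At least basic", ">99",
  "Country A", 2004,  "Rural",    "wat_lim",      "Limited",        "12.1",
  "Country A", 2004,  "Urban",    "wat_lim",      "Limited",        "-",
  "Country A", 2004,  "Total",    "wat_bas",      "At least basic", "60.0"
)

# Hypothetical helper: keep one year, drop "Total", convert percent to numeric.
prep_access <- function(data, target_year) {
  data |>
    filter(
      year == target_year,
      varname_short %in% c("wat_bas", "wat_lim"),
      residence != "Total"
    ) |>
    mutate(percent = parse_number(percent, na = c("", "NA", "-")))
}

# Hypothetical helper: stacked bars by residence, one panel per country.
plot_access <- function(data) {
  ggplot(data, aes(x = residence, y = percent, fill = varname_long)) +
    geom_col() +
    facet_wrap(~country) +
    labs(x = NULL, y = "Population (%)", fill = NULL)
}

access_2004 <- prep_access(toy, 2004)
p_2004 <- plot_access(access_2004)
```

Calling `prep_access(water_cplp, 2024) |> plot_access()` would then give the 2024 version without repeating the pipeline.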
All CPLP countries experienced improvements in access to at least basic drinking water between 2004 and 2024.
Access to at least basic drinking water remains consistently higher than reliance on limited services in both rural and urban populations.
Rural–urban disparities are most pronounced in Angola, Guinea-Bissau, and Mozambique, where rural populations continue to exhibit the lowest levels of access to basic drinking water.
Angola, Guinea-Bissau, Mozambique, São Tomé and Príncipe, and Timor-Leste showed greater reliance on limited drinking water services in 2024 than in 2004.
Figure 3 presents country-level trajectories of access to drinking water services in Angola, Cabo Verde, Mozambique, and Timor-Leste between 2004 and 2024.
# summarise the evolution in access to drinking water in cplp between 2004 and 2024
water_cplp_evol <- water_cplp |>
  filter(
    residence == "Total",
    varname_short %in% c("wat_bas", "wat_lim"),
    year >= 2004,
    year <= 2024
  ) |>
  mutate(percent = parse_number(percent)) |>
  select(country, year, varname_short, percent) |>
  pivot_wider(
    names_from = varname_short,
    values_from = percent)
# select the countries to plot (edit this vector to change the selection)
countries_to_plot <- c("Angola", "Cabo Verde", "Mozambique", "Timor-Leste")
water_subset <- water_cplp_evol |>
  filter(country %in% countries_to_plot)
global_max_lim <- max(water_subset$wat_lim, na.rm = TRUE)
scale_factor <- 100 / global_max_lim # maps wat_lim onto 0–100 scale
plots <- list() # empty list to store each country's plot
for (i in seq_along(countries_to_plot)) {
  country_loop <- countries_to_plot[i]
  df_country <- water_subset |>
    filter(country == country_loop)
  # plot evolution for each selected country
  p <- ggplot(data = df_country,
              mapping = aes(x = year)) +
    # Left axis: wat_bas
    geom_point(aes(y = wat_bas, colour = "At least basic"), size = 1) +
    geom_line(aes(y = wat_bas, colour = "At least basic"), linewidth = 0.5) +
    # Right axis: wat_lim (rescaled)
    geom_point(aes(y = wat_lim * scale_factor, colour = "Limited"), size = 1) +
    geom_line(aes(y = wat_lim * scale_factor, colour = "Limited"),
              linewidth = 0.5, linetype = "dashed") +
    scale_colour_manual(
      values = c("At least basic" = "steelblue", "Limited" = "coral2"),
      name = NULL) +
    # y-axis titles are set in scale_y_continuous() below
    labs(title = country_loop,
         x = "Year") +
    scale_y_continuous(
      name = "At least basic drinking water (%)",
      limits = c(0, 100),
      sec.axis = sec_axis(~ . / scale_factor,
                          name = "Limited drinking water (%)")) +
    theme_minimal(base_size = 8) +
    theme(legend.position = "bottom")
  # store this plot in the list
  plots[[i]] <- p
}
# 2x2 grid with patchwork
(plots[[1]] | plots[[2]]) /
  (plots[[3]] | plots[[4]]) +
  plot_annotation(
    title = "Country-level trajectories in access to drinking water (2004–2024)",
    theme = theme(
      plot.title = element_text(size = 12, face = "bold", hjust = 0.5)))
Access to at least basic drinking water increased steadily over the two-decade period.
Improvements in basic access do not always coincide with reductions in reliance on limited services.
Cabo Verde and Timor-Leste show relatively high and sustained levels of basic water access by 2024.
Angola and Mozambique exhibit slower and more uneven progress.
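
The secondary axis in Figure 3 works by multiplying the limited-service values by `scale_factor` before plotting, and dividing by the same factor when labelling the right-hand axis. A small numeric sketch of that round trip, using invented values rather than the JMP data:

```r
# Illustrative values only (not taken from the JMP data).
wat_lim <- c(5, 10, 20)

# Same construction as in the figure code: the largest value maps to 100.
scale_factor <- 100 / max(wat_lim)

# Values as drawn against the left-hand 0-100 axis.
plotted <- wat_lim * scale_factor

# What sec_axis(~ . / scale_factor) shows on the right-hand axis.
recovered <- plotted / scale_factor
```

Because the factor is computed once from the maximum across all selected countries, the right-hand axis is the same in every panel. The two lines are drawn on different scales, so their slopes and crossing points should not be compared directly.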
Table 1 and Table 2 compare access to “at least basic” drinking water in each CPLP country with the corresponding SDG regional average in 2004 and 2024. The regional average is computed here as the unweighted mean of country-level values within each SDG region. This comparison contextualises national performance within broader regional trends and highlights whether countries are performing above, below, or in line with their respective SDG region.
# summarise the progress in access to drinking water in each sdg_region between 2004 and 2024
# (regional_mean is the unweighted mean of country-level values)
water_region <- water |>
  filter(
    varname_short == "wat_bas",
    residence == "Total",
    year >= 2004,
    year <= 2024
  ) |>
  mutate(
    percent = na_if(percent, "-"),
    percent = parse_number(percent)) |>
  group_by(sdg_region, year) |>
  summarise(
    regional_mean = mean(percent, na.rm = TRUE),
    .groups = "drop")
# Compare access to drinking water in cplp to corresponding sdg_region
cplp_iso3 <- c("PRT", "BRA", "AGO", "CPV", "GNB", "MOZ", "STP", "TLS")
water_cplp_vs_region <- water |>
  filter(
    iso3 %in% cplp_iso3,
    varname_short == "wat_bas",
    residence == "Total",
    year >= 2004,
    year <= 2024) |>
  mutate(
    percent = na_if(percent, "-"),
    percent = parse_number(percent)) |>
  left_join(
    water_region,
    by = c("sdg_region", "year")) |> # join CPLP rows to their region-year means
  mutate(
    diff_from_region = percent - regional_mean, # positive = better than region
    level_to_region = case_when(
      is.na(diff_from_region) ~ NA_character_, # missing value: do not classify
      diff_from_region > 0 ~ "Above regional average",
      diff_from_region < 0 ~ "Below regional average",
      TRUE ~ "At regional average"))
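
The comparison code computes `regional_mean` with `mean(percent)`, which gives every country in a region equal weight regardless of its size. As a hedged sketch of an alternative, the example below contrasts that with a population-weighted mean. The toy data and the column name `population` are assumptions for illustration; the column name in the processed dataset may differ.

```r
library(dplyr)
library(tibble)

# Invented region with three countries; country C has no estimate.
toy_region <- tribble(
  ~sdg_region, ~year, ~country, ~percent, ~population,
  "Region X",  2004,  "A",      40,       1000,
  "Region X",  2004,  "B",      80,       3000,
  "Region X",  2004,  "C",      NA,       500
)

regional_summary <- toy_region |>
  filter(!is.na(percent)) |> # drop countries without an estimate
  group_by(sdg_region, year) |>
  summarise(
    regional_mean = mean(percent),                                    # equal weight per country
    regional_mean_weighted = weighted.mean(percent, w = population),  # weight by population
    .groups = "drop"
  )
```

In this toy case the unweighted mean is 60 and the weighted mean is 70, because the larger country has higher access. Which version is appropriate depends on whether the benchmark is meant to describe the typical country or the typical person in the region.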
water_cplp_vs_region |>
  filter(year == 2004) |>
  select(country, sdg_region, income_id, percent, regional_mean, level_to_region) |>
  mutate(
    percent = round(percent, 1),
    regional_mean = round(regional_mean, 1)) |>
  gt() |> # no row groups: varname_short is not among the selected columns
  cols_label(
    country = "Country",
    sdg_region = "SDG Region",
    income_id = "Income Level",
    percent = "Country (%)",
    regional_mean = "Regional Mean (%)",
    level_to_region = "Level to Region") |>
  fmt_number(columns = c(percent, regional_mean),
             decimals = 1) |>
  tab_options(
    table.font.size = px(11),
    heading.title.font.size = px(12),
    column_labels.font.size = px(12),
    data_row.padding = px(1)
  )

| Country | SDG Region | Income Level | Country (%) | Regional Mean (%) | Level to Region |
|---|---|---|---|---|---|
| Angola | Sub-Saharan Africa | Lower middle income | 42.0 | 59.7 | Below regional average |
| Brazil | Latin America and the Caribbean | Upper middle income | 94.8 | 93.4 | Above regional average |
| Cabo Verde | Sub-Saharan Africa | Lower middle income | 80.7 | 59.7 | Above regional average |
| Guinea-Bissau | Sub-Saharan Africa | Low income | 56.9 | 59.7 | Below regional average |
| Mozambique | Sub-Saharan Africa | Low income | 28.9 | 59.7 | Below regional average |
| Portugal | Europe and Northern America | High income | 99.0 | 97.7 | Above regional average |
| Sao Tome and Principe | Sub-Saharan Africa | Lower middle income | 71.1 | 59.7 | Above regional average |
| Timor-Leste | Eastern and South-Eastern Asia | Lower middle income | 53.0 | 83.8 | Below regional average |

water_cplp_vs_region |>
  filter(year == 2024) |>
  select(country, sdg_region, income_id, percent, regional_mean, level_to_region) |>
  mutate(
    percent = round(percent, 1),
    regional_mean = round(regional_mean, 1)) |>
  gt() |> # no row groups: varname_short is not among the selected columns
  cols_label(
    country = "Country",
    sdg_region = "SDG Region",
    income_id = "Income Level",
    percent = "Country (%)",
    regional_mean = "Regional Mean (%)",
    level_to_region = "Level to Region") |>
  fmt_number(columns = c(percent, regional_mean),
             decimals = 1) |>
  tab_options(
    table.font.size = px(11),
    heading.title.font.size = px(12),
    column_labels.font.size = px(12),
    data_row.padding = px(1)
  )

| Country | SDG Region | Income Level | Country (%) | Regional Mean (%) | Level to Region |
|---|---|---|---|---|---|
| Angola | Sub-Saharan Africa | Lower middle income | 68.0 | 73.4 | Below regional average |
| Brazil | Latin America and the Caribbean | Upper middle income | 99.0 | 96.3 | Above regional average |
| Cabo Verde | Sub-Saharan Africa | Lower middle income | 92.3 | 73.4 | Above regional average |
| Guinea-Bissau | Sub-Saharan Africa | Low income | 61.8 | 73.4 | Below regional average |
| Mozambique | Sub-Saharan Africa | Low income | 66.6 | 73.4 | Below regional average |
| Portugal | Europe and Northern America | High income | 99.0 | 98.3 | Above regional average |
| Sao Tome and Principe | Sub-Saharan Africa | Lower middle income | 78.0 | 73.4 | Above regional average |
| Timor-Leste | Eastern and South-Eastern Asia | Lower middle income | 87.3 | 94.6 | Below regional average |
Although all CPLP countries improved access to drinking water over time, several did not converge toward regional averages by 2024.
CPLP countries located in Sub-Saharan Africa, specifically Angola, Guinea-Bissau, and Mozambique, remain below their regional average for access to at least basic drinking water.
Brazil, Cabo Verde, Portugal, and São Tomé and Príncipe perform above their respective SDG regional averages in both 2004 and 2024.
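
Whether a country converged toward its regional average can be checked directly from the figures in Table 1 and Table 2. The sketch below copies the values for the four countries that are below their regional average in both years and computes the gap in each year; the object and column names are chosen here for illustration.

```r
library(dplyr)
library(tibble)

# Values copied from Table 1 (2004) and Table 2 (2024).
gaps <- tribble(
  ~country,        ~pct_2004, ~reg_2004, ~pct_2024, ~reg_2024,
  "Angola",        42.0,      59.7,      68.0,      73.4,
  "Guinea-Bissau", 56.9,      59.7,      61.8,      73.4,
  "Mozambique",    28.9,      59.7,      66.6,      73.4,
  "Timor-Leste",   53.0,      83.8,      87.3,      94.6
) |>
  mutate(
    gap_2004 = pct_2004 - reg_2004,   # negative = below regional mean
    gap_2024 = pct_2024 - reg_2024,
    gap_change = gap_2024 - gap_2004  # positive = gap to the region narrowed
  )
```

With these values the gap narrows for Angola (from -17.7 to -5.4 percentage points), Mozambique (-30.8 to -6.8), and Timor-Leste (-30.8 to -7.3), and widens for Guinea-Bissau (-2.8 to -11.6).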
In summary:
Access to drinking water improved across all CPLP countries between 2004 and 2024, particularly for at least basic services; however, progress remains uneven, with persistent rural–urban gaps and heterogeneous country-level trajectories indicating ongoing challenges in extending services equitably;
The continued reliance on limited drinking water services in multiple countries, despite gains in basic access, highlights structural constraints in transitioning along the drinking water service ladder;
Several low- and lower-middle-income CPLP countries (Angola, Guinea-Bissau, Mozambique, and Timor-Leste) remain behind their SDG regional averages, suggesting that economic capacity plays a critical role in shaping water service expansion and convergence toward SDG 6 targets (Valero et al. 2023; Arora and Mishra 2022).
Generative AI (ChatGPT) was used to assist with code debugging and language refinement. All data analysis, interpretation, and conclusions are the author’s own.