Evaluation of Drinking Water Quality from Borehole Sources to Household Storage

Author
Affiliation

Cecilia Nansiimbi

Published

January 23, 2026

Abstract
This study assessed the microbiological quality of drinking water from borehole sources to household use in Uganda districts, Kumi, Kaabong, Nakaseke and Luuka. Monthly data collection involved sampling water at the source, transportation containers, and storage containers, with E. coli analysed using the membrane filtration method. Borehole results revealed fluctuations in E. coli levels over monthly monitoring, with the majority of samples falling within the safe and slightly contaminated categories. However, high numbers of unsafe water samples were observed during transportation and storage, indicating that water considered safe at the source is at risk of contamination before reaching the point of use. Strengthening water safety measures including regular cleaning and disinfection of containers, household water treatment (e.g., chlorination or boiling), and proper borehole protection through intact aprons and effective drainage can improve water quality at the point of use.

Introduction

This report presents an assessment of microbiological drinking water quality testing from the point of abstraction(boreholes) to household use in Kumi, Kaabong, Nakaseke and Luuka districts of Uganda. The study examines changes in contamination levels at three key stages that is to say, borehole water sources, household water transportation containers, and household drinking water storage containers.

Objective

The main aim of this analysis is to classify E. coli results into five contamination levels for each sample type and to compare results across the water chain. This approach supports identification of where contamination is most predominant at the source, during collection and transport, or during household storage.

Methods

This section describes methods used to collect the data used for this analysis.

Data Collection

Water quality monitoring was conducted monthly from November 2024 to September 2025. Sampling information and microbial results were recorded digitally using the mWater Surveyor application, with data captured in two separate forms.

For each borehole, water samples were collected directly from the source and from three households that relied on that borehole. At the household level, samples were taken from both the jerrycan used for water collection and the container used for drinking water storage.

All samples were transported to accredited local laboratories, where they were analysed for E. coli concentration (per 100 mL) using the membrane filtration method, incubated for 24hrs and results recorded in an Mwater survey.

Code
library(tidyverse)
library(here)
library(readxl)
install.packages("ggthemes")
library(ggthemes)
library(gt)
library(gtsummary)
library(knitr)
library(DT)
Code
#Importing the datasets

raw_data1 <- readxl::read_excel(here::here("data", "raw", "Sample_collection.xlsx"))

raw_data2 <- readxl::read_excel(here::here("data", "raw", "Sample_E.coliresults.xlsx"))
Code
#Selecting relevant columns

#raw_data1: Sample collection

sample_data <- raw_data1|>
  select(
    `Enter date and time of sampling`,
    `Select the District`,
    `Select the region`,
    `Select the type of survey being taken`,
    `Enter the source name`, 
    `Ask a community member to make notes on the taste of the water`, 
    `Ask a community member to make notes about the smell of the water`, 
    `What is the colour of the sampled water`,
    `Select the current type of storage technology/ container for drinking water being used`, 
    `Please specify the type of storage container for drinking water`,
    `What medium/ type of container do they use to collect water from the borehole?`, 
    `Take a sample from the transportation medium/ container used by the household and enter sample bottle number`, 
    `Take another sample from the household storage container and enter the sample bottle number.`,
    `Enter the sample bottle number`)

#raw_data2: E.coli testing results

sample_result <- raw_data2 |>
  select(`Enter the date and time now`,
         `Select region`,
         `Select district`,
         `Enter sample bottle number`,
         `Enter sample petri dish number`,
         `Enter sample incubation start time`,
         `Count the number of colonies and enter the number here. If there are >100 colonies, meaning "too numerous to count", enter 101.`,
         `What was the volume of water that was filtered when this sample was analysed?`
         )
Code
#Renaming columns

#sample_data
samples_data <- sample_data |> rename(date = `Enter date and time of sampling`,
    district = `Select the District`,
    region = `Select the region`,
    survey = `Select the type of survey being taken`,
    source = `Enter the source name`, 
    taste = `Ask a community member to make notes on the taste of the water`, 
    smell = `Ask a community member to make notes about the smell of the water`, 
    color = `What is the colour of the sampled water`,
    storage_container = `Select the current type of storage technology/ container for drinking water being used`, 
    other_container = `Please specify the type of storage container for drinking water`,
    tp_container = `What medium/ type of container do they use to collect water from the borehole?`, 
    storage_num = `Take a sample from the transportation medium/ container used by the household and enter sample bottle number`, 
    tp_num = `Take another sample from the household storage container and enter the sample bottle number.`,
    bh_num = `Enter the sample bottle number`)

#sample_result

samples_result <- sample_result |>
  rename(result_date = `Enter the date and time now`,
         region = `Select region`,
         district = `Select district`,
         sample_num = `Enter sample bottle number`,
         dish_num = `Enter sample petri dish number`,
         test_date = `Enter sample incubation start time`,
        E.coli_CFUs = `Count the number of colonies and enter the number here. If there are >100 colonies, meaning "too numerous to count", enter 101.`,
         volume = `What was the volume of water that was filtered when this sample was analysed?`
         )

samples_data$date_column <- as.Date(samples_data$date, format = "%Y-%m-%d")
Code
#Transferring all sample names and bottle numbers into one column

#samples names
samples_names_2425 <- samples_data |>
  filter(date_column >= as.Date("2024-11-1") & date_column <= as.Date("2025-9-22"))|>
  mutate(bh = case_when(
    survey == "Borehole"  & !is.na(source) ~ source ,
    TRUE~ NA_character_)) |>
  relocate(bh, .before = storage_container) |>
 mutate(storage_container = ifelse(
    storage_container == "Other (please specify)", 
    other_container,  
    storage_container )) |>
  relocate(other_container, .before = bh) |>
  relocate(date_column, .before = date)

samples_names_2425|>
  count(survey)
  
#Creating all samples names column 

 samples_names_final <-samples_names_2425 |>
   pivot_longer(
    cols = c(bh,storage_container, tp_container), 
    names_to = "Sample_Type", 
    values_to = "Sample_Name")
 
 samples_names_final|>
   count(survey)
 
 samples_names_final |>
   count(Sample_Type)

 samples_names_final |>
   count(Sample_Name)
 
 #Creating all sample collection bottle numbers column

samples_data_final <- samples_names_final |> 
  mutate(Sample_Num = 
           case_when(Sample_Type == "bh" ~ bh_num,
                     Sample_Type == "storage_container" ~ storage_num,
                     Sample_Type == "tp_container" ~ tp_num,
                     .default = NA)) |> 
  filter(!is.na(Sample_Num))


glimpse(samples_data_final)

samples_data_final |>
  count(Sample_Type)
 
#Final cleaned sample collection data
samples_data_cleaned <-samples_data_final |>
  filter(!is.na(Sample_Name)) |>
  select(date_column,
         district,
         region,
         survey,
         source,
         taste,
         smell,
         color,
         Sample_Type,
         Sample_Name,
         Sample_Num) |>
  arrange(desc(region), desc(date_column))|> rename(sample_type = Sample_Type,
         sample_name = Sample_Name,
         sample_num = Sample_Num,
         test_date = date_column)  |> mutate(region = case_when(
    region == "Busoga Region" ~ "Busoga Region",
    region == "Busoga Region, Central Region, Teso Region" ~ "Busoga Region",
    region == "Central Region" ~ "Central Region",
    region == "Karamoja Region" ~ "Karamoja Region",
    region == "Teso Region" ~ "Teso Region",
    region == "Teso Region, Central Region, Busoga Region" ~ "Central Region",
    region == "Teso Region, Karamoja Region" ~ "Teso Region",
    TRUE ~ region  
  )) |> 
  mutate(district = case_when(
    region == "Teso Region" ~ "Kumi",
    region == "Busoga Region" ~ "Kamuli",
    region == "Central Region" ~ "Nakaseke",
    region == "Karamoja Region" ~ "Kabong",
    TRUE ~ district )) 
  
samples_data_cleaned |> count(survey)

samples_data_cleaned |>
  count(region)

samples_data_cleaned |>
  count(district)
samples_data_cleaned |>
  count(sample_type)
Code
#Final cleaned sample E.coli result

samples_result$result_date<- as.Date(samples_result$result_date, format = "%Y-%m-%d")

samples_result$test_date<- as.Date(samples_result$test_date, format = "%Y-%m-%d")

#Matching dates as in samples data df

samples_result_2425<- samples_result |> filter(result_date >= as.Date("2024-11-1") & result_date <= as.Date("2025-9-22")) |>
  filter(test_date >= as.Date("2024-11-1") & test_date <= as.Date("2025-9-22"))
  
#Removing redundant columns
  samples_result_final <- samples_result_2425|>select(district,region,
         sample_num,
         dish_num,
         E.coli_CFUs,
         volume,
         test_date,
         result_date) |> mutate(region = case_when(
    region == "Busoga Region" ~ "Busoga Region",
    region == "Busoga Region, Central Region, Teso Region" ~ "Busoga Region",
    region == "Central Region" ~ "Central Region",
    region == "Karamoja Region" ~ "Karamoja Region",
    region == "Teso Region" ~ "Teso Region",
    region == "Teso Region, Central Region, Busoga Region" ~ "Central Region",
    region == "Teso Region, Karamoja Region" ~ "Teso Region",
    TRUE ~ region)) |>  mutate(district = case_when(
    region == "Teso Region" ~ "Kumi",
    region == "Busoga Region" ~ "Kamuli",
    region == "Central Region" ~ "Nakaseke",
    region == "Karamoja Region" ~ "Kabong",
    TRUE ~ district )) |> arrange(desc(region), desc(result_date)) 
  
  
samples_result_final |>
  count(region)

  samples_result_final |>
    count(district)
  
  samples_result_final |>
    count(test_date)
Code
#Joining the sample data and sample result 

samples_joined <- samples_data_cleaned %>% 
  
    right_join(samples_result_final, by = c("test_date" = "test_date", 
                                           "sample_num" = "sample_num",
                                           "region" = "region",
                                           "district" = "district")) |>
  relocate(result_date, .after = test_date) 
samples_joined |>
  filter(!is.na(survey)) 

  samples_joined |> count(region)
  
 samples_joined |> count(survey) 

samples_joined |> count(sample_type)

head(samples_joined)
Code
#Correcting typos in rows of the final dataset containing both sample collection and sample result data

samples_joined_regions <- samples_joined |> mutate(region = case_when(
    region == "Busoga Region" ~ "Busoga Region",
    region == "Busoga Region, Central Region, Teso Region" ~ "Busoga Region",
    region == "Central Region" ~ "Central Region",
    region == "Karamoja Region" ~ "Karamoja Region",
    region == "Teso Region" ~ "Teso Region",
    region == "Teso Region, Central Region, Busoga Region" ~ "Central Region",
    region == "Teso Region, Karamoja Region" ~ "Teso Region",
    TRUE ~ region  
  )) 

samples_joined_regions |>count(region)

 samples_joined_regions |> count(district)
 
 samples_joined_regions|> count(survey)

samples_joined_district <-samples_joined_regions |>
  mutate(district = case_when(
    region == "Teso Region" ~ "Kumi",
    region == "Busoga Region" ~ "Kamuli",
    region == "Central Region" ~ "Nakaseke",
    region == "Karamoja Region" ~ "Kabong",
    TRUE ~ district )) 

samples_joined_district|> count(district)

test_data <- samples_data |> 
  filter(date_column == "2025-03-18",
         storage_num == 5 | tp_num == 5 | bh_num == 5)

rm(test_data)

rm(samples_joined_district)

rm(samples_joined_regions)
Code
#Processed data
write_csv(samples_joined,
          here::here("data/processed/processed_sample_data.csv"))

Analysis

E. coli results were categorised into five contamination levels based on colony-forming units (CFUs). Samples with 0 CFUs were classified as safe; those with 1–10 CFUs as slightly contaminated; 11–50 CFUs as moderately contaminated; 51–100 CFUs as highly contaminated; and samples with more than 100 CFUs as unsafe. This analysis was conducted for all samples with complete data, including samples from boreholes, jerrycans, and household drinking water storage containers.

Code
#Analysis
#Categorising contamination levels

samples_joined_categories <- samples_joined %>%
  mutate(contamination_category = case_when(
    E.coli_CFUs == 0 ~ "Safe",
    E.coli_CFUs >= 1 & E.coli_CFUs <= 10 ~ "Slightly Contaminated",
    E.coli_CFUs >= 11 & E.coli_CFUs <= 50 ~ "Moderately Contaminated",
    E.coli_CFUs >= 51 & E.coli_CFUs <= 100 ~ "Highly Contaminated",
    E.coli_CFUs > 100 ~ "Unsafe",
    TRUE ~ NA_character_  
  ))

levels_categories <- c("Safe", "Slightly Contaminated", "Moderately Contaminated", "Highly Contaminated", "Unsafe")

samples_joined_categories_lvl <- samples_joined_categories |>
  mutate(contamination_category = factor(contamination_category, levels = levels_categories ))

samples_joined_categories_lvl |>
  count(survey)
survey_counts <- samples_joined_categories_lvl %>%
  group_by(survey, region) %>%
  summarise(count = n(), .groups = "drop")  

rm (samples_joined_categories_surveytype)

Results

While contamination levels fluctuated between monitoring months, the majority of boreholes consistently fell within the safe and slightly contaminated categories, indicating generally acceptable water quality. A study by Lapworth et al. (2017) shows that boreholes generally have better microbiological water quality than shallower water sources, since deeper groundwater is less vulnerable to surface contamination than shallow sources. Figure 1 shows water quality results for boreholes tested.

Code
#visualisation

#Boreholes

boreholes_samples <- samples_joined_categories_lvl |>
  filter(survey == "Borehole")


ggplot(data = boreholes_samples,
       mapping = aes(x = contamination_category,
                     fill = contamination_category)) +
  geom_bar(position = "dodge") +
  facet_wrap("district") +
    labs(title = "E. Coli Contamination Levels in Boreholes by Region",
       x = "Contamination Category",
       y = "Number of Boreholes") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))+
  scale_fill_manual(values = c("Safe" = "#4CAF50",               
                               "Slightly Contaminated" = "#2196F3",
                               "Moderately Contaminated" = "#FFC107",  
                               "Highly Contaminated" = "#FF9800", 
                               "Unsafe" = "#F44336"))
    

#Containers
 containers_samples <- samples_joined_categories_lvl |>
 filter(survey == "Household") |> mutate(sample_name = case_when(
    sample_name == "BUCKET" ~ "Bucket",
    sample_name == "Backet" ~ "Bucket",
    sample_name == "Big Bucket" ~ "Bucket",
    sample_name == "Big bucket" ~ "Bucket",
    sample_name == "Cylinder" ~ "Bucket",
    sample_name == "Cylinder bucket" ~ "Bucket",
    sample_name == "PLASTIC DRUM" ~ "Bucket",
    sample_name == "Plastic drum" ~ "Drum",
    sample_name == "DRUM" ~ "Drum",
    sample_name == "Drum" ~ "Drum",
    sample_name == "POT"  ~ "Pot",
    sample_name == "Pot"  ~ "Pot",
    sample_name == "Sauce pan" ~ "Sauce pan",
    sample_name == "Jerry can" ~ "Jerry can",
    TRUE ~ sample_name )) |>
   mutate(sample_type = case_when(
     sample_type == "tp_container" ~ "Transport container",
     sample_type == "storage_container" ~ "Storage container"
   ))
Figure 1: Borehole E.coli testing results

Figure 2 shows E.coli testing results for samples collected from household drinking water storage containers and transportation containers from water sources.

Code
ggplot(data = containers_samples,
       mapping = aes(x = sample_name,
                     fill = contamination_category)) +
  geom_bar(position = "dodge") +
  facet_wrap("sample_type",labeller = label_wrap_gen(width = 10),scales = "free_x") +
    labs(title = "E. Coli Contamination Levels in Transportation and Storage Containers",
       x = "Container Type",
       y = "Number of Containers") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 6),
   axis.title.x = element_text(size = 10),  
        axis.title.y = element_text(size = 12),
        plot.title = element_text(size = 12), 
        legend.title = element_text(size = 12),
        legend.text = element_text(size = 10)) +    scale_fill_manual(values = c("Safe" = "#4CAF50",               
                               "Slightly Contaminated" = "#2196F3",
                               "Moderately Contaminated" = "#FFC107",  
                               "Highly Contaminated" = "#FF9800", 
                               "Unsafe" = "#F44336"))
Figure 2: Transport container and storage container E.coli testing results

Table 1 summarises the total numbers of different types of samples that were collected under each contamination category.

Code
category_table <- samples_joined_categories_lvl |>
  group_by(contamination_category, sample_type) |>
  summarise(count = n(), .groups = "drop") |>
  pivot_wider(names_from = sample_type, values_from = count, values_fill = 0) |>
  rename(Borehole = bh,
         `Storage container` = storage_container,
         `Transport container` = tp_container,
         `Contamiation Category`  = contamination_category)


category_table |>
    gt() |>
    tab_header(title = "Counts of Boreholes and Container Types by Contamination Category",
               subtitle = "Data from Four districts in Uganda") 
Table 1: Counts of Boreholes and Container Types by Contamination Category. Data from 4 districts.
Counts of Boreholes and Container Types by Contamination Category
Data from Four districts in Uganda
Contamiation Category Borehole Storage container Transport container NA
Safe 112 188 183 106
Slightly Contaminated 116 144 97 50
Moderately Contaminated 61 114 128 52
Highly Contaminated 26 62 51 15
Unsafe 91 465 510 100

Table 2 summarises the most popular type of containers used for both water collection and storage of drinking water in the sampled communities.

Code
container_table <- containers_samples |>
  filter(survey == "Household") |>
  group_by(district, sample_name) |>
  summarise(count = n(), .groups = "drop") |>
  group_by(district) |>
  mutate(percent = round(count / sum(count) * 100, 1)) |>
  select(-count) |>
  pivot_wider(
    names_from = sample_name,
    values_from = percent,
    values_fill = list(percent = 0)
  ) |>
  rename(District = district)|>
  ungroup()

names(container_table) <- ifelse(
  names(container_table) == "district",
  "district",
  paste0(names(container_table), " (%)")
)

container_table |>
    gt() |>
    tab_header(title = "Container types percentage for transport and storage of drinking water",
               subtitle = "Data from Four districts in Uganda") 
Table 2: Container types for collection and storage of drinking water
Container types percentage for transport and storage of drinking water
Data from Four districts in Uganda
District (%) Bucket (%) Drum (%) Jerry can (%) Other (%) Pot (%) Sauce pan (%) Small drum (%)
Kabong 13.9 8.7 52.7 1.5 22.1 1.1 0.0
Kamuli 0.7 0.0 50.0 0.0 49.3 0.0 0.0
Kumi 0.0 0.0 49.8 0.0 50.2 0.0 0.0
Nakaseke 1.9 1.0 50.4 0.0 42.8 3.7 0.2

Conclusions

  • Borehole water quality monitoring across the different months revealed that Borehole quality is highly variable and this significantly affects the quality of water received at household levels.

  • Although drinking water may be safe at the source, contamination can occur during transportation, preventing safe delivery to users(Gärtner et al. 2021).

  • Poor handling practices and unhygienic transport containers increase the risk of water quality deterioration before storage.

  • Strengthening water safety measures during transportation and promoting household water treatment methods such as chlorination and boiling can significantly improve the quality of drinking water at the point of use.

Recommendations

  • Thorough cleaning of drinking water containers with safe cleaning materials to ensure there is no recontamination during drinking water storage (Meierhofer, Wietlisbach, and Matiko 2019).

  • Regular cleaning of transport containers with disinfectants to prevent contamination of safe water during transport.

  • According to Kelly et al. (2021), conducting regular sanitary inspections and proper protection of boreholes through sanitary infrastructure such as intact concrete aprons and effective drainage can reduce pathways for contaminants entering the water source, since cracked aprons and poor drainage are statistically linked to increased microbial contamination.

References

Gärtner, Nicola, Laura Germann, Kennedy Wanyama, Henry Ouma, and Regula Meierhofer. 2021. “Keeping Water from Kiosks Clean: Strategies for Reducing Recontamination During Transport and Storage in Eastern Uganda.” Water Research X 10 (January): 100079. https://doi.org/10.1016/j.wroa.2020.100079.
Kelly, Emma, Ryan Cronk, Michael Fisher, and Jamie Bartram. 2021. “Sanitary Inspection, Microbial Water Quality Analysis, and Water Safety in Handpumps in Rural Sub-Saharan Africa.” Npj Clean Water 4 (1). https://doi.org/10.1038/s41545-020-00093-z.
Lapworth, D. J., D. C. W. Nkhuwa, J. Okotto-Okotto, S. Pedley, M. E. Stuart, M. N. Tijani, and J. Wright. 2017. “Urban Groundwater Quality in Sub-Saharan Africa: Current Status and Implications for Water Security and Public Health.” Hydrogeology Journal 25 (4): 1093–1116. https://doi.org/10.1007/s10040-016-1516-6.
Meierhofer, Regula, Basil Wietlisbach, and Carol Matiko. 2019. “Influence of Container Cleanliness, Container Disinfection with Chlorine, and Container Handling on Recontamination of Water Collected from a Water Kiosk in a Kenyan Slum.” Journal of Water and Health 17 (2): 308–17. https://doi.org/10.2166/wh.2019.282.

Reuse

Citation

BibTeX citation:
@online{nansiimbi2026,
  author = {Nansiimbi, Cecilia},
  title = {Evaluation of {Drinking} {Water} {Quality} from {Borehole}
    {Sources} to {Household} {Storage}},
  date = {2026-01-23},
  langid = {en},
  abstract = {This study assessed the microbiological quality of
    drinking water from borehole sources to household use in Uganda
    districts, Kumi, Kaabong, Nakaseke and Luuka. Monthly data
    collection involved sampling water at the source, transportation
    containers, and storage containers, with E. coli analysed using the
    membrane filtration method. Borehole results revealed fluctuations
    in E. coli levels over monthly monitoring, with the majority of
    samples falling within the safe and slightly contaminated
    categories. However, high numbers of unsafe water samples were
    observed during transportation and storage, indicating that water
    considered safe at the source is at risk of contamination before
    reaching the point of use. Strengthening water safety measures
    including regular cleaning and disinfection of containers, household
    water treatment (e.g., chlorination or boiling), and proper borehole
    protection through intact aprons and effective drainage can improve
    water quality at the point of use.}
}
For attribution, please cite this work as:
Nansiimbi, Cecilia. 2026. “Evaluation of Drinking Water Quality from Borehole Sources to Household Storage.” January 23, 2026.