The data shows faecal sludge treatment plant (FSTP) locations in the Rohingya refugee camps at Cox’s Bazar. It includes general information on the FSTP operator, contact details, FSTP location and type, as well as the samples collected from the FSTPs and their characterization results. The raw data has 2454 observations and 48 variables.
My goal is first to establish some basic information about the data. In the next step, for each camp with multiple FSTPs, I will calculate the mean effluent quality, review how effluent quality changed over the years, and map the FSTP locations by effluent quality to visualize geographically which areas are doing better or worse.
2. Methods
Raw effluent data were imported into R and cleaned by removing samples with missing values (NA) for nitrate, phosphate, BOD, COD and E. coli. The variables to include in the analysis were then selected and the columns renamed. This resulted in a processed dataset of 1412 observations with 13 variables.
The processed data was first summarized by camp to obtain some basic information: the number of camps, the number of FSTPs in total and by camp, the years of sample collection and the types of FSTP.
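The cleaning step described above can be sketched as follows. This is a minimal illustration on a toy table, not the report's actual script: the raw column names (`fstp_id`, `camp`, `operator` and the lower-case parameter names) and all values are assumptions made up for the example, and only the renamed columns `FSTPID`, `CampID`, `Nitrate`, `Phosphate`, `BOD`, `COD` and `Ecoli` match names used later in this report.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the raw data; column names and values are assumptions
raw_data <- tibble(
  fstp_id   = c("F1", "F2", "F3", "F4"),
  camp      = c("13", "13", "15", "25"),
  operator  = c("A", "B", "C", "D"),      # example of a column that is not kept
  nitrate   = c(12.1, NA, 8.4, 20.3),
  phosphate = c(3.2, 4.1, 2.8, 5.0),
  bod       = c(80, 95, NA, 110),
  cod       = c(300, 410, 280, 520),
  ecoli     = c(5000, 12000, 800, 3000)
)

processed_sketch <- raw_data |>
  # Drop any sample missing at least one of the five effluent parameters
  drop_na(nitrate, phosphate, bod, cod, ecoli) |>
  # Keep only the analysis variables and rename them in one step
  select(
    FSTPID = fstp_id, CampID = camp,
    Nitrate = nitrate, Phosphate = phosphate,
    BOD = bod, COD = cod, Ecoli = ecoli
  )

print(processed_sketch)
```

Dropping rows on the five parameters only, rather than on every column, keeps samples whose descriptive fields are incomplete but whose measurements are usable.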
a. Total counts of FSTPs and unique technologies used
The processed dataset contains 269 unique FSTPs. There are 33 camps in the area, using 14 varieties of FSTP technology.
Code
# Total number of unique FSTPs
processed_data |>
  summarise(total_unique_FSTP_count = n_distinct(FSTPID)) |>
  print()
# A tibble: 1 × 1
total_unique_FSTP_count
<int>
1 269
Code
# Total number of unique FSTP types
processed_data |>
  summarise(total_unique_FSTP_type_count = n_distinct(FSTP_type)) |>
  print()
# A tibble: 1 × 1
total_unique_FSTP_type_count
<int>
1 14
b. Summarizing the data by Year
Summarizing the processed data by year and number of samples collected shows that the collection years run from 2021 through 2025.
NOTE: Only 5 samples were collected in 2021, so 2021 is omitted from further analysis. Furthermore, plotting the sample counts showed that some FSTP technologies have only a few samples (fewer than 10 in a given year), so those year and technology combinations are dropped from the plot.
Code
processed_data <- processed_data |>
  mutate(
    Date_collect = as.Date(Date_collect),
    Year = year(Date_collect)
  ) |>
  filter(Year != 2021) # Exclude the year 2021

# Count samples collected by year
samples_per_year <- processed_data |>
  group_by(Year, FSTP_type) |>
  summarise(Sample_Count = n(), .groups = 'drop') |>
  filter(Sample_Count >= 10)
print(samples_per_year)

ggplot(
  data = samples_per_year,
  mapping = aes(x = Year, y = Sample_Count, fill = FSTP_type)
) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(
    x = "Year",
    y = "Number of Samples",
    title = "Number of Samples Collected by Year and FSTP Type"
  ) +
  scale_fill_colorblind() +
  theme_minimal()
Figure 1: Number of Samples Collected by Year and FSTP Type
Figure 1 shows the distribution of samples collected over the years by type of FSTP technology. ABR and UFF have the most samples collected, closely followed by DEWATS and SSU. Sample collection dropped markedly from 2023 to 2024.
c. Summarizing the data by Camp and FSTP technology
Code
# Summarize the data by CampID and FSTP Type
summary_by_camp_and_type <- processed_data |>
  group_by(CampID, FSTP_type) |>
  summarise(
    unique_FSTP_count = n_distinct(FSTPID), # Count unique FSTPs
    total_FSTP_count = n(),                 # Total number of rows (samples) in the group
    .groups = 'drop'
  )
glimpse(summary_by_camp_and_type)

# Summarize the data by FSTP Type
summary_by_FSTPtype <- processed_data |>
  group_by(FSTP_type) |>
  summarise(unique_FSTP_count = n_distinct(FSTPID), .groups = 'drop')
glimpse(summary_by_FSTPtype)

# Calculate total counts for each CampID for labels
totals <- summary_by_camp_and_type |>
  group_by(CampID) |>
  summarise(total_count = sum(unique_FSTP_count), .groups = 'drop')

# Create a bar plot of FSTP Type counts by Camp
ggplot(summary_by_camp_and_type, aes(x = CampID, y = unique_FSTP_count, fill = FSTP_type)) +
  geom_bar(stat = "identity") + # Stacked bars by default
  geom_text(
    data = totals,
    aes(x = CampID, y = total_count, label = total_count),
    vjust = -0.5, size = 3, inherit.aes = FALSE
  ) + # Position the total label above the bar
  labs(
    title = "Diversity of FSTP Technology by Camp",
    x = "Camp ID",
    y = "Number of FSTP",
    fill = "FSTP Type"
  ) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) + # Rotate x labels to vertical
  scale_fill_viridis(discrete = TRUE)
Figure 2: Diversity of FSTP Technology by Camp
The FSTP technologies were summarized by camp. Figure 2 shows the diversity of FSTP technology by camp in Cox’s Bazar: each bar gives the number of FSTP installations in a camp, with colors distinguishing the FSTP types. The Viridis color palette was used so that all categories are displayed and are visually distinct, colorblind-accessible, and perceptually uniform. Total counts per camp are displayed above each bar.
The bars show that four camps (02W, 13, 15 and 25) each use four FSTP types, the most of any camp. This makes sense for Camp 15, which has the most FSTPs (37), but Camp 25 has only 6 FSTPs in total.
Given the diversity in some of the camps, the coordinates of the FSTPs were plotted on a map to show their spatial distribution. Here, an interactive map of FSTP types in Cox’s Bazar was created using the Leaflet R package (Cheng et al. 2025) for mapping and RColorBrewer (Neuwirth 2022) for color palettes, with rare FSTP types grouped into ‘Other’ for clarity.
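The comparison of technology diversity against camp size can also be made numerically rather than read off the bars. The sketch below is one possible way to do it, using a small made-up table in place of `processed_data`; the values and the helper column `types_per_fstp` are illustrative assumptions, not results from this report.

```r
library(dplyr)

# Toy stand-in for processed_data (illustrative values only)
toy <- tibble(
  CampID    = c("15", "15", "15", "25", "25", "13"),
  FSTPID    = c("A1", "A2", "A3", "B1", "B2", "C1"),
  FSTP_type = c("ABR", "UFF", "ABR", "LSP", "SSU", "ABR")
)

diversity_by_camp <- toy |>
  group_by(CampID) |>
  summarise(
    n_types = n_distinct(FSTP_type), # number of distinct technologies in the camp
    n_fstp  = n_distinct(FSTPID),    # number of distinct plants in the camp
    .groups = "drop"
  ) |>
  # Technologies per plant: values near 1 mean almost every plant is a different type
  mutate(types_per_fstp = n_types / n_fstp) |>
  # Most diverse camps first; among ties, the smallest camps first
  arrange(desc(n_types), n_fstp)

print(diversity_by_camp)
```

Sorting ties by the number of plants brings small camps with many technologies, such as the Camp 25 case noted above, to the top of the table.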
Code
processed_data <- read_csv(here::here("data/processed/CampWasteWater_processed.csv"))

# Clean FSTP_type and group rare types
processed_data <- processed_data %>%
  mutate(FSTP_type = trimws(as.character(FSTP_type)))

# Count occurrences
type_counts <- processed_data %>%
  count(FSTP_type, name = "n")

# Group rare types into "Other"
threshold <- 20
processed_data <- processed_data %>%
  left_join(type_counts, by = "FSTP_type") %>%
  mutate(FSTP_type = ifelse(n < threshold, "Other", FSTP_type)) %>%
  select(-n)

# Recount frequencies after grouping
freq_counts <- processed_data %>%
  count(FSTP_type, name = "freq") %>%
  arrange(desc(freq)) # sort descending

# Define legend order: top types first, "Other" last
legend_order <- freq_counts$FSTP_type[freq_counts$FSTP_type != "Other"]
if ("Other" %in% freq_counts$FSTP_type) legend_order <- c(legend_order, "Other")

# Convert FSTP_type to factor with levels = legend_order
processed_data <- processed_data %>%
  mutate(FSTP_type = factor(FSTP_type, levels = legend_order))

# Create palette
palette_colors <- brewer.pal(n = max(length(legend_order), 3), name = "Set2")[1:length(legend_order)]
names(palette_colors) <- legend_order
pal <- colorFactor(
  palette = palette_colors,
  domain = processed_data$FSTP_type,
  na.color = "#B0B0B0"
)

# --- Leaflet map ---
leaflet(processed_data) %>%
  addTiles() %>%
  setView(
    lng = mean(processed_data$Long, na.rm = TRUE),
    lat = mean(processed_data$Lat, na.rm = TRUE),
    zoom = 12
  ) %>%
  addCircleMarkers(
    lng = ~Long,
    lat = ~Lat,
    radius = 6,
    fillColor = ~pal(FSTP_type),
    fillOpacity = 0.8,
    stroke = FALSE,
    popup = ~paste("<b>Camp:</b>", CampID, "<br>", "<b>FSTP Type:</b>", FSTP_type)
  ) %>%
  addLegend(
    position = "bottomright",
    pal = pal,
    values = processed_data$FSTP_type, # now a factor with levels in order
    title = "FSTP Types",
    opacity = 0.8
  )
Figure 3: Map of FSTP Locations by Technology Type in Cox’s Bazar, Bangladesh
Figure 3 shows the spatial distribution of the FSTPs, colored by the type of FSTP technology used. This helps to visualize where the various types of FSTP are concentrated. Visually, the north-west of the study area appears to have more SSU, the north and south have concentrations of UFF, LSP is more common in the north and east, and ABR is present throughout the study area. A few FSTP locations appear in the sea or in Myanmar; these are due to incorrectly recorded coordinates.
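Points that fall in the sea or across the border can be flagged before mapping with a simple bounding-box check. The sketch below shows the idea only: the box limits are hypothetical placeholder values, not surveyed camp boundaries, and would need to be replaced with limits taken from an authoritative boundary layer. The toy coordinates are likewise made up.

```r
library(dplyr)

# Hypothetical bounding box for the study area (placeholder values, an assumption)
bbox <- list(lat_min = 20.9, lat_max = 21.3, long_min = 92.1, long_max = 92.3)

# Toy coordinates (illustrative only); column names follow processed_data
toy_sites <- tibble(
  FSTPID = c("A1", "A2", "A3", "A4"),
  Lat    = c(21.20, 21.05, 20.50, NA),
  Long   = c(92.15, 92.45, 92.20, 92.16)
)

flagged <- toy_sites |>
  mutate(
    # TRUE only when both coordinates are present and inside the box
    coord_ok = !is.na(Lat) & !is.na(Long) &
      between(Lat, bbox$lat_min, bbox$lat_max) &
      between(Long, bbox$long_min, bbox$long_max)
  )

# Rows to review or correct before mapping
flagged |> filter(!coord_ok) |> print()
```

Flagging rather than silently deleting keeps the suspect records available for correction against the original field sheets.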
3.2 Analysis of Effluent Quality
Table 1 shows the summary statistics of various effluent contents by the FSTP technology.
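The yearly plots in this section read from a `median_values` table whose construction is not shown here. A minimal sketch of one way such a table could be built is given below, assuming it holds the median of each parameter per year and FSTP type, as the axis labels suggest. The toy values and the name `median_values_sketch` are illustrative and not taken from the report.

```r
library(dplyr)

# Toy stand-in for processed_data (illustrative values only)
toy <- tibble(
  Year      = c(2022, 2022, 2022, 2023, 2023, 2023),
  FSTP_type = c("ABR", "ABR", "WSP", "ABR", "ABR", "ABR"),
  Nitrate   = c(10, 20, 5, 8, 12, 40),
  Phosphate = c(3, 5, 1, 4, 6, 2),
  BOD       = c(90, 110, 40, 80, 100, 120),
  COD       = c(300, 500, 150, 280, 320, 600),
  Ecoli     = c(4000, 6000, 900, 2000, 3000, 50000)
)

median_values_sketch <- toy |>
  group_by(Year, FSTP_type) |>
  summarise(
    # Median of every effluent parameter within each year and technology
    across(c(Nitrate, Phosphate, BOD, COD, Ecoli), \(x) median(x, na.rm = TRUE)),
    .groups = "drop"
  )

print(median_values_sketch)
```

The median is less sensitive than the mean to single extreme samples, such as the very high E. coli value in the 2023 ABR rows of the toy data.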
ggplot(
  median_values,
  aes(x = Year, y = Nitrate, color = FSTP_type)
) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 3) +
  labs(
    title = "Nitrate Concentration in Effluent by Year and FSTP Type",
    x = "Year",
    y = "Median Nitrate (mg/L)",
    color = "FSTP Type"
  ) +
  scale_color_viridis_d() +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5))
Figure 4: Nitrate in Effluent by Year and FSTP Type
Figure 4 shows nitrate levels for the various FSTP technologies by year. They seem to follow a similar trend for all technologies except Biological process and Other (omni processor, anaerobic lagoon, geotube, etc.). Biological process is always higher than the other technologies and shows a jump in nitrate levels in 2024 and 2025.
Code
ggplot(
  median_values,
  aes(x = Year, y = Phosphate, color = FSTP_type)
) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 3) +
  labs(
    title = "Phosphate Concentration in Effluent by Year and FSTP Type",
    x = "Year",
    y = "Median Phosphate (mg/L)",
    color = "FSTP Type"
  ) +
  scale_color_viridis_d() +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5))
Figure 5: Phosphate in Effluent by Year and FSTP Type
Figure 5 shows that phosphate levels for the various FSTP technologies do not seem to follow any particular trend from year to year. LSP has consistently high phosphate concentrations over the years, while SSU and UFF show a jump in concentrations. WSP has one of the lowest phosphate levels throughout the years.
Code
ggplot(
  median_values,
  aes(x = Year, y = BOD, color = FSTP_type)
) +
  geom_hline(
    yintercept = 30, # BECR 2023 standards for BOD
    linetype = "dashed",
    linewidth = 1,
    color = "red"
  ) +
  annotate(
    "text",
    x = min(median_values$Year),
    y = 30,
    label = "30 mg/L threshold",
    hjust = 0,
    vjust = -0.5,
    size = 3.5,
    color = "red"
  ) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 3) +
  labs(
    title = "BOD Concentration in Effluent by Year and FSTP Type",
    x = "Year",
    y = "Median BOD (mg/L)",
    color = "FSTP Type"
  ) +
  scale_color_viridis_d() +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5))
Figure 6: BOD in Effluent by Year and FSTP Type
Figure 6 shows that BOD levels for the various FSTP technologies follow a similar trend from year to year. LSP and SSU have consistently high BOD concentrations in effluent over the years, with WSP having one of the lowest. The dashed horizontal line indicates the reference threshold of 30 mg/L for BOD in treated effluent. None of the technologies achieves that standard.
Code
ggplot(
  median_values,
  aes(x = Year, y = COD, color = FSTP_type)
) +
  geom_hline(
    yintercept = 125, # BECR 2023 standards for COD
    linetype = "dashed",
    linewidth = 1,
    color = "red"
  ) +
  annotate(
    "text",
    x = min(median_values$Year),
    y = 125,
    label = "125 mg/L threshold",
    hjust = 0,
    vjust = -0.5,
    size = 3.5,
    color = "red"
  ) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 3) +
  labs(
    title = "COD Concentration in Effluent by Year and FSTP Type",
    x = "Year",
    y = "Median COD (mg/L)",
    color = "FSTP Type"
  ) +
  scale_color_viridis_d() +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5))
Figure 7: COD in Effluent by Year and FSTP Type
Figure 7 shows that COD levels for the various FSTP technologies follow a similar trend from year to year, with 2024 and 2025 showing higher COD concentrations in all technologies. WSP has one of the lowest COD levels in all years. The dashed horizontal line indicates the reference threshold of 125 mg/L for COD in treated effluent. None of the technologies achieves that standard.
Code
ggplot(
  median_values,
  aes(x = Year, y = Ecoli, color = FSTP_type)
) +
  geom_hline(
    yintercept = 1000,
    linetype = "dashed",
    linewidth = 1,
    color = "red"
  ) +
  annotate(
    "text",
    x = min(median_values$Year),
    y = 1000,
    label = "1000 cfu/100 ml threshold",
    hjust = 0,
    vjust = -0.5,
    size = 3.5,
    color = "red"
  ) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 3) +
  labs(
    title = "E. coli Concentration in Effluent by Year and FSTP Type",
    x = "Year",
    y = "Median E. coli (cfu/100 ml)",
    color = "FSTP Type"
  ) +
  scale_color_viridis_d() +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5))
Figure 8: E. coli in Effluent by Year and FSTP Type
Figure 8 shows E. coli levels for the various FSTP technologies by year. All technologies except UFF, BP and Other follow a similar trend. UFF and BP show a sharp drop in E. coli from 2022 to 2023, dropping even lower in the following years to at or below the recommended threshold. The dashed horizontal line indicates the reference threshold of 1000 cfu/100 ml for E. coli in treated effluent.
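The comparison against the reference lines in Figures 6 to 8 can also be tabulated. The sketch below is a minimal illustration on a made-up table of medians: it uses the three thresholds drawn in the figures (30 mg/L BOD, 125 mg/L COD, 1000 cfu/100 ml E. coli), while the toy values and the names `toy_medians`, `exceedance` and `share_over` are assumptions for the example.

```r
library(dplyr)

# Reference thresholds as drawn in Figures 6 to 8
limits <- c(BOD = 30, COD = 125, Ecoli = 1000)

# Toy medians per year and technology (illustrative values only)
toy_medians <- tibble(
  Year      = c(2022, 2023, 2022, 2023),
  FSTP_type = c("UFF", "UFF", "WSP", "WSP"),
  BOD       = c(90, 70, 45, 40),
  COD       = c(400, 350, 200, 180),
  Ecoli     = c(8000, 900, 5000, 2500)
)

# Flag each median that is above its threshold
exceedance <- toy_medians |>
  mutate(
    BOD_over   = BOD > limits[["BOD"]],
    COD_over   = COD > limits[["COD"]],
    Ecoli_over = Ecoli > limits[["Ecoli"]]
  )

# Share of years in which each technology exceeds each threshold
share_over <- exceedance |>
  group_by(FSTP_type) |>
  summarise(across(ends_with("_over"), mean), .groups = "drop")

print(share_over)
```

A share of 1 means the technology exceeded the threshold in every year covered, and a share below 1 identifies the technology and parameter combinations worth a closer look.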
4. Conclusion
Compared with the Bangladesh Environmental Conservation Rules 2023 (BECR 2023) standards, BOD and COD concentrations are well above the allowable limits for all technologies in all years. E. coli is likewise well above its threshold for most technologies and years, although Figure 8 shows UFF and BP falling to at or below it in the later years.
Waste stabilization pond (WSP) seemed to be faring better than the other technologies throughout the years.
The location map suggests that neither the spacing of FSTPs nor the choice of technology followed any spatial pattern or depended on camp size. Further analysis, including population density, might shed more light on how the technology was selected for each area.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the Tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
Wickham, Hadley, Davis Vaughan, and Maximilian Girlich. 2024. “Tidyr: Tidy Messy Data.” https://tidyr.tidyverse.org.