Clear Sky Science
Model generalization paradigms for predicting viral particles and evaluating removal efficiencies in anaerobic membrane bioreactor plants
Why cleaner recycled water matters
As cities search for new water sources in a warming, growing world, recycled wastewater is increasingly ending up in everyday tap water. But even after advanced treatment, tiny viruses can slip through, raising concerns about health and safety. This study explores how artificial intelligence can act as a vigilant "soft sensor" that watches treatment plants in real time, flags changes in viral contamination, and helps confirm that reused water remains safe.

Making sense of a moving target
Wastewater treatment plants are anything but steady. The mix of household and industrial waste coming in changes by the hour, and the performance of filters and membranes can drift over time. Traditional lab methods to measure viruses in water are slow and labor-intensive: samples must be collected, transported, and analyzed, often days after the water has already been discharged or reused. That delay makes it hard for operators to react quickly if virus levels begin to rise. The authors focus on anaerobic membrane bioreactors—systems that clean wastewater using microorganisms and fine-pored membranes while also generating energy. These plants can remove many pathogens, but monitoring exactly how well they are doing, moment by moment, is a major challenge.
Teaching computers to watch for viruses
Instead of measuring viruses directly all the time, the team trained machine-learning models to infer viral levels from simple, readily available water quality readings such as pH, cloudiness, salt content, and nutrient levels. They worked with two anaerobic membrane plants in different Saudi Arabian cities: a municipal pilot plant at a university and a larger mixed municipal–industrial facility. To overcome the fact that only a small number of real samples had been analyzed for viruses, the researchers used three data "generators" to create realistic synthetic datasets that mimic the behavior of the real plants. These enriched datasets fed two advanced learning strategies: a "lifelong" model that continuously adapts as new data arrive, and an "attention" model that learns to focus on the most informative signals and time points when predicting virus concentrations.
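The idea of a model that keeps adapting as new data arrive can be sketched with a very small example. The code below is not the authors' lifelong model; it swaps in a plain linear regressor updated by stochastic gradient descent, and every name and number in it (the class `OnlineLinearSensor`, the two pre-scaled inputs standing in for cloudiness and salt content, the target values) is a hypothetical chosen for illustration.

```python
class OnlineLinearSensor:
    """Linear model updated one sample at a time by stochastic gradient descent.

    A stand-in for a soft sensor: NOT the lifelong model used in the study.
    """

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features):
        return self.bias + sum(w * x for w, x in zip(self.weights, features))

    def update(self, features, target):
        """Take one gradient step on the squared error of a single sample."""
        error = self.predict(features) - target
        self.weights = [w - self.learning_rate * error * x
                        for w, x in zip(self.weights, features)]
        self.bias -= self.learning_rate * error


def mean_abs_error(model, batch):
    return sum(abs(model.predict(x) - y) for x, y in batch) / len(batch)


def learn_batch(model, batch, passes=300):
    """Replay a tiny batch several times, updating the model in place."""
    for _ in range(passes):
        for features, target in batch:
            model.update(features, target)


# Hypothetical readings scaled to [0, 1]: (cloudiness, salt content).
# Targets are made-up log10 virus concentrations, for illustration only.
inputs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
batch_old = [(x, 2.0 + 1.5 * x[0] - 0.5 * x[1]) for x in inputs]
batch_new = [(x, 2.5 + 1.0 * x[0] - 0.5 * x[1]) for x in inputs]  # after drift

sensor = OnlineLinearSensor(n_features=2)
learn_batch(sensor, batch_old)
error_before = mean_abs_error(sensor, batch_new)  # stale model on new data
learn_batch(sensor, batch_new)                    # adapt, no reset
error_after = mean_abs_error(sensor, batch_new)
print(f"error on new batch: {error_before:.3f} -> {error_after:.3f}")
```

The point of the sketch is only that the second call to `learn_batch` starts from the already-trained weights rather than from zero, which is the sense in which such a model adapts instead of being retrained from scratch.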
Following viruses through the treatment train
The models were asked to predict the levels of several important viral targets, including human adenoviruses and common viral markers of fecal pollution, at different points in the treatment process. Log removal values were then calculated, a standard measure of how many factors of ten virus levels drop between raw sewage and treated effluent (a value of 2 means a 99% reduction). Across both plants and multiple treatment stages, the virtual soft sensors closely matched laboratory measurements, often explaining more than 90% of the variation in virus levels. The systems correctly captured strong removal of adenovirus and pepper mild mottle virus, and more modest reductions in total virus counts. Crucially, they stayed accurate even when applied to data from a different plant than the one they were trained on, or when predicting performance in a different treatment step.
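For readers who want the arithmetic, the two quantities mentioned above can be written in a few lines. This is a minimal sketch: the log removal value is the base-10 logarithm of the ratio of incoming to outgoing concentration, and "variation explained" is assumed here to mean the usual coefficient of determination (R squared). The function names, the concentrations, and the unit (gene copies per litre) are all assumptions for illustration, not figures from the study.

```python
import math


def log_removal_value(influent, effluent):
    """LRV = log10(influent / effluent) for positive concentrations."""
    if influent <= 0 or effluent <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(influent / effluent)


def r_squared(observed, predicted):
    """Fraction of the variance in `observed` explained by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical concentrations in gene copies per litre (illustrative only).
lrv = log_removal_value(influent=1.0e6, effluent=1.0e2)
fraction_removed = 1.0 - 10.0 ** (-lrv)
print(f"LRV = {lrv:.1f}, removed = {fraction_removed:.2%}")

# Hypothetical lab-measured vs. model-predicted log10 concentrations.
lab = [5.9, 4.1, 3.0, 2.2]
model = [5.7, 4.3, 3.1, 2.0]
fit = r_squared(lab, model)
print(f"R squared = {fit:.3f}")
```

Each whole unit of log removal corresponds to another factor of ten, so a value of 1 means 90% removed, 2 means 99%, and so on.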

Adapting to new plants and changing conditions
A key achievement of this work is robustness. Wastewater from a university campus and from an industrial zone looks very different, yet the same modeling frameworks could be transferred between them with only modest adjustment. The lifelong learning approach excelled at continuously updating itself as new batches of data came in, without having to be retrained from scratch. The attention-based approach, meanwhile, highlighted which water quality signals and time windows mattered most for reliable prediction and could be reused on entirely new datasets. Both approaches handled the natural "drift" in plant behavior over time, suggesting they can keep up as operating conditions, influent mixtures, or even climate patterns change.
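How an attention mechanism "highlights" certain time points can be illustrated with its simplest building block: relevance scores are turned into positive weights that sum to one, and the readings are averaged with those weights. The sketch below is a generic softmax pooling step, not the architecture used in the study; the function names, readings, and scores are hypothetical, and in a trained model the scores would be learned rather than typed in by hand.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(readings, scores):
    """Weighted average of readings, using softmax(scores) as the weights."""
    weights = softmax(scores)
    pooled = sum(w * r for w, r in zip(weights, readings))
    return pooled, weights


# Hypothetical cloudiness readings at four time points, with made-up
# relevance scores (a trained model would learn these).
readings = [1.2, 1.3, 4.8, 1.1]
scores = [0.1, 0.2, 2.0, 0.0]

pooled, weights = attention_pool(readings, scores)
for t, (r, w) in enumerate(zip(readings, weights)):
    print(f"t={t}: reading={r:.1f}, weight={w:.2f}")
print(f"attention-weighted summary: {pooled:.2f}")
```

Inspecting the weights is what lets such a model report which moments mattered most: here the third time point, with the highest score, dominates the summary.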
What this means for safer water reuse
For non-specialists, the bottom line is that this study brings us closer to practical, real-time virus monitoring in advanced wastewater treatment plants without needing constant, costly lab tests. By learning from easily measured water quality signals, these smart soft sensors can estimate virus levels and removal efficiency with high accuracy, alerting operators if performance slips and helping regulators verify that reclaimed water meets safety goals. As such tools are refined and expanded to more contaminants and plant types, they could become a cornerstone of safe, sustainable water reuse in water-scarce regions around the world.
Citation: Chen, J., N’Doye, I., Sanchez Medina, J. et al. Model generalization paradigms for predicting viral particles and evaluating removal efficiencies in anaerobic membrane bioreactor plants. npj Emerg. Contam. 2, 10 (2026). https://doi.org/10.1038/s44454-026-00030-8
Keywords: wastewater reuse, virus monitoring, machine learning, membrane bioreactors, water quality