Author Identifier (ORCID)

Hoda Khoshvaght: https://orcid.org/0000-0001-8766-419X

Amir Razmjou: https://orcid.org/0000-0002-3554-5129

Mehdi Khiadani: https://orcid.org/0000-0003-1703-9342

Abstract

Unlike previous studies that rely on high-frequency (15-min or hourly) datasets, this study is among the first to use low-frequency (weekly) data to evaluate the performance of linear and nonlinear machine learning (ML) algorithms for predicting biochemical oxygen demand (BOD) and ammonium nitrogen (NH4+-N) in the primary and secondary treatment effluents from the Subiaco Water Resource Recovery Facility (WRRF) in Western Australia. Several feature selection methods, including filter, wrapper, and embedded methods, were compared to identify the approach that achieves the highest model performance while improving computational efficiency. The results demonstrate that a reduced set of key features can achieve comparable predictive accuracy with lower computational complexity. For BOD prediction in the primary effluent, a multilayer perceptron (MLP) achieved a root mean square error (RMSE) of 23.50 milligrams per liter (mg/L) using features selected by mutual information. In the secondary effluent, support vector regression (SVR) with a radial basis function (RBF) kernel, combined with random-forest-based feature selection, yielded the best BOD predictions, with an RMSE of 3.26 mg/L. For NH4+-N, multiple linear regression with backward elimination achieved an RMSE of 2.71 mg/L in the primary effluent, whereas a random forest with five key predictors achieved an RMSE of 1.51 mg/L in the secondary effluent, indicating high accuracy in NH4+-N prediction. These findings demonstrate that data-driven models can predict BOD and NH4+-N from low-frequency monitoring data, supporting supervisory-level operational decision-making in wastewater treatment plants and near-real-time wastewater quality assessment. Furthermore, generalization analysis indicates that linear models perform more consistently across multiple targets and evaluation metrics.
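The workflow summarized in the abstract (rank candidate features with a filter criterion, fit a model on the retained features, and report RMSE on held-out data) can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the feature names `flow` and `temperature` are hypothetical, the filter criterion is absolute Pearson correlation standing in for mutual information, and the model is a one-variable least-squares line rather than an MLP, SVR, or random forest. The 520 samples only mimic the size of a decade of weekly records.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean square error, in the units of the target (e.g. mg/L)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def rank_features(columns, target):
    """Filter-style ranking: feature names sorted by |Pearson r| with the target."""
    return sorted(columns, key=lambda name: abs(pearson(columns[name], target)), reverse=True)


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


# Synthetic stand-in for ten years of weekly samples (hypothetical variables).
rng = random.Random(0)
n = 520
flow = [rng.uniform(0, 10) for _ in range(n)]
temperature = [rng.uniform(15, 30) for _ in range(n)]
# The synthetic target depends on `flow` only, plus Gaussian noise.
target = [2.0 * f + rng.gauss(0, 1.0) for f in flow]

columns = {"flow": flow, "temperature": temperature}
ranked = rank_features(columns, target)
best = ranked[0]

# Chronological split: first 80% of weeks for fitting, last 20% for testing.
split = int(0.8 * n)
slope, intercept = fit_line(columns[best][:split], target[:split])
predictions = [slope * x + intercept for x in columns[best][split:]]
test_rmse = rmse(target[split:], predictions)

print("ranked features:", ranked)
print("held-out RMSE:", round(test_rmse, 2))
```

A chronological split is used here because the samples are ordered in time; whether the study used this or another validation scheme is not stated in the abstract.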

Keywords

Effluent management, machine learning prediction models, primary treatment effluent, supervised learning algorithms, wastewater treatment

Document Type

Journal Article

Date of Publication

5-15-2026

Volume

172

Publication Title

Engineering Applications of Artificial Intelligence

Publisher

Elsevier

School

Mineral Recovery Research Centre / School of Engineering

Funders

Australian Government Research Training Program Scholarship / Water Corporation of Western Australia

Creative Commons License

Creative Commons Attribution 4.0 License

Comments

Khoshvaght, H., Permala, R. R., Razmjou, A., & Khiadani, M. (2026). Machine learning approaches for predicting biochemical oxygen demand and ammonium nitrogen: A decade-long weekly field study at a full-scale water resource recovery facility. Engineering Applications of Artificial Intelligence, 172, 114349. https://doi.org/10.1016/j.engappai.2026.114349

Link to publisher version (DOI)

10.1016/j.engappai.2026.114349