Elevating hourly PM2.5 forecasting in Istanbul, Türkiye: Leveraging ERA5 reanalysis and genetic algorithms in a comparative machine learning model analysis
Summary
This summary is machine-generated. Accurate PM2.5 prediction is crucial for urban air quality management. The Nonlinear Autoregressive with Exogenous Inputs (NARX) model, optimized with a genetic algorithm, demonstrated superior performance in predicting PM2.5 concentrations in Istanbul.
Area Of Science
- Environmental Science
- Data Science
- Urban Planning
Background
- Rapid urbanization and industrialization increase air pollution, posing severe health risks.
- Accurate prediction of fine particulate matter (PM2.5) is essential for effective urban air quality management.
- Istanbul, a densely populated city, faces complex air pollution challenges due to diverse emission sources.
Purpose Of The Study
- To assess and compare the predictive accuracy of various machine learning (ML) models for PM2.5 concentrations in Istanbul.
- To investigate the spatial variability of PM2.5 across different regions of Istanbul using high-resolution data.
- To identify key meteorological, land cover, and vegetation variables influencing PM2.5 concentrations.
Main Methods
- Utilized high-resolution ERA5 reanalysis data for grid-based spatial analysis of Istanbul.
- Compared Multiple Linear Regression (MLR), Extreme Gradient Boosting (XGBoost), Light Gradient Boosting (LGB), Random Forest (RF), and Nonlinear Autoregressive with Exogenous Inputs (NARX) models.
- Employed genetic algorithm optimization for the NARX model and Neighborhood Component Analysis (NCA) for variable selection.
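To make the NARX and genetic algorithm steps concrete, the sketch below shows one way such a pipeline could look. It is not the study's implementation: the summary does not say which NARX hyperparameters the genetic algorithm tuned, so this example assumes it searches over the two lag orders, and it swaps the NARX neural network for a linear least-squares fit to keep the code short. All function names (`make_narx_matrix`, `validation_rmse`, `ga_select_lags`) and parameter values are hypothetical.

```python
import numpy as np

def make_narx_matrix(y, x, y_lags, x_lags):
    """Row for time t holds y[t-1..t-y_lags] and x[t-1..t-x_lags]; target is y[t]."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    start = max(y_lags, x_lags)
    rows = [np.concatenate([y[t - y_lags:t][::-1], x[t - x_lags:t][::-1]])
            for t in range(start, len(y))]
    return np.array(rows), y[start:]

def validation_rmse(y, x, y_lags, x_lags, split=0.7):
    """Fit a linear stand-in model on the first part, score on the rest."""
    X, target = make_narx_matrix(y, x, y_lags, x_lags)
    X = np.column_stack([np.ones(len(X)), X])  # intercept column
    n_train = int(split * len(X))
    coef, *_ = np.linalg.lstsq(X[:n_train], target[:n_train], rcond=None)
    resid = target[n_train:] - X[n_train:] @ coef
    return float(np.sqrt(np.mean(resid ** 2)))

def ga_select_lags(y, x, max_lag=6, pop_size=8, generations=10, seed=0):
    """Tiny genetic algorithm over (y_lags, x_lags); assumed search space."""
    rng = np.random.default_rng(seed)
    pop = rng.integers(1, max_lag + 1, size=(pop_size, 2))
    for _ in range(generations):
        fitness = np.array([validation_rmse(y, x, a, b) for a, b in pop])
        # Selection: keep the better half unchanged (elitism)
        parents = pop[np.argsort(fitness)[: pop_size // 2]]
        children = []
        while len(children) < pop_size - len(parents):
            p1, p2 = parents[rng.integers(len(parents), size=2)]
            child = np.array([p1[0], p2[1]])      # crossover: one gene from each parent
            child += rng.integers(-1, 2, size=2)  # mutation: shift each lag by -1, 0, or +1
            children.append(np.clip(child, 1, max_lag))
        pop = np.vstack([parents, children])
    fitness = np.array([validation_rmse(y, x, a, b) for a, b in pop])
    best = pop[np.argmin(fitness)]
    return int(best[0]), int(best[1]), float(fitness.min())
```

In this sketch `y` would be the hourly PM2.5 series and `x` a single exogenous driver such as a reanalysis variable; a realistic setup would stack several exogenous series and use a nonlinear learner in place of the least-squares fit.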
Main Results
- The NARX model demonstrated superior performance (R-value: 0.89, RMSE: 5.24 μg/m³, MAE: 2.94 μg/m³), outperforming RF, LGB, XGBoost, and MLR.
- PM2.5 prediction accuracy varied seasonally, with better performance in autumn and winter.
- Highest accuracy was observed in Region-1 (R-value: 0.94), while Region-5 showed the lowest (R-value: 0.75).
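The three scores reported above can be computed from paired observed and predicted concentrations as in the following sketch. It assumes the "R-value" is the Pearson correlation coefficient, which the summary does not state explicitly; the function name `evaluate` and the sample values are made up for illustration and are not data from the study.

```python
import numpy as np

def evaluate(observed, predicted):
    """Return R (assumed Pearson correlation), RMSE, and MAE for paired series."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    err = pred - obs
    return {
        "R": float(np.corrcoef(obs, pred)[0, 1]),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),  # penalizes large errors more
        "MAE": float(np.mean(np.abs(err))),         # average absolute error
    }

# Illustrative hourly values in µg/m³ (not from the study)
obs = [10.0, 12.0, 15.0, 20.0]
pred = [11.0, 11.0, 17.0, 18.0]
scores = evaluate(obs, pred)
```

Because RMSE squares each error before averaging, it is never smaller than MAE. A gap between the two, as in the reported 5.24 versus 2.94 μg/m³, is consistent with a minority of hours having comparatively large errors.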
Conclusions
- The optimized NARX model is robust for predicting PM2.5 in complex urban environments, even with limited monitoring data.
- The methodology offers potential for global application in similar urban air quality management contexts.
- Advanced data analysis techniques are vital for developing targeted pollution control strategies and public health policies.

