Development of Machine Learning Algorithms to Predict Urban Air Quality Index (Study Area: Tehran City)

1 MA, Faculty of Surveying and Geomatics Engineering, University of Tehran, Tehran, Iran

2 MA, Faculty of Surveying and Geomatics Engineering, University of Tehran, Tehran, Iran

3 Researcher, Young Researchers and Elite Club, Mashhad Branch of Islamic Azad University, Mashhad, Iran

4 MA, Faculty of Surveying Engineering, K. N. Toosi University of Technology, Tehran, Iran

5 Research Group of Drought and Climate Change, University of Birjand


Given the harm that air pollution causes to human health and the environment, mitigating it requires accurate knowledge of the pollutants and the criteria that affect them, as well as identification of polluted areas. Mathematical models in the form of machine learning offer a cost-efficient approach to air pollution modeling. This research is applied in purpose and descriptive-analytical in method. Its novelty lies in a new combined approach for determining the effective criteria for predicting the level of air pollution. The purpose of this study was to evaluate and compare the capabilities of two machine learning models, Support Vector Machine (SVM) and Random Forest (RF), in combination with a Genetic Algorithm (GA), for predicting air pollution in Tehran. The data comprise particulate matter and gaseous pollutant measurements in Tehran in 2020, obtained from the Tehran Traffic Control Company. MATLAB and ArcMap software were used to analyze the data. The coefficient of determination (R²) obtained from the combined RF-GA method was 0.997, indicating a close fit of this model to the data of this study. Moreover, the Root Mean Square Error (RMSE) of the combined RF-GA method was 0.153, indicating high accuracy. Based on the data obtained from the Tehran Traffic Control Company, the results indicate that the RF method is an appropriate model for estimating the level of air pollution in Tehran.
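The two evaluation metrics reported above, R² and RMSE, can be illustrated with a minimal Python sketch. The study itself used MATLAB and does not state which formula it applied for R²; the version below assumes the common definition 1 - SS_res/SS_tot. The function names and the sample values are hypothetical and are not taken from the study's data.

```python
import math


def rmse(observed, predicted):
    """Root Mean Square Error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical pollution index values, for illustration only.
observed = [50.0, 80.0, 120.0, 150.0]
predicted = [52.0, 78.0, 121.0, 149.0]

print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"R^2  = {r_squared(observed, predicted):.3f}")
```

RMSE is expressed in the same units as the predicted quantity, so a value such as 0.153 can only be judged against the scale of the target variable, whereas R² is unitless and approaches 1 as the model explains more of the variance in the observations.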

