Weather impact on public transport ridership : Predicting passenger load using machine learning

The purpose of this thesis was to investigate the correlation between weather and the number of bus passengers in a mid-sized city in the Nordics. The aim was to build a model that could predict bus ridership from weather forecasts. Such predictions could be used to optimize bus capacity.

Passenger data was collected using an Automatic Passenger Counting solution provided by the company FARA. Weather data was collected from public sources. The data covers the period from 20 July 2022 to 20 August 2023. The main weather features this thesis focused on were temperature, snow, and rain. The correlation between weather and the number of passengers was analyzed using linear regression, gradient boosting, and other machine learning models. The Python library scikit-learn was used for linear regression and machine learning, and the Python library XGBoost for gradient boosting.
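As a rough illustration of the multivariable linear regression step, the following sketch fits scikit-learn's LinearRegression to three weather features. Everything in it is an assumption made for demonstration: the data is synthetic (not the thesis dataset), the feature names and the relationship between weather and ridership are invented, and the thesis does not state how its data was split or scored.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; NOT the thesis dataset.
rng = np.random.default_rng(0)
n_days = 400
temperature_c = rng.uniform(-25, 25, n_days)
snow_depth_cm = np.clip(rng.normal(10, 15, n_days), 0, None)
rain_mm = rng.exponential(2.0, n_days)

# Hypothetical relationship, chosen only so the demo has a signal to find.
passengers = (
    5000
    - 20 * temperature_c
    + 5 * snow_depth_cm
    + 10 * rain_mm
    + rng.normal(0, 300, n_days)
)

feature_names = ["temperature_c", "snow_depth_cm", "rain_mm"]
X = np.column_stack([temperature_c, snow_depth_cm, rain_mm])

# Hold out a quarter of the days to score the model on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, passengers, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)

# score() returns the coefficient of determination (R^2) on the held-out days.
r2 = model.score(X_test, y_test)
coefficients = dict(zip(feature_names, model.coef_))

print(f"R^2 on held-out days: {r2:.2f}")
for name, value in coefficients.items():
    print(f"{name}: {value:+.1f} passengers per unit")
```

A negative temperature coefficient in such a fit would correspond to ridership growing on colder days; in this sketch that sign is built into the synthetic data and says nothing about the real measurements.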

Linear regression, both single-variable and multivariable, turned out to be inaccurate in predicting the number of passengers: the linear regression models had accuracies below 30%. With machine learning, predicting the exact, or even close to exact, number of passengers also proved difficult. However, when the number of passengers was divided into four or five categories, the results were reasonably accurate, and the extreme ends of the scale in particular were categorized well. The results showed that the number of passengers grew on colder days. Rain also had a slight increasing effect on bus ridership. When predicting the number of passengers using XGBoost, temperature was the most significant feature, followed by snow depth. Model accuracy was below 70%, which is average.
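The step of dividing passenger counts into categories can be sketched as follows. The abstract does not say how the category boundaries were chosen, so this example assumes quantile-based boundaries; the function name `to_load_categories` and the daily counts are made up for illustration.

```python
import bisect
import statistics


def to_load_categories(passenger_counts, n_categories=4):
    """Map each count to a category index from 0 (lowest) to n_categories - 1.

    Boundaries are the interior quantiles of the counts (an assumption;
    the thesis does not describe its binning rule).
    """
    edges = statistics.quantiles(passenger_counts, n=n_categories)
    # bisect_right counts how many boundaries lie at or below the value,
    # which is exactly the category index.
    labels = [bisect.bisect_right(edges, count) for count in passenger_counts]
    return labels, edges


# Made-up daily passenger counts, for illustration only.
daily_counts = [3100, 4200, 5150, 4800, 2900, 6100, 5600, 4400]
labels, edges = to_load_categories(daily_counts, n_categories=4)

print(edges)   # three boundaries separating four categories
print(labels)  # one category index per day
```

The resulting labels could then serve as the target of a classifier in place of the raw counts, turning the regression problem into a four- or five-class classification problem.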

The impact of temperature was stronger on weekdays (Monday to Friday) than on weekends. It was not clear whether there is a causal relationship between weather and the number of passengers. The changes in the number of passengers were small. Using the model to optimize bus capacity might therefore not be reasonable. Without further study of the data, the current machine learning model is not suitable for commercial use, but it provides a good base for fine-tuning and improvements.