Taxi Service: Time-Series Demand Forecasting

Project details:

This project builds a time-series model that forecasts hourly taxi orders for a service operating in a major city. After resampling data, creating lag features and adding rolling statistics, several models were evaluated. The final solution delivers accurate short-term forecasts and helps the company plan driver allocation during high-demand periods.

Description

Business Context & Problem

Taxi services must balance supply and demand. Too few drivers lead to long wait times and cancelled rides; too many increase idle time and operational costs. Forecasting upcoming demand helps the company schedule drivers effectively. This project focuses on predicting hourly order volume using historical data.

Data & Analytical Approach

The dataset consisted of timestamped order counts. After converting it into a clean and regular hourly time series, seasonal patterns and weekly trends were explored. Feature engineering introduced lag features (previous hours’ demand), rolling averages and other time-dependent variables that help capture temporal structure. The data was then split chronologically to ensure realistic model evaluation.
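The preparation steps described above can be sketched with pandas as follows. This is a minimal illustration, not the project's actual code: the column name `num_orders`, the 10-minute raw granularity, the synthetic Poisson data, the helper `make_features`, and the lag/window sizes and 90/10 split are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd


def make_features(hourly: pd.Series, max_lag: int = 24, window: int = 24) -> pd.DataFrame:
    """Build calendar, lag and rolling-mean features for an hourly series."""
    df = pd.DataFrame({"num_orders": hourly})
    # Calendar features let a model pick up daily and weekly patterns
    df["hour"] = df.index.hour
    df["dayofweek"] = df.index.dayofweek
    # Lag features: demand observed 1..max_lag hours earlier
    for lag in range(1, max_lag + 1):
        df[f"lag_{lag}"] = df["num_orders"].shift(lag)
    # Shift before rolling so the average never includes the hour being predicted
    df["rolling_mean"] = df["num_orders"].shift(1).rolling(window).mean()
    # The first rows have no complete history, so they are dropped
    return df.dropna()


# Synthetic stand-in for the raw data: order counts in 10-minute bins for two weeks
rng = np.random.default_rng(0)
index = pd.date_range("2018-03-01", periods=6 * 24 * 14, freq="10min")
raw = pd.Series(rng.poisson(3, size=len(index)), index=index, name="num_orders")

# Regular hourly series: total orders per hour
hourly = raw.resample("1h").sum()
features = make_features(hourly)

# Chronological split: the test period lies strictly after the training period
split = int(len(features) * 0.9)
train, test = features.iloc[:split], features.iloc[split:]
```

Shifting the series by one step before computing the rolling mean is the important design choice here: without it, the current hour's value would leak into its own feature.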

Statistical / ML Analysis

Several regression models were tested, including linear models and gradient boosting. Since time-series prediction is sensitive to overfitting, cross-validation was performed using time-aware splits, in which every validation fold lies after the data used for training. LightGBM showed the strongest performance, handling non-linear patterns and feature interactions well. Model quality was assessed with RMSE, the metric used for the project’s required accuracy threshold.
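To illustrate what a time-aware split looks like, here is a small self-contained sketch using only the standard library. It is not the project's implementation: the function names `rmse` and `expanding_window_splits`, the synthetic demand series, and the hour-of-day average used as a stand-in "model" are assumptions for the example. The fold layout follows the same expanding-window idea as scikit-learn's `TimeSeriesSplit`, which would be the usual tool in practice.

```python
import math
from typing import Iterator, Sequence, Tuple


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def expanding_window_splits(n_samples: int, n_splits: int) -> Iterator[Tuple[range, range]]:
    """Yield (train, validation) index ranges; validation always follows training."""
    fold = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train_end = n_samples - (n_splits - k + 1) * fold
        yield range(0, train_end), range(train_end, train_end + fold)


# Synthetic hourly demand: a daily cycle plus a slow upward trend (10 days)
demand = [50 + 20 * math.sin(2 * math.pi * t / 24) + 0.1 * t for t in range(240)]

scores = []
for train_idx, val_idx in expanding_window_splits(len(demand), n_splits=4):
    # "Fit" a toy model: average demand per hour of day, from training hours only
    sums, counts = [0.0] * 24, [0] * 24
    for t in train_idx:
        sums[t % 24] += demand[t]
        counts[t % 24] += 1
    profile = [s / c for s, c in zip(sums, counts)]

    # Score on the later, unseen hours
    preds = [profile[t % 24] for t in val_idx]
    scores.append(rmse([demand[t] for t in val_idx], preds))

mean_rmse = sum(scores) / len(scores)
print(f"RMSE per fold: {[round(s, 2) for s in scores]}, mean: {mean_rmse:.2f}")
```

Because each fold trains only on earlier hours, the averaged RMSE reflects how a model would behave when forecasting the future, which a shuffled split would overstate.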

The evaluated models are compared in the table below, with RMSE reported on both the test and training sets:

Model name            | RMSE Test         | RMSE Train         | Depth | Estimators
LinearRegression      | 53.51571051157058 | 30.22368381335574  | n/a   | n/a
DecisionTreeRegressor | 48.08733972353548 | 13.849317412647133 | 11    | n/a
RandomForestRegressor | 43.80042268131265 | 8.658164216422168  | 19    | 95

Key Insights & Final Recommendations

The analysis revealed strong weekly seasonality and noticeable peaks during specific hours. Incorporating lag features and rolling statistics greatly improved forecast accuracy.
The final model enables the company to anticipate upcoming spikes in demand and allocate the right number of drivers at the right time, improving service quality and reducing unnecessary costs.
