Introduction
This school project analyzes and predicts CitiBike ridership in NYC. We applied time series analysis and network analysis, built several predictive models, and identified factors that affect CitiBike ridership, such as time, location, and user.
Methods
- Conducted time series analysis on NYC CitiBike ridership data, identified its seasonal trends, and produced a forecast of daily ridership counts.
- Performed network analysis on the geographical data. The centrality test showed that the top stations are at tourist attractions such as the Manhattan Bridge and the Empire State Building. Partitioned the stations into neighborhoods that match the boroughs of NYC.
- Built a neural network model and a set of tree-based models to predict usage counts. The best model achieved an R² score of 0.72.
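
For readers unfamiliar with the metric, the R² score used to compare the models can be sketched as follows. This is a minimal illustration of the standard definition (1 minus the ratio of residual to total sum of squares), not the project's actual evaluation code. The function name `r2_score` and the sample ride counts are assumptions made up for the example.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the observed values are constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical daily ride counts, for illustration only (not project data).
observed = [100, 150, 200, 250, 300]
predicted = [110, 140, 210, 240, 310]
score = r2_score(observed, predicted)
print(f"R^2 = {score:.2f}")
```

Under this definition, a score of 1.0 means the predictions match the observations exactly, and a score of 0.0 means the model does no better than always predicting the mean count.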