This is our solution to the data science hackathon by Yassir. Our team, indus.csv, ranked 3rd on the public leaderboard and 2nd on the private leaderboard.
Established in 2017, Yassir is the leading ride-hailing company in Algeria. It covers all major Algerian cities and is expanding its services to Tunisia, Morocco and France. Besides ride-hailing services, Yassir is making customers’ lives easier by providing diversified services such as goods and food delivery as well as telemedicine.
The objective of this hackathon is to predict the estimated time of arrival at the dropoff point for a single Yassir journey. Our model used more than 50 features. Some of these features were created by clustering the geographical data (latitude and longitude) into 300 clusters.
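As a minimal sketch of the clustering step described above, the snippet below groups coordinates into 300 clusters and attaches the cluster ids as features. The choice of KMeans, the column names (`origin_lat`, `origin_lon`, `dest_lat`, `dest_lon`) and the synthetic trips are assumptions made for illustration; they are not taken from our actual pipeline.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans


def add_cluster_features(df, n_clusters=300, seed=0):
    """Fit one KMeans model on all pickup and dropoff points, then label each trip.

    Column names are hypothetical; adapt them to the real dataset.
    """
    origin = df[["origin_lat", "origin_lon"]].to_numpy()
    dest = df[["dest_lat", "dest_lon"]].to_numpy()

    # Stack origins and destinations so both ends of a trip share the same cluster ids.
    points = np.vstack([origin, dest])
    km = KMeans(n_clusters=n_clusters, n_init=3, random_state=seed).fit(points)

    out = df.copy()
    out["origin_cluster"] = km.predict(origin)
    out["dest_cluster"] = km.predict(dest)
    return out, km


# Synthetic trips in an arbitrary bounding box (illustrative values only).
rng = np.random.default_rng(0)
n_trips = 2000
trips = pd.DataFrame({
    "origin_lat": rng.uniform(36.6, 36.8, n_trips),
    "origin_lon": rng.uniform(2.9, 3.2, n_trips),
    "dest_lat": rng.uniform(36.6, 36.8, n_trips),
    "dest_lon": rng.uniform(2.9, 3.2, n_trips),
})

trips_with_clusters, kmeans = add_cluster_features(trips)
```

The resulting `origin_cluster` and `dest_cluster` columns can be fed to tree-based models as categorical or integer features, and per-cluster aggregates can be derived from them.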
In the end, we used a weighted average of fine-tuned XGBoost and CatBoost models and a default Random Forest. The ensemble scored around 140 seconds in local cross-validation and 148.51 seconds on the final (private) leaderboard. It is encouraging that our predictions are only about 2 minutes and 20 seconds off on average.
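The weighted average of model predictions mentioned above can be sketched as follows. The weights and the prediction values are made up for illustration and are not the ones we used; in practice each array would come from the `predict` method of the corresponding fitted model.

```python
import numpy as np


def weighted_average(predictions, weights):
    """Blend per-model predictions of shape (n_models, n_samples) into one vector."""
    preds = np.asarray(predictions, dtype=float)
    w = np.asarray(weights, dtype=float)
    if preds.shape[0] != w.shape[0]:
        raise ValueError("need exactly one weight per model")
    # np.average divides by the sum of the weights, so they need not sum to 1.
    return np.average(preds, axis=0, weights=w)


# Hypothetical ETA predictions in seconds for two trips.
xgb_pred = [600.0, 900.0]
cat_pred = [620.0, 880.0]
rf_pred = [580.0, 940.0]

# Hypothetical weights favouring the two tuned boosting models.
blended = weighted_average([xgb_pred, cat_pred, rf_pred], [0.4, 0.4, 0.2])
```

One way to choose the weights is to evaluate a few candidate combinations on out-of-fold predictions and keep the one with the lowest cross-validation error.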
We used several open-source Python packages and frameworks: pandas, NumPy, scikit-learn (including its Random Forest), CatBoost, LightGBM, Plotly and Matplotlib.
- MELLAK Rostane Mohamed Zakari - Industrial Engineering student at Polytechnic School of Algiers - (ENP Alger)
- SOUAMES Mohamed Annis - Industrial Engineering student at Polytechnic School of Algiers - (ENP Alger)
- MOHAMMEDI Larbi Abderrahmane - Industrial Engineering student at Polytechnic School of Algiers - (ENP Alger)
- MEBREK Brahim - Industrial Engineering student at Polytechnic School of Algiers - (ENP Alger)
We would like to thank the organizers and all the participants from our school for taking part in this hackathon.