Implement a machine learning pipeline to predict whether businesses in Chicago will survive their first 2 years.
Get the full dataset
cd data
sh get_fullfiles.sh
All package requirements are listed in environment.yml.
To clone the environment, run the following:
conda env create -f environment.yml
To activate the environment, run the following:
conda activate myenv
python setup.py install
py.test
python main.py --config ./configs/acs_geo.yml
The config files offer different combinations of features, drawn from ACS, reported 311, reported crime, and business license data, that you can choose from.
The results of the pipeline are saved in the output folder.
Under the performance folder, there are CSVs recording the performance of all models.
Under the pr folder, there are precision-recall graphs.
Under the roc folder, there are ROC graphs.
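As a rough illustration of working with these outputs, the sketch below scans the performance folder for CSV files and returns the row with the highest value in a chosen metric column. The path `output/performance`, the column name `auc_roc`, and the helper `best_row` are assumptions made for illustration; check the headers of the CSVs the pipeline actually writes and adjust accordingly.

```python
import csv
import glob
import os


def best_row(performance_dir, metric):
    """Return (filename, row) with the highest value of `metric`, or None.

    Both arguments are hypothetical: the folder layout and the column
    names depend on what the pipeline actually writes.
    """
    best = None
    pattern = os.path.join(performance_dir, "*.csv")
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                try:
                    score = float(row[metric])
                except (KeyError, TypeError, ValueError):
                    # Skip rows that lack the metric or hold non-numeric text.
                    continue
                if best is None or score > best[0]:
                    best = (score, os.path.basename(path), row)
    if best is None:
        return None
    return best[1], best[2]


if __name__ == "__main__":
    # Assumed location and metric name; adjust to the real output.
    result = best_row(os.path.join("output", "performance"), "auc_roc")
    if result is None:
        print("No usable rows found.")
    else:
        filename, row = result
        print(filename, row)
```

Run it from the repository root after the pipeline has finished. If no CSV contains the requested column, the helper returns None rather than raising an error.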
This project is licensed under the MIT License - see the LICENSE.md file for details
This project is the final project for Machine Learning for Public Policy at the University of Chicago.
- Supervised by Professor Rayid Ghani
- Inspired by Satej