This repository was created to document my progress and learnings from the book "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 3rd Edition" by Aurélien Géron.
This repository contains code examples, exercises, and projects related to the concepts covered in the book.
You can find the official tutorials page here: https://colab.research.google.com/github/ageron/handson-ml3/blob/main/index.ipynb
And the official code examples here: https://github.com/ageron/handson-ml3
To ensure a consistent environment when working locally with Python, you can use pyenv to manage Python versions and Poetry for dependency management.
In the tutorials directory, you can find basic instructions on how to install and use pyenv and Poetry.
Alternatively, you can use Google Colab to write and execute Python code in the browser.
It's also recommended to have prior knowledge of the following Python libraries: NumPy, Pandas, and Matplotlib. In the tutorials directory, you'll also find basic usage examples for each of these libraries, as well as examples of the basic math required (such as linear algebra) and other related materials.
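To confirm that a local environment has the libraries mentioned above, a small check like the following can help. This is a minimal sketch using only the standard library; the package list (including scikit-learn) and the helper name installed_versions are assumptions for illustration and are not part of this repository.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# Distribution names as published on PyPI; this list is an assumption,
# adjust it to match the dependencies you actually use.
PACKAGES = ["numpy", "pandas", "matplotlib", "scikit-learn"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    print(f"Python {sys.version.split()[0]}")
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name:<14}{ver or 'not installed'}")
```

Running the script inside the environment created with Poetry (or in a Colab cell) prints one line per package, with "not installed" for anything missing.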
In the projects directory, you'll find the source code of the project developed in each chapter.
There are thousands of open datasets to choose from, ranging across all sorts of domains. Here are a few places you can look to get data:
Popular open data sources:
- OpenML.org
- Kaggle
- Hugging Face Datasets
- Papers with Code
- UC Irvine Machine Learning Repository
- Amazon's AWS datasets
- TensorFlow Datasets
- Google's Dataset Search
- Microsoft's Open Data
- Data.gov
- EU Open Data Portal
Meta portals:
- Data Portals
- Awesome Public Datasets
- Open Data Monitor
- Wikipedia's list of datasets
- Quora's list of datasets
- Reddit's r/datasets
This checklist can guide you through your machine learning projects. There are eight main steps:
- Frame the problem and look at the big picture.
- Get the data.
- Explore the data to gain insights.
- Prepare the data to better expose the underlying data patterns to machine learning algorithms.
- Explore many different models and shortlist the best ones.
- Fine-tune your models and combine them into a great solution.
- Present your solution.
- Launch, monitor, and maintain your system.
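The code-centric steps of the checklist above (getting, exploring, and preparing data, then shortlisting and fine-tuning models) can be sketched with scikit-learn roughly as follows. This is an illustrative sketch, not the book's code: the synthetic dataset, the two candidate models, and the small hyperparameter grid are all assumptions chosen to keep the example self-contained. Framing the problem, presenting the solution, and launching the system are not shown because they are not purely coding tasks.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Get the data: a synthetic stand-in (assumption) for a real dataset.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 3))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=300)

# Set a test set aside before looking closely at the data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Explore the data: here, just a quick look at per-feature statistics.
print("feature means:", X_train.mean(axis=0).round(2))
print("feature stds: ", X_train.std(axis=0).round(2))

# Prepare the data and explore several models: each candidate is a
# pipeline so that scaling is fitted only on the training folds.
candidates = {
    "linear": make_pipeline(StandardScaler(), LinearRegression()),
    "forest": make_pipeline(
        StandardScaler(),
        RandomForestRegressor(n_estimators=50, random_state=42),
    ),
}
cv_rmse = {}
for name, model in candidates.items():
    scores = cross_val_score(
        model, X_train, y_train,
        scoring="neg_root_mean_squared_error", cv=5,
    )
    cv_rmse[name] = -scores.mean()
print("cross-validated RMSE:", cv_rmse)

# Fine-tune: a tiny grid search over one hyperparameter of the forest.
search = GridSearchCV(
    candidates["forest"],
    param_grid={"randomforestregressor__max_depth": [3, None]},
    scoring="neg_root_mean_squared_error",
    cv=3,
)
search.fit(X_train, y_train)

# Evaluate the tuned model once on the held-out test set.
test_rmse = mean_squared_error(y_test, search.predict(X_test)) ** 0.5
print("best params:", search.best_params_)
print("test RMSE:", round(test_rmse, 3))
```

On a real project, the synthetic arrays would be replaced by a dataset from one of the sources listed earlier, and the exploration step would typically involve plots and correlation analysis rather than two print statements.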