Multi-class Classification examples #28
Signed-off-by: mshtelma <[email protected]>
…dded hyperparameter tuning to the multi-class example
@@ -1,4 +1,4 @@
mlflow>=2.0
Is there a reason we removed the mlflow requirement?
Thanks, good catch! Added it back.
ipykernel>=6.12
ipython>=7.32
flaml
Do we need flaml in the top-level requirements? Or would it be better to install it in the notebook, or to create a sub-requirements file for this example? Thoughts?
That's a good point. I think there are other examples that use FLAML by default (at least I saw some PRs), so this might be a good idea. This is not a big dependency. I am happy to move it inside the multi-class folder as well.
I recall flaml being problematic to install because the lightgbm installation is difficult. Has this been fixed? If not, I'd prefer not to include it. It's not used in this example anyway.
@@ -0,0 +1,26 @@
# Binary classification: Is this bottle of wine red or white?
This is the root directory for an example project for the
[MLflow Classification Recipe](https://mlflow.org/docs/latest/recipes.html#classification-recipe).
[MLflow Classification Recipe] --> [MLflow Multi Class Classification Recipe]
What do you think?
# COMMAND ----------

# MAGIC %pip install -r ../../requirements.txt
# MAGIC %pip install git+https://github.com/mshtelma/mlflow.git@multiclassclassification
Remove this once the MLflow release that includes this change is out.
We just released MLflow 2.1.0 so we should be good to remove this now :)
# COMMAND ----------

r.run("split")
Can we do a bit of EDA here before proceeding to split?
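A minimal sketch of the kind of EDA that could go before the split, assuming the data is available as a pandas DataFrame with a `class` label column (as in the ingest loader). The feature column names and the inline rows are illustrative stand-ins, not the contents of `./data/iris.csv`.

```python
import pandas as pd

# Illustrative stand-in for the ingested data; in the notebook this would be
# the DataFrame produced by the ingest step rather than inline rows.
df = pd.DataFrame(
    {
        "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],  # hypothetical column name
        "petal_length": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],  # hypothetical column name
        "class": [0, 0, 1, 1, 2, 2],
    }
)

# Class balance, useful to know before splitting.
class_counts = df["class"].value_counts().sort_index()
print(class_counts)

# Missing values per column.
missing = df.isna().sum()
print(missing)

# Per-class feature means give a first hint of how separable the classes are.
per_class_means = df.groupby("class").mean()
print(per_class_means)
```

Any of these summaries could be shown in its own notebook cell ahead of `r.run("split")`.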
# COMMAND ----------

trained_model = r.get_artifact("model")
Can we use the model to make a prediction on one hand-generated input example here?
# For different options please read: https://github.com/mlflow/recipes-classification-template#ingest-step
using: csv
loader_method: load_file_as_dataframe
location: "./data/iris.csv"
Could this be an open-access Databricks Delta table instead? Does one exist?
@@ -0,0 +1,31 @@
experiment:
  name: "sklearn_classification_experiment"
Update the experiment name so that it does not collide with the binary classification example.
INGEST_SCORING_CONFIG:
  # For different options please read: https://github.com/mlflow/recipes-classification-template#batch-scoring
  using: csv
  location: "./data/iris.csv"
It'd be nice to use a different dataset for scoring, if such a dataset exists. Otherwise, you can create one. See https://github.com/mlflow/recipes-examples/blob/main/classification/profiles/local.yaml#L26 for an example.
import pandas

df = pandas.read_csv(file_path, sep=",")
df["class"] = df["class"].astype("category").cat.codes
What does this line do? What happens if there's no such conversion?
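For reference, a minimal sketch of what this conversion does, assuming the `class` column holds string labels (the label strings below are illustrative, not taken from the data file). `astype("category").cat.codes` replaces each label with an integer code, and the codes follow the sorted order of the distinct labels. Without the conversion the column simply keeps its string labels; whether the later recipe steps accept that is not something this sketch demonstrates.

```python
import pandas as pd

# Illustrative labels; the real values come from the CSV's "class" column.
df = pd.DataFrame({"class": ["setosa", "virginica", "versicolor", "setosa"]})

as_category = df["class"].astype("category")

# Categories inferred from strings are stored in sorted order,
# so the integer code of a label is its position in this list.
label_names = list(as_category.cat.categories)

# Replace the string labels with their integer codes.
df["class"] = as_category.cat.codes

# Mapping that allows predictions to be translated back to label names.
code_to_label = dict(enumerate(label_names))

print(label_names)
print(df["class"].tolist())
print(code_to_label)
```

Keeping something like `code_to_label` around is one way to make the integer predictions readable again after training.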
transformers.
"""

return None
Add a comment indicating that returning None results in an identity transformer, i.e. the features are passed through unchanged.
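To illustrate what such a comment would be describing, here is a minimal sketch, assuming that returning `None` means the features pass through unchanged. `IdentityTransformer` is a hypothetical stand-in written for this illustration; it is not an MLflow or scikit-learn class.

```python
import pandas as pd


class IdentityTransformer:
    """Hypothetical stand-in: fit learns nothing, transform returns its input."""

    def fit(self, X, y=None):
        # Nothing to learn for a pass-through transformer.
        return self

    def transform(self, X):
        # Features are returned exactly as they came in.
        return X


features = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
transformed = IdentityTransformer().fit(features).transform(features)

# The output has the same columns and values as the input.
print(transformed.equals(features))
```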
:return: A Series indicating whether each row should be filtered
"""

return Series(True, index=dataset.index)
Add a comment to indicate this doesn't filter out anything.
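A minimal sketch of why the returned Series filters nothing, assuming a `True` entry marks a row to keep (which is what this remark implies). The function name `create_dataset_filter` and the sample rows are assumptions made for the illustration.

```python
import pandas as pd
from pandas import DataFrame, Series


def create_dataset_filter(dataset: DataFrame) -> Series:
    # All-True mask: assuming True marks a row to keep, no row is filtered out.
    return Series(True, index=dataset.index)


# Illustrative rows only.
dataset = pd.DataFrame({"x": [1, 2, 3], "class": [0, 1, 2]})

mask = create_dataset_filter(dataset)

# Applying the mask with boolean indexing leaves every row in place.
kept = dataset[mask]
print(len(kept) == len(dataset))
```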
@@ -0,0 +1,10 @@
from steps.transform import transformer_fn
Why introduce an empty file, multi-class-classification/tests/train_test.py, above?
No description provided.