generated from databricks-industry-solutions/industry-solutions-blueprints
04_AutoML_churn_prediction.py
# Databricks notebook source
# MAGIC %md This notebook is available at https://github.com/databricks-industry-solutions/graph-analytics-churn-prediction.
# COMMAND ----------
# MAGIC %md-sandbox
# MAGIC
# MAGIC ## Accelerating Churn model creation using Databricks Auto-ML
# MAGIC ### A glass-box solution that empowers data teams without taking away control
# MAGIC
# MAGIC Bootstrapping new ML projects can still be long and inefficient.
# MAGIC
# MAGIC Instead of creating the same boilerplate for each new project, Databricks Auto-ML can automatically generate state-of-the-art models for classification, regression, and forecasting.
# MAGIC
# MAGIC
# MAGIC <img width="1000" src="https://github.com/QuentinAmbard/databricks-demo/raw/main/retail/resources/images/auto-ml-full.png"/>
# MAGIC
# MAGIC <img style="float: right" width="600" src="https://github.com/QuentinAmbard/databricks-demo/raw/main/retail/resources/images/churn-auto-ml.png"/>
# MAGIC
# MAGIC Models can be deployed directly, or you can use the generated notebooks to bootstrap projects with best practices, saving you weeks of effort.
# MAGIC
# MAGIC ### Using Databricks Auto-ML with our Churn dataset
# MAGIC
# MAGIC Auto-ML is available in the "Machine Learning" space. All we have to do is start a new Auto-ML experiment and select the feature table we just created (`churn_features`).
# MAGIC
# MAGIC Our prediction target is the `churn` column.
# MAGIC
# MAGIC Click on Start, and Databricks will do the rest.
# MAGIC
# MAGIC While this can be done in the UI, you can also use the [Python API](https://docs.databricks.com/applications/machine-learning/automl.html#automl-python-api-1), as the cells below do.
# COMMAND ----------
from databricks import automl, feature_store
# COMMAND ----------
catalog = "hive_metastore"
db_name = "telco"
# COMMAND ----------
fs = feature_store.FeatureStoreClient()
# COMMAND ----------
customer_features = fs.read_table(name=f"{catalog}.{db_name}.telco_churn_customer_features")
# COMMAND ----------
graph_features = fs.read_table(name=f"{catalog}.{db_name}.telco_churn_graph_features")
# COMMAND ----------
features = customer_features.join(graph_features, on='customer_id', how='left')
# COMMAND ----------
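# Launch an Auto-ML classification experiment on the joined feature set
summary = automl.classify(
    features,
    target_col="churn",                          # label column to predict
    exclude_frameworks=["sklearn", "lightgbm"],  # skip these model families
    exclude_cols=["customer_id"],                # identifier, not a predictive feature
    primary_metric="roc_auc",                    # metric used to rank trials
    timeout_minutes=5,                           # stop the search after 5 minutes
)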
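# COMMAND ----------
# MAGIC %md
# MAGIC Once the run finishes, the returned `summary` object can be inspected from code. The sketch below assumes that `summary` exposes a `best_trial` with `model_path`, `notebook_url`, and `metrics` attributes, as described in the Databricks AutoML Python API documentation. The helper name `describe_best_trial` and the stub values are hypothetical and are only there so the snippet also runs outside a Databricks workspace; in this notebook you would pass the real `summary` instead.
# COMMAND ----------
```python
from types import SimpleNamespace


def describe_best_trial(summary):
    """Collect the fields of the best Auto-ML trial that are most useful for follow-up work."""
    best = summary.best_trial
    return {
        "model_path": best.model_path,      # MLflow location of the trained model
        "notebook_url": best.notebook_url,  # link to the generated notebook for this trial
        "metrics": dict(best.metrics),      # metrics logged for this trial
    }


# Hypothetical stand-in for the object returned by automl.classify(...);
# every value below is a placeholder, not a real result.
stub_summary = SimpleNamespace(
    best_trial=SimpleNamespace(
        model_path="dbfs:/placeholder/model",
        notebook_url="https://example.invalid/notebook",
        metrics={"example_metric": 0.5},
    )
)

best_trial_info = describe_best_trial(stub_summary)
print(best_trial_info)
```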
# COMMAND ----------
# MAGIC %md
# MAGIC ### Using the generated notebook to build our model
# MAGIC
# MAGIC Next step: [Explore the generated Auto-ML sample notebook]($./05_AutoML_generated_notebook)
# COMMAND ----------