Exogenous Variables
1. Exogenous variables
Exogenous variables are external factors that provide additional information about the behavior of the target variable in time series forecasting. These variables, which are correlated with the target, can significantly improve predictions. Examples of exogenous variables include weather data, economic indicators, holiday markers, and promotional sales.
TimeGPT allows you to include exogenous variables when generating a forecast. This vignette will show you how to include them. It assumes you have already set up your API key. If you haven't done this, please read the Get Started vignette first.
2. Load data
For this vignette, we will use the electricity consumption dataset with exogenous variables included in nixtlar. This dataset contains hourly prices from five different electricity markets, along with two exogenous variables related to the prices and binary variables indicating the day of the week.
df_exo_vars <- nixtlar::electricity_exo_vars
head(df_exo_vars)
#>   unique_id                  ds     y Exogenous1 Exogenous2 day_0 day_1 day_2
#> 1        BE 2016-10-22 00:00:00 70.00      49593      57253     0     0     0
#> 2        BE 2016-10-22 01:00:00 37.10      46073      51887     0     0     0
#> 3        BE 2016-10-22 02:00:00 37.10      44927      51896     0     0     0
#> 4        BE 2016-10-22 03:00:00 44.75      44483      48428     0     0     0
#> 5        BE 2016-10-22 04:00:00 37.10      44338      46721     0     0     0
#> 6        BE 2016-10-22 05:00:00 35.61      44504      46303     0     0     0
#>   day_3 day_4 day_5 day_6
#> 1     0     0     1     0
#> 2     0     0     1     0
#> 3     0     0     1     0
#> 4     0     0     1     0
#> 5     0     0     1     0
#> 6     0     0     1     0
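To make the day-of-week indicators concrete, here is a minimal sketch of how such columns could be built from hourly timestamps in base R. It assumes that day_0 stands for Monday, which is inferred from the printed rows (2016-10-22 is a Saturday and day_5 is 1) rather than stated by the package. The object names ds, dow, and calendar_vars are illustrative.

```r
# Hourly timestamps for one day; UTC avoids daylight-saving gaps.
ds <- seq(as.POSIXct("2016-10-22 00:00:00", tz = "UTC"),
          by = "hour", length.out = 24)

# POSIXlt$wday counts from Sunday = 0, so shift it to Monday = 0
# (assumed convention, inferred from the dataset preview).
dow <- (as.POSIXlt(ds, tz = "UTC")$wday + 6) %% 7

# One 0/1 column per day of the week, named day_0 ... day_6.
day_dummies <- sapply(0:6, function(k) as.integer(dow == k))
colnames(day_dummies) <- paste0("day_", 0:6)

calendar_vars <- data.frame(ds = ds, day_dummies)
head(calendar_vars, 3)
```

Because calendar indicators like these depend only on the timestamp, they can be computed for any future date, which makes them convenient exogenous variables.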
When using exogenous variables, you must provide their future values for the entire forecast horizon; otherwise, TimeGPT will return an error. Ensure that the timestamps of the future exogenous variables exactly match the forecast horizon. For the electricity consumption dataset with exogenous variables, nixtlar provides their values for the next 24 steps ahead.
future_exo_vars <- nixtlar::electricity_future_exo_vars
head(future_exo_vars)
#>   unique_id                  ds Exogenous1 Exogenous2 day_0 day_1 day_2 day_3
#> 1        BE 2016-12-31 00:00:00      64108      70318     0     0     0     0
#> 2        BE 2016-12-31 01:00:00      62492      67898     0     0     0     0
#> 3        BE 2016-12-31 02:00:00      61571      68379     0     0     0     0
#> 4        BE 2016-12-31 03:00:00      60381      64972     0     0     0     0
#> 5        BE 2016-12-31 04:00:00      60298      62900     0     0     0     0
#> 6        BE 2016-12-31 05:00:00      60339      62364     0     0     0     0
#>   day_4 day_5 day_6
#> 1     0     1     0
#> 2     0     1     0
#> 3     0     1     0
#> 4     0     1     0
#> 5     0     1     0
#> 6     0     1     0
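Since a mismatch between the future exogenous values and the forecast horizon leads to an error, it can help to check the two data frames locally before calling the API. The sketch below is one possible check, not something nixtlar provides: check_future_coverage is a hypothetical helper, and hist_df and future_df are toy data invented for illustration.

```r
# Hypothetical helper, not part of nixtlar: checks that X_df has the same
# exogenous columns as df and exactly h future rows for every series.
check_future_coverage <- function(df, X_df, h, id_col = "unique_id",
                                  time_col = "ds", target_col = "y") {
  # Exogenous columns in the history must also exist in the future frame.
  exo_cols <- setdiff(names(df), c(id_col, time_col, target_col))
  missing_cols <- setdiff(exo_cols, names(X_df))

  # Count future rows per series, including series with no rows at all.
  ids <- unique(df[[id_col]])
  rows_per_id <- table(factor(X_df[[id_col]], levels = ids))
  wrong_length <- names(rows_per_id)[as.vector(rows_per_id) != h]

  list(ok = length(missing_cols) == 0 && length(wrong_length) == 0,
       missing_cols = missing_cols,
       wrong_length = wrong_length)
}

# Toy data: series "B" has only one future row, but h = 2 is requested.
hist_df <- data.frame(
  unique_id = rep(c("A", "B"), each = 3),
  ds = rep(sprintf("2024-01-01 %02d:00:00", 0:2), 2),
  y = c(1, 2, 3, 4, 5, 6),
  temp = c(10, 11, 12, 20, 21, 22)
)
future_df <- data.frame(
  unique_id = c("A", "A", "B"),
  ds = c("2024-01-01 03:00:00", "2024-01-01 04:00:00", "2024-01-01 03:00:00"),
  temp = c(13, 14, 23)
)

res <- check_future_coverage(hist_df, future_df, h = 2)
res$ok            # FALSE, because series "B" is one row short
res$wrong_length
```

This check only counts rows and compares column names. It does not verify that the future timestamps continue the history at the right frequency, which would need a separate comparison of dates.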
3. Forecast with exogenous variables
To generate a forecast with exogenous variables, use the nixtla_client_forecast function as you would for forecasts without them. The only difference is that you must add the exogenous variables using the X_df argument.
Keep in mind that the default names for the time and target columns are ds and y, respectively. If your time and target columns have different names, specify them with time_col and target_col. Since this dataset has multiple ids (one for every electricity market), you will need to specify the name of the column that contains these ids, which in this case is unique_id. To do this, simply use id_col="unique_id".
library(nixtlar) # makes nixtla_client_forecast() available without the nixtlar:: prefix

fcst_exo_vars <- nixtla_client_forecast(df_exo_vars, h = 24, id_col = "unique_id", X_df = future_exo_vars)
#> Frequency chosen: H
head(fcst_exo_vars)
#>   unique_id                  ds  TimeGPT
#> 1        BE 2016-12-31 00:00:00 74.54077
#> 2        BE 2016-12-31 01:00:00 43.34429
#> 3        BE 2016-12-31 02:00:00 44.42922
#> 4        BE 2016-12-31 03:00:00 38.09440
#> 5        BE 2016-12-31 04:00:00 37.38914
#> 6        BE 2016-12-31 05:00:00 39.08574
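If the time and target columns are not called ds and y, the same call can carry the time_col and target_col arguments described above. The following sketch wraps such a call in a function. The wrapper name forecast_custom_cols and the column names timestamp and price are hypothetical, chosen only for illustration. The function is defined but not called here, because running it requires the nixtlar package and a configured API key.

```r
# Hypothetical wrapper: 'timestamp' and 'price' are made-up column names.
# Only the argument names (h, id_col, time_col, target_col, X_df) come from
# the text above.
forecast_custom_cols <- function(df, future_df, h = 24) {
  nixtlar::nixtla_client_forecast(
    df,
    h = h,
    id_col = "unique_id",
    time_col = "timestamp",  # name of the time column in df
    target_col = "price",    # name of the target column in df
    X_df = future_df         # future exogenous values for the next h steps
  )
}
```

An alternative is to rename the columns to ds and y before forecasting, which keeps the call itself shorter.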
For comparison, we will also generate a forecast without the exogenous variables.

df <- nixtlar::electricity # same dataset but without the exogenous variables

fcst <- nixtla_client_forecast(df, h = 24, id_col = "unique_id")
#> Frequency chosen: H
head(fcst)
#>   unique_id                  ds  TimeGPT
#> 1        BE 2016-12-31 00:00:00 45.19045
#> 2        BE 2016-12-31 01:00:00 43.24445
#> 3        BE 2016-12-31 02:00:00 41.95839
#> 4        BE 2016-12-31 03:00:00 39.79649
#> 5        BE 2016-12-31 04:00:00 39.20453
#> 6        BE 2016-12-31 05:00:00 40.10878
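To see the two forecasts side by side, they can be joined on the id and time columns. The sketch below does this in base R using only the six BE rows printed in the two outputs, so that it runs without an API call. With the full results, the same merge would be applied to fcst_exo_vars and fcst directly. The names with_exo, without_exo, and comparison are illustrative.

```r
# First six BE forecasts, copied from the printed outputs.
with_exo <- data.frame(
  unique_id = "BE",
  ds = sprintf("2016-12-31 %02d:00:00", 0:5),
  TimeGPT = c(74.54077, 43.34429, 44.42922, 38.09440, 37.38914, 39.08574)
)
without_exo <- data.frame(
  unique_id = "BE",
  ds = sprintf("2016-12-31 %02d:00:00", 0:5),
  TimeGPT = c(45.19045, 43.24445, 41.95839, 39.79649, 39.20453, 40.10878)
)

# Join on series id and timestamp; the suffixes tell the two forecasts apart.
comparison <- merge(with_exo, without_exo, by = c("unique_id", "ds"),
                    suffixes = c("_exo", "_no_exo"))
comparison$difference <- comparison$TimeGPT_exo - comparison$TimeGPT_no_exo
comparison
```

The difference column shows how much the exogenous variables shift each prediction. It says nothing about which forecast is more accurate, since that would require the actual values for the forecast period.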
4. Plot TimeGPT forecast
nixtlar includes a function to plot the historical data and any output from nixtla_client_forecast, nixtla_client_historic, nixtla_client_anomaly_detection, and nixtla_client_cross_validation. If you have long series, you can use max_insample_length to only plot the last N historical values (the forecast will always be plotted in full).
nixtla_client_plot(df_exo_vars, fcst_exo_vars, id_col = "unique_id", max_insample_length = 500)
#> Frequency chosen: H
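The effect of max_insample_length can be pictured as keeping only the last N observations of each series before plotting. The sketch below shows that idea in base R on toy data. keep_last_n is a hypothetical helper written for illustration, and nixtlar's internal implementation may differ.

```r
# Hypothetical helper illustrating the idea of trimming the history;
# it is not part of nixtlar.
keep_last_n <- function(df, n, id_col = "unique_id", time_col = "ds") {
  # Sort by series and time so that tail() returns the most recent rows.
  df <- df[order(df[[id_col]], df[[time_col]]), ]
  pieces <- lapply(split(df, df[[id_col]]), function(g) utils::tail(g, n))
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

# Toy data: series "A" has five observations, series "B" has three.
toy <- data.frame(
  unique_id = rep(c("A", "B"), times = c(5, 3)),
  ds = c(1:5, 1:3),
  y = c(10, 11, 12, 13, 14, 20, 21, 22)
)

trimmed <- keep_last_n(toy, n = 2)
trimmed
```

Trimming per series rather than over the whole data frame matters when the series have different lengths, as in the toy data.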