Forecasting with Chronos
Feng Li
Guanghua School of Management
Peking University
feng.li@gsm.pku.edu.cn
Course home page: https://feng.li/forecasting-with-ai
Chronos-2 Basics
Chronos-2 is a foundation model for time series forecasting that builds on Chronos and Chronos-Bolt. It extends the earlier models to forecasting scenarios they do not support, as the comparison below summarizes.
| Capability | Chronos | Chronos-Bolt | Chronos-2 |
|---|---|---|---|
| Univariate Forecasting | ✅ | ✅ | ✅ |
| Cross-learning across items | ❌ | ❌ | ✅ |
| Multivariate Forecasting | ❌ | ❌ | ✅ |
| Past-only (real/categorical) covariates | ❌ | ❌ | ✅ |
| Known future (real/categorical) covariates | 🧩 | 🧩 | ✅ |
| Fine-tuning support | ✅ | ✅ | ✅ |
| Max. Context Length | 512 | 2048 | 8192 |
🧩 Chronos and Chronos-Bolt do not natively support future covariates, but they can be combined with external covariate regressors (see AutoGluon tutorial). This workaround models only per-timestep effects of the covariates, not effects across time. In contrast, Chronos-2 supports all covariate types natively.
More details about Chronos-2 are available in the technical report.
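The "external covariate regressor" workaround mentioned in the footnote above can be sketched as a two-stage forecast. This is a minimal illustration, not AutoGluon's implementation: a least-squares fit stands in for the regressor, and repeating the last residual stands in for the univariate model (in practice Chronos or Chronos-Bolt). The function name and the toy data are assumptions made for this example.

```python
import numpy as np

def forecast_with_covariate_regressor(y_past, X_past, X_future):
    """Two-stage sketch: regress target on covariates, forecast residuals, recombine."""
    # Stage 1: per-timestep linear effect of the covariates (with intercept).
    A_past = np.column_stack([np.ones(len(X_past)), X_past])
    coef, *_ = np.linalg.lstsq(A_past, y_past, rcond=None)
    residuals = y_past - A_past @ coef

    # Stage 2: forecast the residuals with a univariate model.
    # Stand-in (assumption): repeat the last residual over the horizon.
    horizon = len(X_future)
    residual_forecast = np.full(horizon, residuals[-1])

    # Recombine: covariate effect on future values plus residual forecast.
    A_future = np.column_stack([np.ones(horizon), X_future])
    return A_future @ coef + residual_forecast

# Toy data (made up): a promotion flag lifts the target by 10 units.
promo_past = np.array([0, 1, 0, 1, 0, 1, 0, 1], dtype=float)
y_past = 100.0 + 10.0 * promo_past
promo_future = np.array([1, 0, 1], dtype=float)

forecast = forecast_with_covariate_regressor(
    y_past, promo_past.reshape(-1, 1), promo_future.reshape(-1, 1)
)
print(forecast)
```

Because the regressor sees each time step's covariates in isolation, it cannot capture lagged or carry-over effects, which is the limitation the footnote points out.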
# pip install -U "chronos-forecasting>=2.0" "pandas[pyarrow]" "matplotlib" --break-system-packages
from chronos import BaseChronosPipeline, Chronos2Pipeline
# https://huggingface.co/amazon/chronos-2
LOCAL_MODEL_DIR = "../data/chronos-2" # Your offline pretrained time series foundation model
pipeline: Chronos2Pipeline = BaseChronosPipeline.from_pretrained(
LOCAL_MODEL_DIR, device_map='cpu'
)
print("Loaded Chronos-2 from local dir on", 'cpu')
Loaded Chronos-2 from local dir on cpu
Univariate Forecasting¶
We start with a simple univariate forecasting example using the pandas API.
# Load data as a long-format pandas data frame
import pandas as pd
context_df = pd.read_csv("../data/m4_hourly_train.csv")
context_df
| | item_id | timestamp | target |
|---|---|---|---|
| 0 | H1 | 1750-01-01 00:00:00 | 605.0 |
| 1 | H1 | 1750-01-01 01:00:00 | 586.0 |
| 2 | H1 | 1750-01-01 02:00:00 | 586.0 |
| 3 | H1 | 1750-01-01 03:00:00 | 559.0 |
| 4 | H1 | 1750-01-01 04:00:00 | 511.0 |
| ... | ... | ... | ... |
| 353495 | H414 | 1750-02-09 19:00:00 | 48.0 |
| 353496 | H414 | 1750-02-09 20:00:00 | 41.0 |
| 353497 | H414 | 1750-02-09 21:00:00 | 35.0 |
| 353498 | H414 | 1750-02-09 22:00:00 | 26.0 |
| 353499 | H414 | 1750-02-09 23:00:00 | 17.0 |
353500 rows × 3 columns
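The table above is in long format: one row per series and time step, with the series identified by item_id. When data arrives in wide format instead (one column per series), it can be reshaped with pandas. The sketch below uses a made-up two-series frame; only the output column names follow the table above.

```python
import pandas as pd

# Hypothetical wide-format data: one column per series.
wide_df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2024-01-01 00:00", "2024-01-01 01:00", "2024-01-01 02:00"]
        ),
        "A": [1.0, 2.0, 3.0],
        "B": [10.0, 20.0, 30.0],
    }
)

# Reshape to long format: one row per (item_id, timestamp) pair.
long_df = (
    wide_df.melt(id_vars="timestamp", var_name="item_id", value_name="target")
    .sort_values(["item_id", "timestamp"])
    .reset_index(drop=True)[["item_id", "timestamp", "target"]]
)
print(long_df)
```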
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.backends.backend_pdf import PdfPages

def plot_multiple_items_random(context_df, pdf_path,
                               n_items=20,
                               timestamp_col="timestamp",
                               target_col="target",
                               recent_n=200):
    # ---- 1. Randomly select n_items item_ids (without replacement) ----
    unique_ids = context_df["item_id"].unique()
    selected_ids = np.random.choice(unique_ids, size=n_items, replace=False)
    print("Selected item_ids:", selected_ids)
    fig_size = (12, 4)
    with PdfPages(pdf_path) as pdf:
        for item_id in selected_ids:
            df = context_df[context_df["item_id"] == item_id].copy()
            df[timestamp_col] = pd.to_datetime(df[timestamp_col])
            # ---- Keep the recent_n most recent observations ----
            df_recent = df.tail(recent_n).reset_index(drop=True)
            x = df_recent[timestamp_col]
            y = df_recent[target_col]
            # 90% cutoff
            n = len(df_recent)
            cutoff_idx = int(n * 0.9)
            x_cut = x.iloc[cutoff_idx - 1]  # cutoff timestamp
            # =====================================================
            # A. Full recent window + vertical bar
            # =====================================================
            figA, axA = plt.subplots(figsize=fig_size)
            axA.plot(x, y, linewidth=1.2)
            # Dense grid
            axA.grid(which="major", linestyle="--", alpha=0.6)
            axA.grid(which="minor", linestyle=":", alpha=0.4)
            axA.minorticks_on()
            # ---- Vertical bar at 90% point ----
            axA.axvline(x_cut, color="red", linestyle="--", linewidth=1)
            axA.set_title(f"[{item_id}] Recent {n} Points (with 90% marker)")
            axA.set_xlabel("Timestamp")
            axA.set_ylabel("Target")
            figA.tight_layout()
            # Save axis limits for B
            xlim_A = axA.get_xlim()
            ylim_A = axA.get_ylim()
            pdf.savefig(figA)
            plt.close(figA)
            # =====================================================
            # B. First 90% + vertical bar
            # =====================================================
            figB, axB = plt.subplots(figsize=fig_size)
            axB.plot(x[:cutoff_idx], y[:cutoff_idx], linewidth=1.2)
            # Use A's limits so the right-hand region stays blank
            axB.set_xlim(xlim_A)
            axB.set_ylim(ylim_A)
            # Dense grid
            axB.grid(which="major", linestyle="--", alpha=0.6)
            axB.grid(which="minor", linestyle=":", alpha=0.4)
            axB.minorticks_on()
            # ---- Vertical bar at the end of B (same location as in A) ----
            axB.axvline(x_cut, color="red", linestyle="--", linewidth=1)
            axB.set_title(f"[{item_id}] Left 90% (with boundary marker)")
            axB.set_xlabel("Timestamp")
            axB.set_ylabel("Target")
            figB.tight_layout()
            pdf.savefig(figB)
            plt.close(figB)
    print(f"Saved combined PDF: {pdf_path}")

plot_multiple_items_random(
    context_df,
    pdf_path="random100_items.pdf",
    n_items=100
)
Selected item_ids: ['H162' 'H34' 'H359' 'H197' 'H101' 'H302' 'H141' 'H194' 'H254' 'H60' 'H398' 'H315' 'H11' 'H169' 'H260' 'H191' 'H215' 'H8' 'H30' 'H183' 'H310' 'H159' 'H106' 'H223' 'H410' 'H282' 'H333' 'H381' 'H125' 'H78' 'H85' 'H294' 'H190' 'H109' 'H127' 'H387' 'H345' 'H251' 'H384' 'H372' 'H357' 'H314' 'H116' 'H94' 'H369' 'H15' 'H403' 'H173' 'H1' 'H92' 'H168' 'H108' 'H213' 'H279' 'H334' 'H44' 'H208' 'H241' 'H319' 'H328' 'H349' 'H93' 'H367' 'H386' 'H389' 'H375' 'H67' 'H54' 'H91' 'H63' 'H210' 'H76' 'H204' 'H336' 'H110' 'H27' 'H160' 'H198' 'H229' 'H317' 'H231' 'H339' 'H330' 'H340' 'H269' 'H259' 'H181' 'H118' 'H117' 'H171' 'H301' 'H298' 'H200' 'H355' 'H29' 'H248' 'H66' 'H138' 'H61' 'H385']
Saved combined PDF: random100_items.pdf
import matplotlib.pyplot as plt
# Count number of unique time series and number of observations per item_id
n_unique_series = context_df['item_id'].nunique()
lengths = context_df.groupby('item_id')['timestamp'].count()
# Plot the distribution
plt.figure(figsize=(6, 4))
plt.hist(lengths)
plt.title(f'Time Series Lengths of Total {n_unique_series} Series')
plt.xlabel('Number of Observations per Time Series')
plt.ylabel('Counts')
plt.grid(alpha=0.3)
plt.show()
pred_df = pipeline.predict_df(context_df, id_column="item_id", target="target", timestamp_column="timestamp",
prediction_length=24, quantile_levels=[0.1, 0.9])
pred_df
| | item_id | timestamp | target_name | predictions | 0.1 | 0.9 |
|---|---|---|---|---|---|---|
| 0 | H1 | 1750-01-30 04:00:00 | target | 624.867920 | 611.385071 | 638.598694 |
| 1 | H1 | 1750-01-30 05:00:00 | target | 563.703125 | 546.655029 | 578.665649 |
| 2 | H1 | 1750-01-30 06:00:00 | target | 521.589905 | 505.747437 | 537.950806 |
| 3 | H1 | 1750-01-30 07:00:00 | target | 489.910706 | 473.671814 | 508.854126 |
| 4 | H1 | 1750-01-30 08:00:00 | target | 471.144501 | 452.199371 | 491.050354 |
| ... | ... | ... | ... | ... | ... | ... |
| 9931 | H414 | 1750-02-10 19:00:00 | target | 61.697693 | 49.787407 | 75.447968 |
| 9932 | H414 | 1750-02-10 20:00:00 | target | 52.210609 | 41.923550 | 65.115601 |
| 9933 | H414 | 1750-02-10 21:00:00 | target | 46.259827 | 36.681183 | 55.795212 |
| 9934 | H414 | 1750-02-10 22:00:00 | target | 33.600002 | 26.682114 | 41.483673 |
| 9935 | H414 | 1750-02-10 23:00:00 | target | 22.696373 | 17.629684 | 26.969643 |
9936 rows × 6 columns
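The size of the output follows from the inputs: 414 series times 24 forecast steps gives 9,936 rows. The columns 0.1 and 0.9 are quantile forecasts, so together they bound a central interval with 80% nominal coverage. The sketch below reproduces that arithmetic using the first two rows of the table above; the small frame is retyped by hand for illustration and is not produced by the pipeline.

```python
import pandas as pd

# Expected number of forecast rows: one per series and forecast step.
n_series, prediction_length = 414, 24
expected_rows = n_series * prediction_length
print(expected_rows)  # 9936

# First two forecast rows for H1, copied from the table above.
sample = pd.DataFrame(
    {
        "item_id": ["H1", "H1"],
        "predictions": [624.867920, 563.703125],
        "0.1": [611.385071, 546.655029],
        "0.9": [638.598694, 578.665649],
    }
)

# Width of the interval between the 10% and 90% quantile forecasts.
sample["interval_width"] = sample["0.9"] - sample["0.1"]

# Nominal coverage of that interval.
nominal_coverage = 0.9 - 0.1
print(sample[["item_id", "predictions", "interval_width"]])
```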
Retail Demand Forecasting
Forecast next quarter's weekly store sales using historical sales, historical customer footfall (Customers), and known covariates indicating store operation (Open), promotion periods (Promo), and holidays (SchoolHoliday, StateHoliday).
import matplotlib.pyplot as plt

# Visualization helper function
def plot_forecast(
    context_df: pd.DataFrame,
    pred_df: pd.DataFrame,
    test_df: pd.DataFrame,
    target_column: str,
    timeseries_id: str,
    id_column: str = "id",
    timestamp_column: str = "timestamp",
    history_length: int = 256,
    title_suffix: str = "",
):
    # The id column holds integers, so convert the string ID before filtering
    timeseries_id = int(timeseries_id)
    # Note: the timestamp columns of the input frames are converted in place
    context_df[timestamp_column] = pd.to_datetime(context_df[timestamp_column])
    pred_df[timestamp_column] = pd.to_datetime(pred_df[timestamp_column])
    test_df[timestamp_column] = pd.to_datetime(test_df[timestamp_column])
    ts_context = context_df.query(f"{id_column} == @timeseries_id").set_index(timestamp_column)[target_column]
    ts_pred = pred_df.query(f"{id_column} == @timeseries_id and target_name == @target_column").set_index(
        timestamp_column
    )[["0.1", "predictions", "0.9"]]
    ts_ground_truth = test_df.query(f"{id_column} == @timeseries_id").set_index(timestamp_column)[target_column]
    last_date = ts_context.index.max()
    # Show at most history_length past observations
    start_idx = max(0, len(ts_context) - history_length)
    print(start_idx)
    plot_cutoff = ts_context.index[start_idx]
    ts_context = ts_context[ts_context.index >= plot_cutoff]
    ts_pred = ts_pred[ts_pred.index >= plot_cutoff]
    ts_ground_truth = ts_ground_truth[ts_ground_truth.index >= plot_cutoff]
    fig = plt.figure(figsize=(12, 3))
    ax = fig.gca()
    ts_context.plot(ax=ax, label=f"historical {target_column}", color="xkcd:azure")
    ts_ground_truth.plot(ax=ax, label=f"future {target_column} (ground truth)", color="xkcd:grass green")
    ts_pred["predictions"].plot(ax=ax, label="forecast", color="xkcd:violet")
    ax.fill_between(
        ts_pred.index,
        ts_pred["0.1"],
        ts_pred["0.9"],
        alpha=0.7,
        label="prediction interval",
        color="xkcd:light lavender",
    )
    # Mark the boundary between history and forecast
    ax.axvline(x=last_date, color="black", linestyle="--", alpha=0.5)
    ax.legend(loc="upper left")
    ax.set_title(f"{target_column} forecast for {timeseries_id} {title_suffix}")
    fig.show()
# Retail forecasting configuration
target = "Sales" # Column name containing sales values to forecast
prediction_length = 13 # Number of weeks to forecast ahead (one quarter of weekly data)
id_column = "id" # Column identifying different products/stores
timestamp_column = "timestamp" # Column containing datetime information
timeseries_id = "1" # Specific time series to visualize (product/store ID)
# Load historical sales and past values of covariates
# sales_context_df = pd.read_parquet("../data/retail_sales_train.parquet")
sales_context_df = pd.read_csv("../data/retail_sales_train.csv")
sales_context_df
| | id | timestamp | Sales | Open | Promo | SchoolHoliday | StateHoliday | Customers |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 2013-01-13 | 32952.0 | 0.857143 | 0.714286 | 5.0 | 0.0 | 3918.0 |
| 1 | 1 | 2013-01-20 | 25978.0 | 0.857143 | 0.000000 | 0.0 | 0.0 | 3417.0 |
| 2 | 1 | 2013-01-27 | 33071.0 | 0.857143 | 0.714286 | 0.0 | 0.0 | 3862.0 |
| 3 | 1 | 2013-02-03 | 28693.0 | 0.857143 | 0.000000 | 0.0 | 0.0 | 3561.0 |
| 4 | 1 | 2013-02-10 | 35771.0 | 0.857143 | 0.714286 | 0.0 | 0.0 | 4094.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 133795 | 999 | 2015-03-29 | 43358.0 | 0.857143 | 0.000000 | 0.0 | 0.0 | 3252.0 |
| 133796 | 999 | 2015-04-05 | 69663.0 | 0.714286 | 0.714286 | 2.0 | 1.0 | 4424.0 |
| 133797 | 999 | 2015-04-12 | 35267.0 | 0.714286 | 0.000000 | 5.0 | 1.0 | 2771.0 |
| 133798 | 999 | 2015-04-19 | 63849.0 | 0.857143 | 0.714286 | 0.0 | 0.0 | 4230.0 |
| 133799 | 999 | 2015-04-26 | 39247.0 | 0.857143 | 0.000000 | 0.0 | 0.0 | 3027.0 |
133800 rows × 8 columns
# Load future values of covariates
# sales_test_df = pd.read_parquet("../data/retail_sales_test.parquet")
sales_test_df = pd.read_csv("../data/retail_sales_test.csv")
sales_future_df = sales_test_df.drop(columns=target)
sales_future_df
| | id | timestamp | Open | Promo | SchoolHoliday | StateHoliday |
|---|---|---|---|---|---|---|
| 0 | 1 | 2015-05-03 | 0.714286 | 0.714286 | 0.0 | 1.0 |
| 1 | 1 | 2015-05-10 | 0.857143 | 0.714286 | 0.0 | 0.0 |
| 2 | 1 | 2015-05-17 | 0.714286 | 0.000000 | 0.0 | 1.0 |
| 3 | 1 | 2015-05-24 | 0.857143 | 0.714286 | 0.0 | 0.0 |
| 4 | 1 | 2015-05-31 | 0.714286 | 0.000000 | 0.0 | 1.0 |
| ... | ... | ... | ... | ... | ... | ... |
| 14490 | 999 | 2015-06-28 | 0.857143 | 0.000000 | 0.0 | 0.0 |
| 14491 | 999 | 2015-07-05 | 0.857143 | 0.714286 | 0.0 | 0.0 |
| 14492 | 999 | 2015-07-12 | 0.857143 | 0.000000 | 0.0 | 0.0 |
| 14493 | 999 | 2015-07-19 | 0.857143 | 0.714286 | 5.0 | 0.0 |
| 14494 | 999 | 2015-07-26 | 0.857143 | 0.000000 | 5.0 | 0.0 |
14495 rows × 6 columns
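The two tables above imply a role for each column: columns present in both the context frame and the future frame (Open, Promo, SchoolHoliday, StateHoliday) act as known-future covariates, while a column present only in the context frame (Customers) is a past-only covariate. The sketch below derives this split from the column names alone. It illustrates the convention and does not call Chronos; the helper name `split_covariates` is hypothetical.

```python
def split_covariates(context_columns, future_columns, id_column, timestamp_column, target):
    """Classify covariate columns by whether their future values are available."""
    reserved = {id_column, timestamp_column, target}
    future_set = set(future_columns)
    covariates = [c for c in context_columns if c not in reserved]
    known_future = [c for c in covariates if c in future_set]
    past_only = [c for c in covariates if c not in future_set]
    return known_future, past_only

# Column names copied from the two tables above.
context_columns = ["id", "timestamp", "Sales", "Open", "Promo",
                   "SchoolHoliday", "StateHoliday", "Customers"]
future_columns = ["id", "timestamp", "Open", "Promo", "SchoolHoliday", "StateHoliday"]

known_future, past_only = split_covariates(
    context_columns, future_columns, "id", "timestamp", "Sales"
)
print(known_future)  # ['Open', 'Promo', 'SchoolHoliday', 'StateHoliday']
print(past_only)     # ['Customers']
```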
# forecast without covariates
sales_pred_no_cov_df = pipeline.predict_df(
sales_context_df[[id_column, timestamp_column, target]],
future_df=None,
prediction_length=prediction_length,
quantile_levels=[0.1, 0.9],
id_column=id_column,
timestamp_column=timestamp_column,
target=target,
)
sales_pred_no_cov_df
| | id | timestamp | target_name | predictions | 0.1 | 0.9 |
|---|---|---|---|---|---|---|
| 0 | 1 | 2015-05-03 | Sales | 28868.748047 | 24831.210938 | 32377.041016 |
| 1 | 1 | 2015-05-10 | Sales | 23496.941406 | 20528.724609 | 28279.544922 |
| 2 | 1 | 2015-05-17 | Sales | 28338.820312 | 23889.726562 | 32065.578125 |
| 3 | 1 | 2015-05-24 | Sales | 24089.554688 | 20643.382812 | 29365.500000 |
| 4 | 1 | 2015-05-31 | Sales | 27662.677734 | 22583.529297 | 31700.175781 |
| ... | ... | ... | ... | ... | ... | ... |
| 14490 | 999 | 2015-06-28 | Sales | 53684.523438 | 38193.703125 | 66396.875000 |
| 14491 | 999 | 2015-07-05 | Sales | 56915.792969 | 38879.417969 | 66857.273438 |
| 14492 | 999 | 2015-07-12 | Sales | 53435.722656 | 37711.125000 | 66004.101562 |
| 14493 | 999 | 2015-07-19 | Sales | 56994.824219 | 38699.746094 | 66507.718750 |
| 14494 | 999 | 2015-07-26 | Sales | 51786.867188 | 37531.007812 | 65674.789062 |
14495 rows × 6 columns
# Generate predictions with covariates
sales_pred_df = pipeline.predict_df(
sales_context_df,
future_df=sales_future_df,
prediction_length=prediction_length,
quantile_levels=[0.1, 0.9],
id_column=id_column,
timestamp_column=timestamp_column,
target=target,
)
sales_pred_df
| | id | timestamp | target_name | predictions | 0.1 | 0.9 |
|---|---|---|---|---|---|---|
| 0 | 1 | 2015-05-03 | Sales | 28939.392578 | 25214.275391 | 32411.097656 |
| 1 | 1 | 2015-05-10 | Sales | 25541.919922 | 21921.328125 | 29191.929688 |
| 2 | 1 | 2015-05-17 | Sales | 23640.240234 | 20500.343750 | 26884.666016 |
| 3 | 1 | 2015-05-24 | Sales | 26778.261719 | 23318.353516 | 30162.820312 |
| 4 | 1 | 2015-05-31 | Sales | 22679.357422 | 19722.281250 | 25990.042969 |
| ... | ... | ... | ... | ... | ... | ... |
| 14490 | 999 | 2015-06-28 | Sales | 40080.484375 | 34807.414062 | 47214.660156 |
| 14491 | 999 | 2015-07-05 | Sales | 68556.195312 | 61109.796875 | 75537.335938 |
| 14492 | 999 | 2015-07-12 | Sales | 40855.218750 | 35225.363281 | 48365.933594 |
| 14493 | 999 | 2015-07-19 | Sales | 66134.984375 | 59100.578125 | 73635.484375 |
| 14494 | 999 | 2015-07-26 | Sales | 39742.546875 | 34394.359375 | 46101.925781 |
14495 rows × 6 columns
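To compare the two forecasts numerically rather than only visually, one option is to join each prediction frame to the held-out test frame and compute an error metric. The sketch below uses the mean absolute error and the empirical coverage of the 10% to 90% interval. The helper name `evaluate_forecast` and the toy numbers are assumptions; the notebook itself does not report these metrics.

```python
import pandas as pd

def evaluate_forecast(pred_df, test_df, target, id_column="id", timestamp_column="timestamp"):
    """Join forecasts to ground truth and return MAE and interval coverage."""
    merged = pred_df.merge(
        test_df[[id_column, timestamp_column, target]],
        on=[id_column, timestamp_column],
        how="inner",
    )
    abs_error = (merged["predictions"] - merged[target]).abs()
    # Share of true values that fall inside the [0.1, 0.9] quantile interval.
    inside = merged[target].between(merged["0.1"], merged["0.9"])
    return {"mae": float(abs_error.mean()), "coverage": float(inside.mean())}

# Toy frames (made-up values) with the same columns as the tables above.
toy_pred = pd.DataFrame(
    {
        "id": [1, 1],
        "timestamp": ["2015-05-03", "2015-05-10"],
        "target_name": ["Sales", "Sales"],
        "predictions": [100.0, 200.0],
        "0.1": [90.0, 180.0],
        "0.9": [110.0, 220.0],
    }
)
toy_test = pd.DataFrame(
    {
        "id": [1, 1],
        "timestamp": ["2015-05-03", "2015-05-10"],
        "Sales": [105.0, 230.0],
    }
)

metrics = evaluate_forecast(toy_pred, toy_test, target="Sales")
print(metrics)  # {'mae': 17.5, 'coverage': 0.5}
```

With the real frames, the same helper could be applied once to sales_pred_no_cov_df and once to sales_pred_df, each against sales_test_df. The timestamp columns would need a common dtype before merging, for example via pd.to_datetime.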
plot_forecast(
sales_context_df,
sales_pred_no_cov_df,
sales_test_df,
target_column=target,
timeseries_id=timeseries_id,
title_suffix="(without covariates)",
)
0
# Visualize forecast with covariates
plot_forecast(
sales_context_df,
sales_pred_df,
sales_test_df,
target_column=target,
timeseries_id=timeseries_id,
title_suffix="(with covariates)",
)
0