A Coding Guide to Build Flexible Multi-Model Workflows in GluonTS with Synthetic Data, Evaluation, and Advanced Visualizations
This tutorial explores GluonTS from a practical perspective, focusing on generating complex synthetic datasets, preparing them, and applying multiple models in parallel. We emphasize how to work with diverse estimators in the same pipeline, handle missing dependencies gracefully, and produce usable results. By incorporating evaluation and visualization steps, we create a workflow that highlights how models can be trained, compared, and interpreted in a seamless process.
Importing Required Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime, timedelta
import warnings
warnings.filterwarnings('ignore')

from gluonts.dataset.pandas import PandasDataset
from gluonts.dataset.split import split
from gluonts.evaluation import make_evaluation_predictions, Evaluator
from gluonts.dataset.artificial import ComplexSeasonalTimeSeries

try:
    from gluonts.torch import DeepAREstimator
    TORCH_AVAILABLE = True
except ImportError:
    TORCH_AVAILABLE = False

try:
    from gluonts.mx import DeepAREstimator as MXDeepAREstimator
    from gluonts.mx import SimpleFeedForwardEstimator
    MX_AVAILABLE = True
except ImportError:
    MX_AVAILABLE = False
We begin by importing the core libraries for data handling, visualization, and GluonTS utilities. We also set up conditional imports for PyTorch and MXNet estimators, allowing us to flexibly use whichever backend is available in our environment.
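Downstream steps can branch on these availability flags. As a minimal sketch (assuming it runs right after the imports above), we can report which backends were detected:

# Report which estimator backends loaded, so later steps can branch on them.
available_backends = {'PyTorch': TORCH_AVAILABLE, 'MXNet': MX_AVAILABLE}
for backend, ok in available_backends.items():
    print(f"{backend} backend: {'available' if ok else 'not installed'}")
if not any(available_backends.values()):
    print("No neural backends found; the fallback path below will be used.")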
Creating Synthetic Datasets
def create_synthetic_dataset(num_series=50, length=365, prediction_length=30):
    """Generate synthetic multi-variate time series with trends, seasonality, and noise"""
    np.random.seed(42)  # fixed seed for reproducible data
    series_list = []
    for i in range(num_series):
        # Random-walk trend with a slightly different drift per series
        trend = np.cumsum(np.random.normal(0.1 + i * 0.01, 0.1, length))
        # Weekly cycle (period 7) and yearly cycle (period 365.25)
        weekly_season = 10 * np.sin(2 * np.pi * np.arange(length) / 7)
        yearly_season = 20 * np.sin(2 * np.pi * np.arange(length) / 365.25)
        noise = np.random.normal(0, 5, length)
        # Offset by 100 and clip at 1 to keep all values positive
        values = np.maximum(trend + weekly_season + yearly_season + noise + 100, 1)
        dates = pd.date_range(start='2020-01-01', periods=length, freq='D')
        series_list.append(pd.Series(values, index=dates, name=f'series_{i}'))
    return pd.concat(series_list, axis=1)
We create a synthetic dataset where each series combines a drifting trend, weekly and yearly seasonality, and Gaussian noise. The fixed random seed makes the output reproducible on every run, and the function returns a clean multi-series DataFrame ready for experimentation.
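As a quick sanity check, a sketch like the following previews a few generated series before committing to the full run (the plot call assumes matplotlib is available, as imported above):

# Generate a small sample and inspect its shape and values.
sample_df = create_synthetic_dataset(num_series=3, length=100)
print(sample_df.shape)   # (100, 3): 100 daily observations for 3 series
print(sample_df.head())
sample_df.plot(figsize=(10, 4), title='Sample synthetic series')
plt.show()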
Initializing Forecasting Models
print("Creating synthetic multi-series dataset...") df = create_synthetic_dataset(num_series=10, length=200, prediction_length=30) dataset = PandasDataset(df, target=df.columns.tolist()) training_data, test_gen = split(dataset, offset=-60) test_data = test_gen.generate_instances(prediction_length=30) print("Initializing forecasting models...") models = {} if TORCH_AVAILABLE: try: models['DeepAR_Torch'] = DeepAREstimator( freq='D', prediction_length=30 ) print("PyTorch DeepAR loaded") except Exception as e: print(f"PyTorch DeepAR failed to load: {e}") if MX_AVAILABLE: try: models['DeepAR_MX'] = MXDeepAREstimator( freq='D', prediction_length=30, trainer=dict(epochs=5) ) print("MXNet DeepAR loaded") except Exception as e: print(f"MXNet DeepAR failed to load: {e}") try: models['FeedForward'] = SimpleFeedForwardEstimator( freq='D', prediction_length=30, trainer=dict(epochs=5) ) print("FeedForward model loaded") except Exception as e: print(f"FeedForward failed to load: {e}") if not models: print("Using artificial dataset with built-in models...") artificial_ds = ComplexSeasonalTimeSeries( num_series=10, prediction_length=30, freq='D', length_low=150, length_high=200 ).generate() training_data, test_gen = split(artificial_ds, offset=-60) test_data = test_gen.generate_instances(prediction_length=30)
We generate a 10-series dataset, wrap it into a GluonTS PandasDataset, and split it into training and test windows. We then initialize multiple estimators (PyTorch DeepAR, MXNet DeepAR, and FeedForward) when available, and fall back to a built-in artificial dataset if no backends load.
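To make the split semantics concrete, here is an illustrative check (a sketch for the synthetic-data path, not part of the original pipeline): offset=-60 ends each training series 60 steps before its end, leaving room for the 30-step evaluation windows.

# Inspect one training entry to confirm the holdout: with length=200 and
# offset=-60, each training target should contain 140 observations.
first_entry = next(iter(training_data))
print("Training target length:", len(first_entry['target']))  # expect 140
print("Series start:", first_entry['start'])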
Training Models and Evaluating Performance
trained_models = {}
all_forecasts = {}

if models:
    for name, estimator in models.items():
        print(f"Training {name} model...")
        try:
            predictor = estimator.train(training_data)
            trained_models[name] = predictor
            forecasts = list(predictor.predict(test_data.input))
            all_forecasts[name] = forecasts
            print(f"{name} training completed!")
        except Exception as e:
            print(f"{name} training failed: {e}")
            continue

print("Evaluating model performance...")
evaluator = Evaluator(quantiles=[0.1, 0.5, 0.9])
evaluation_results = {}

for name, forecasts in all_forecasts.items():
    if forecasts:
        try:
            agg_metrics, item_metrics = evaluator(test_data.label, forecasts)
            evaluation_results[name] = agg_metrics
            print(f"\n{name} Performance:")
            print(f"  MASE: {agg_metrics['MASE']:.4f}")
            print(f"  sMAPE: {agg_metrics['sMAPE']:.4f}")
            print(f"  Mean wQuantileLoss: {agg_metrics['mean_wQuantileLoss']:.4f}")
        except Exception as e:
            print(f"Evaluation failed for {name}: {e}")
We train each available estimator, collect probabilistic forecasts, and store the fitted predictors for reuse. We then evaluate results using MASE, sMAPE, and weighted quantile loss, providing a consistent, comparative view of model performance.
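Once evaluation_results is populated, a short follow-up sketch can rank the models, for example by MASE (lower is better):

# Rank evaluated models by MASE; assumes the evaluation loop above ran.
if evaluation_results:
    ranking = sorted(evaluation_results.items(), key=lambda kv: kv[1]['MASE'])
    for rank, (name, metrics) in enumerate(ranking, start=1):
        print(f"{rank}. {name}: MASE={metrics['MASE']:.4f}")
else:
    print("No models were successfully evaluated.")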
Advanced Visualizations of Forecasts
def plot_advanced_forecasts(test_data, forecasts_dict, series_idx=0):
    """Advanced plotting with multiple models and uncertainty bands"""
    fig, axes = plt.subplots(2, 2, figsize=(15, 10))
    fig.suptitle('Advanced GluonTS Forecasting Results', fontsize=16, fontweight='bold')

    if not forecasts_dict:
        fig.text(0.5, 0.5, 'No successful forecasts to display',
                 ha='center', va='center', fontsize=20)
        return fig

    if series_idx < len(test_data.label):
        ts_label = test_data.label[series_idx]
        ts_input = test_data.input[series_idx]['target']
        colors = ['blue', 'red', 'green', 'purple', 'orange']

        ax1 = axes[0, 0]
        ax1.plot(range(len(ts_input)), ts_input, 'k-',
                 label='Historical', alpha=0.8, linewidth=2)
        ax1.plot(range(len(ts_input), len(ts_input) + len(ts_label)), ts_label,
                 'k--', label='True Future', alpha=0.8, linewidth=2)

        for i, (name, forecasts) in enumerate(forecasts_dict.items()):
            if series_idx < len(forecasts):
                forecast = forecasts[series_idx]
                forecast_range = range(len(ts_input), len(ts_input) + len(forecast.mean))
                color = colors[i % len(colors)]
                ax1.plot(forecast_range, forecast.mean, color=color,
                         label=f'{name} Mean', linewidth=2)
                try:
                    ax1.fill_between(forecast_range,
                                     forecast.quantile(0.1), forecast.quantile(0.9),
                                     alpha=0.2, color=color, label=f'{name} 80% CI')
                except:
                    pass

        ax1.set_title('Multi-Model Forecasts Comparison', fontsize=12, fontweight='bold')
        ax1.legend()
        ax1.grid(True, alpha=0.3)
        ax1.set_xlabel('Time Steps')
        ax1.set_ylabel('Value')

        ax2 = axes[0, 1]
        if all_forecasts:
            first_model = list(all_forecasts.keys())[0]
            if series_idx < len(all_forecasts[first_model]):
                forecast = all_forecasts[first_model][series_idx]
                ax2.scatter(ts_label, forecast.mean, alpha=0.7, s=60)
                min_val = min(min(ts_label), min(forecast.mean))
                max_val = max(max(ts_label), max(forecast.mean))
                ax2.plot([min_val, max_val], [min_val, max_val], 'r--', alpha=0.8)
                ax2.set_title(f'Prediction vs Actual - {first_model}',
                              fontsize=12, fontweight='bold')
                ax2.set_xlabel('Actual Values')
                ax2.set_ylabel('Predicted Values')
                ax2.grid(True, alpha=0.3)

        ax3 = axes[1, 0]
        if all_forecasts:
            first_model = list(all_forecasts.keys())[0]
            if series_idx < len(all_forecasts[first_model]):
                forecast = all_forecasts[first_model][series_idx]
                residuals = ts_label - forecast.mean
                ax3.hist(residuals, bins=15, alpha=0.7, color='skyblue',
                         edgecolor='black')
                ax3.axvline(x=0, color='r', linestyle='--', linewidth=2)
                ax3.set_title(f'Residuals Distribution - {first_model}',
                              fontsize=12, fontweight='bold')
                ax3.set_xlabel('Residuals')
                ax3.set_ylabel('Frequency')
                ax3.grid(True, alpha=0.3)

        ax4 = axes[1, 1]
        if evaluation_results:
            metrics = ['MASE', 'sMAPE']
            model_names = list(evaluation_results.keys())
            x = np.arange(len(metrics))
            width = 0.35

            for i, model_name in enumerate(model_names):
                values = [evaluation_results[model_name].get(metric, 0)
                          for metric in metrics]
                ax4.bar(x + i * width, values, width, label=model_name,
                        color=colors[i % len(colors)], alpha=0.8)

            ax4.set_title('Model Performance Comparison', fontsize=12, fontweight='bold')
            ax4.set_xlabel('Metrics')
            ax4.set_ylabel('Value')
            ax4.set_xticks(x + width / 2 if len(model_names) > 1 else x)
            ax4.set_xticklabels(metrics)
            ax4.legend()
            ax4.grid(True, alpha=0.3)
        else:
            ax4.text(0.5, 0.5, 'No evaluation\nresults available',
                     ha='center', va='center', transform=ax4.transAxes, fontsize=14)

    plt.tight_layout()
    return fig
With forecasts and test data in hand, we build a four-panel figure to compare results: mean forecasts with uncertainty bands for every model, predicted versus actual values, the residuals distribution, and a bar chart of evaluation metrics.
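A usage sketch for the function above (assuming test_data and all_forecasts exist from the earlier steps; the output filename is illustrative):

# Render the four-panel figure for the first series and save it to disk.
fig = plot_advanced_forecasts(test_data, all_forecasts, series_idx=0)
plt.savefig('gluonts_forecast_comparison.png', dpi=150, bbox_inches='tight')
plt.show()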
Conclusion
In conclusion, we have established a robust setup that balances data creation, model experimentation, and performance analysis. This guide demonstrates how to adapt flexibly, test multiple options, and visualize results in a way that makes comparison intuitive. This foundation allows for experimentation with GluonTS and application of the same principles to real datasets while keeping the process modular and easy to extend.