Nixtla unveiled StatsForecast 1.7.5, a significant update bringing new features and enhancements that further solidify its position as a leading tool for univariate time series forecasting. This release introduces the innovative MFLES model and a convenient wrapper for scikit-learn models, allowing users to leverage exogenous features easily.
One of the standout features of this release is the MFLES model, contributed by Tyler Blume. The model is notable for its performance, speed, and versatility: it supports exogenous features and handles multiple seasonalities. MFLES is based on gradient-boosted time series decomposition, which uses a traditional decomposition as the base estimator in the boosting process. Its name comes from the underlying estimators: Median, Fourier terms, Linear trend, and Exponential Smoothing. This combination is intended to give robust and accurate forecasts, making MFLES a valuable addition to the StatsForecast model collection.
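To make the idea of boosting over a decomposition more concrete, here is a minimal NumPy sketch. It is not the MFLES implementation from StatsForecast; the function names (`fourier_terms`, `exp_smooth`, `boosted_decomposition`), the learning rate, and the number of rounds are all assumptions chosen for illustration. It starts from the median, then repeatedly fits a linear trend and Fourier seasonality to whatever residual is left, and finally smooths the remaining residual exponentially.

```python
import numpy as np

def fourier_terms(t, period, order):
    """Sine/cosine pairs for harmonics 1..order of the given period."""
    cols = []
    for k in range(1, order + 1):
        cols.append(np.sin(2 * np.pi * k * t / period))
        cols.append(np.cos(2 * np.pi * k * t / period))
    return np.column_stack(cols)

def exp_smooth(x, alpha):
    """Simple exponential smoothing of a 1-D array."""
    out = np.empty_like(x)
    out[0] = x[0]
    for i in range(1, len(x)):
        out[i] = alpha * x[i] + (1 - alpha) * out[i - 1]
    return out

def boosted_decomposition(y, period, order=2, rounds=10, lr=0.9, alpha=0.3):
    """Illustrative boosting loop: each round fits trend, then seasonality,
    to the current residual and adds a damped (lr) share of each fit."""
    t = np.arange(len(y), dtype=float)
    trend_X = np.column_stack([np.ones_like(t), t])  # intercept + slope
    seas_X = fourier_terms(t, period, order)
    level = np.median(y)                             # "M": median baseline
    trend = np.zeros(len(y))
    seasonal = np.zeros(len(y))
    residual = y - level
    for _ in range(rounds):
        # "L": linear trend fitted to the residual by least squares
        coef, *_ = np.linalg.lstsq(trend_X, residual, rcond=None)
        step = lr * (trend_X @ coef)
        trend += step
        residual = residual - step
        # "F": Fourier seasonality fitted to what the trend left behind
        coef, *_ = np.linalg.lstsq(seas_X, residual, rcond=None)
        step = lr * (seas_X @ coef)
        seasonal += step
        residual = residual - step
    # "ES": exponential smoothing of the leftover residual
    smoothed = exp_smooth(residual, alpha)
    return level, trend, seasonal, residual, smoothed

# Synthetic monthly-style series: linear trend plus one yearly harmonic
t = np.arange(120, dtype=float)
y = 10.0 + 0.5 * t + 3.0 * np.sin(2 * np.pi * t / 12)
level, trend, seasonal, residual, smoothed = boosted_decomposition(y, period=12)
print("largest absolute residual:", np.abs(residual).max())
```

Because each component only ever sees the residual left by the previous ones, the components always add back up to the original series, and the damped steps let trend and seasonality correct each other across rounds.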
The new release also includes a wrapper for scikit-learn models, enabling users to utilize the rich feature engineering capabilities of scikit-learn in their time series forecasting tasks. The `statsforecast.models.SklearnModel` wrapper allows training one model per series, which can be more effective than a single global model in certain scenarios. This integration offers flexibility and enhances the modeling power of StatsForecast, making it easier to incorporate external variables like weather or prices into forecasting models.
StatsForecast addresses the limitations of existing Python alternatives for statistical models, which are often slow, inaccurate, and not scalable. Designed for high performance and scalability, StatsForecast can efficiently fit millions of time series, making it suitable for production environments and benchmarking purposes.
Key features and performance of StatsForecast 1.7.5 include:
- Automatic Forecasting: StatsForecast includes automatic tools like AutoARIMA, AutoETS, AutoCES, and AutoTheta, which search for the best parameters and models for a group of time series. These tools are optimized for performance, ensuring fast and accurate results.
- Model Variety: From ARIMA and Theta families to models for multiple seasonalities and GARCH/ARCH models, StatsForecast covers a wide range of forecasting needs.
- Speed and Efficiency: The library is 20x faster than pmdarima, 1.5x faster than R, and significantly faster than other popular tools like Prophet and statsmodels. It achieves this by using numba to compile high-performance machine code.
- Compatibility and Integration: Out-of-the-box compatibility with Spark, Dask, and Ray allows seamless integration into various data processing pipelines. The library also supports probabilistic forecasting, confidence intervals, anomaly detection, and exogenous variables.
- User-Friendly Syntax: With familiar sklearn-like syntax, StatsForecast offers an intuitive interface for fitting and predicting time series models, making it accessible to users of all levels.
Installing StatsForecast is straightforward. It can be installed using pip or conda:
```shell
# with pip
pip install statsforecast
# or with conda
conda install -c conda-forge statsforecast
```
For a quick start, the following example demonstrates fitting and predicting with the AutoARIMA model:
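```python
from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA
from statsforecast.utils import AirPassengersDF

# Monthly airline passenger counts with columns unique_id, ds, y
df = AirPassengersDF
# season_length=12 models yearly seasonality in monthly data
sf = StatsForecast(models=[AutoARIMA(season_length=12)], freq='M')
sf.fit(df)
# Forecast 12 months ahead with a 95% prediction interval
sf.predict(h=12, level=[95])
```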
Examples and guides for StatsForecast:
- End-to-end Walkthrough: Model training, evaluation, and selection for multiple time series.
- Anomaly Detection: Detect anomalies in time series using in-sample prediction intervals.
- Cross Validation: Robust performance evaluation of models.
- Multiple Seasonalities: Forecast data with multiple seasonalities using an MSTL.
- Predict Demand Peaks: Electricity load forecasting for detecting daily peaks and reducing electric bills.
- Intermittent Demand: Forecast series with very few non-zero observations.
- Exogenous Regressors: Utilize external variables like weather or prices in forecasting models.
In conclusion, StatsForecast 1.7.5 is a substantial update for univariate time series forecasting, offering speed, accuracy, and flexibility. The addition of the MFLES model and the scikit-learn integration expands the tool's capabilities, making it a useful resource for data scientists and analysts. Whether the task is forecasting demand peaks, detecting anomalies, or handling multiple seasonalities, StatsForecast provides the tools and performance required.
The post Nixtla Releases StatsForecast 1.7.5: Elevating Time Series Forecasting with MFLES and Scikit-Learn Integration appeared first on MarkTechPost.