Meet ARGUS: A Scalable AI Framework for Training Large Recommender Transformers to One Billion Parameters

Yandex has introduced ARGUS (AutoRegressive Generative User Sequential modeling), a large-scale transformer-based framework for recommender systems that scales up to one billion parameters. This breakthrough places Yandex among a select group of global technology leaders, such as Google, Netflix, and Meta, that have successfully overcome long-standing technical barriers in scaling recommender transformers.

Breaking Technical Barriers in Recommender Systems

Recommender systems have long struggled with three primary constraints: short-term memory, limited scalability, and poor adaptability to shifting user behavior. Conventional architectures often trim user histories down to a small window of recent interactions, discarding valuable behavioral data. The result is a limited view of user intent that fails to capture long-term habits, subtle shifts in taste, and seasonal cycles. As catalogs expand into the billions of items, these truncated models lose precision and struggle to meet the computational demands of personalization at scale. The outcome is stale recommendations, lower engagement, and fewer opportunities for serendipitous discovery.

Very few companies have successfully scaled recommender transformers beyond experimental setups. Google, Netflix, and Meta have invested heavily in this area, reporting gains from architectures such as YouTubeDNN, PinnerFormer, and Meta's Generative Recommenders. With ARGUS, Yandex joins their ranks, demonstrating billion-parameter recommender models in live services. By modeling entire behavioral timelines, the system uncovers both obvious and hidden correlations in user activity. This long-horizon perspective allows ARGUS to capture evolving intent and cyclical patterns with greater fidelity.

Technical Innovations Behind ARGUS

The framework introduces several key advances:

  • Dual-objective pre-training: ARGUS decomposes autoregressive learning into two subtasks, next-item prediction and feedback prediction, improving both imitation of historical system behavior and modeling of true user preferences.
  • Scalable transformer encoders: Models scale from 3.2M to 1B parameters, with consistent performance improvements across all metrics. At the billion-parameter scale, pairwise accuracy uplift increased by 2.66%, demonstrating the emergence of a scaling law for recommender transformers.
  • Extended context modeling: ARGUS can handle user histories up to 8,192 interactions long in a single pass, enabling personalization over months rather than just the last few clicks.
  • Efficient fine-tuning: A two-tower architecture allows offline computation of embeddings and scalable deployment, reducing inference costs relative to prior models.
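
The dual-objective idea above can be illustrated with a minimal sketch: a shared representation of the user's history feeds two heads, one predicting the next item over the catalog and one predicting the user's feedback. Everything here is an illustrative assumption (toy data, random weights, and a single-item stand-in for the transformer encoder state), not the published ARGUS architecture.

```python
# Toy sketch of dual-objective pre-training: next-item prediction plus
# feedback prediction from a shared state. Shapes, names, and data are
# illustrative assumptions, not the published model.
import numpy as np

rng = np.random.default_rng(0)

N_ITEMS, D = 50, 16                            # toy catalog size, hidden dim
item_emb = rng.normal(0, 0.1, (N_ITEMS, D))    # shared item embeddings
W_next = rng.normal(0, 0.1, (D, N_ITEMS))      # head 1: next-item logits
w_fb = rng.normal(0, 0.1, D)                   # head 2: feedback logit

# A toy history: (item id, binary feedback) pairs (1 = like, 0 = skip)
history = [(3, 1), (17, 0), (8, 1), (25, 1)]

def softmax(z):
    z = z - z.max()                            # numerically stable softmax
    e = np.exp(z)
    return e / e.sum()

def step_losses(history):
    """At each position t, a stand-in 'encoder state' (here just the
    current item's embedding; a real model would run a transformer over
    the whole prefix) feeds two heads:
      1. next-item loss: cross-entropy over the catalog
      2. feedback loss : binary cross-entropy on the observed reaction
    """
    next_losses, fb_losses = [], []
    for t in range(len(history) - 1):
        item_t, fb_t = history[t]
        next_item = history[t + 1][0]
        h = item_emb[item_t]                       # encoder state (stand-in)
        p_next = softmax(h @ W_next)               # distribution over catalog
        next_losses.append(-np.log(p_next[next_item]))
        p_fb = 1.0 / (1.0 + np.exp(-(h @ w_fb)))   # P(positive feedback)
        fb_losses.append(-(fb_t * np.log(p_fb) + (1 - fb_t) * np.log(1 - p_fb)))
    return np.mean(next_losses), np.mean(fb_losses)

next_loss, fb_loss = step_losses(history)
total_loss = next_loss + fb_loss                   # the two objectives are summed
print(f"next-item loss {next_loss:.3f}, feedback loss {fb_loss:.3f}")
```

Summing the two losses is one plausible way to combine the objectives; in practice a weighting between them would be a tuning choice.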

Real-World Deployment and Measured Gains

ARGUS has already been deployed on Yandex’s music platform, serving millions of users. In production A/B tests, the system achieved:

  • +2.26% increase in total listening time (TLT)
  • +6.37% increase in like likelihood

These improvements represent the largest recorded quality enhancements on the platform for any deep learning–based recommender model.

Future Directions

Yandex researchers plan to extend ARGUS to real-time recommendation tasks, explore feature engineering for pairwise ranking, and adapt the framework to high-cardinality domains such as large e-commerce and video platforms. The demonstrated ability to scale user-sequence modeling with transformer architectures indicates that recommender systems are poised to follow a scaling trajectory similar to natural language processing.

Conclusion

With ARGUS, Yandex has established itself as a global leader in advancing state-of-the-art recommender systems. By openly sharing its breakthroughs, the company is enhancing personalization across its own services while also accelerating the evolution of recommendation technologies throughout the industry.

Check out the paper here. Thanks to the Yandex team for their thought leadership.
