The rapid advancement of Large Language Models (LLMs) has significantly improved their ability to generate long-form responses. However, evaluating these responses efficiently and fairly remains a critical challenge. Traditionally, human evaluation has been the gold standard, but it is costly, time-consuming, and prone to bias. To mitigate these limitations, the LLM-as-a-Judge paradigm has emerged, leveraging…
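As a minimal sketch of the pattern, the snippet below grades a single response with a judge prompt. The rubric, the 1-5 scale, and the `complete` callable (any function that sends a prompt to an LLM and returns its text reply) are illustrative assumptions, not details from the paper.

```python
# Minimal LLM-as-a-Judge sketch. `complete` is any callable that sends a
# prompt to an LLM and returns its text reply; the rubric and the 1-5
# scale are illustrative, not taken from the paper.
import json

JUDGE_TEMPLATE = """You are an impartial evaluator.
Rate the RESPONSE to the QUESTION on a 1-5 scale for helpfulness,
accuracy, and clarity. Reply with JSON: {{"score": <int>, "reason": "<str>"}}

QUESTION: {question}
RESPONSE: {response}"""

def judge(complete, question: str, response: str) -> dict:
    """Ask a judge LLM to grade one long-form response."""
    reply = complete(JUDGE_TEMPLATE.format(question=question, response=response))
    verdict = json.loads(reply)           # expects the JSON the prompt requested
    assert 1 <= verdict["score"] <= 5     # guard against out-of-range grades
    return verdict
```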
Agentic AI stands at the intersection of autonomy, intelligence, and adaptability, offering solutions that can sense, reason, and act in real or virtual environments with minimal human oversight. At its core, an “agentic” system perceives environmental cues, processes them in light of existing knowledge, arrives at decisions through reasoning, and ultimately acts on those decisions—all…
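The perceive-reason-act cycle can be made concrete with a toy example. The thermostat agent below is an illustrative stand-in: its rule-based `reason` step is where a real agentic system would plug in an LLM or planner.

```python
# Minimal sense-reason-act loop. The dict-based environment and the
# rule-based policy are toy stand-ins for a real agentic system.
class Thermostat:
    """Toy agent: perceives a temperature, reasons against a set point, acts."""
    def __init__(self, set_point: float):
        self.set_point = set_point

    def perceive(self, env: dict) -> float:
        return env["temperature"]          # read an environmental cue

    def reason(self, reading: float) -> str:
        # decide in light of existing knowledge (the set point)
        if reading < self.set_point - 0.5:
            return "heat_on"
        if reading > self.set_point + 0.5:
            return "heat_off"
        return "hold"

    def act(self, env: dict, action: str) -> None:
        env["heater"] = action             # act on the decision

env = {"temperature": 18.0, "heater": "off"}
agent = Thermostat(set_point=21.0)
agent.act(env, agent.reason(agent.perceive(env)))
print(env)   # {'temperature': 18.0, 'heater': 'heat_on'}
```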
Knowledge Tracing (KT) plays a crucial role in Intelligent Tutoring Systems (ITS) by modeling students’ knowledge states and predicting their future performance. Traditional KT models, such as Bayesian Knowledge Tracing (BKT) and early deep learning-based approaches like Deep Knowledge Tracing (DKT), have proven effective at learning from student interaction data. However, recent advancements in deep sequential KT…
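For reference, the classic BKT update is compact enough to sketch in a few lines: a Bayesian posterior over the latent "skill mastered" state after each observed answer, followed by a learning transition. The slip, guess, and learn parameters below are illustrative values, not fitted ones.

```python
# Classic Bayesian Knowledge Tracing (BKT) update. Parameter values
# are illustrative; in practice they are fitted per skill.
def bkt_update(p_mastery, correct, p_slip=0.1, p_guess=0.2, p_learn=0.15):
    if correct:  # P(mastered | correct answer)
        post = p_mastery * (1 - p_slip) / (
            p_mastery * (1 - p_slip) + (1 - p_mastery) * p_guess)
    else:        # P(mastered | incorrect answer)
        post = p_mastery * p_slip / (
            p_mastery * p_slip + (1 - p_mastery) * (1 - p_guess))
    # chance of learning the skill between practice opportunities
    return post + (1 - post) * p_learn

p = 0.3                                 # prior mastery P(L0)
for answer in [True, True, False, True]:
    p = bkt_update(p, answer)
    print(f"P(mastery) = {p:.3f}")
```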
Knowledge graphs have recently seen widespread enterprise adoption, with applications spanning multiple data forms, from legal persons to registered capital and shareholder details. Despite their high utility, these graphs have been criticized for requiring intricate text-based queries and manual exploration, both of which obstruct the extraction of pertinent information. With the massive…
Restricted access to high-quality reasoning datasets has limited open-source advances in AI-driven logical and mathematical reasoning. While proprietary models have leveraged structured reasoning demonstrations to enhance performance, these datasets and methodologies remain closed, restricting independent research and innovation. The lack of open, scalable reasoning datasets has created a bottleneck for AI development.…
Tokenization plays a fundamental role in the performance and scalability of Large Language Models (LLMs). Despite being a critical component, its influence on model training and efficiency remains underexplored. While larger vocabularies can compress sequences and reduce computational costs, existing approaches tie input and output vocabularies together, creating trade-offs where scaling benefits larger models but…
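A back-of-the-envelope sketch shows why the tie matters: the input embedding is a row lookup whose per-token cost is independent of vocabulary size, while the output projection computes logits over the whole vocabulary. The model and vocabulary sizes below are illustrative assumptions, not figures from the paper.

```python
# Why tying input and output vocabularies creates a trade-off: the input
# embedding fetches one row (~d_model per token regardless of |V|), while
# the output projection computes logits over the whole vocabulary
# (~d_model * |V| per token). Sizes below are illustrative.
def per_token_flops(d_model: int, vocab_size: int) -> tuple[int, int]:
    embed = d_model                    # fetch one embedding row
    unembed = d_model * vocab_size     # matmul producing |V| logits
    return embed, unembed

d_model = 4096
for vocab in (32_000, 128_000, 512_000):
    embed, unembed = per_token_flops(d_model, vocab)
    print(f"|V|={vocab:>7,}: input cost {embed:,}, output cost {unembed:,}")
```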
Yandex, a global tech company, has developed and open-sourced Perforator, a tool for continuous, real-time monitoring and analysis of servers and applications. Perforator helps developers identify the most resource-intensive sections of code and provides detailed statistics for subsequent optimization. By identifying code inefficiencies and supporting profile-guided optimization, Perforator delivers accurate data that enables businesses to…
Post-training quantization (PTQ) focuses on reducing the size and improving the speed of large language models (LLMs) to make them more practical for real-world use. Such models involve large volumes of data, and their strongly skewed, highly heterogeneous distributions present considerable difficulties during quantization: they inevitably expand the quantization range, making it, in most…
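A synthetic example illustrates the problem: a single outlier stretches an absmax quantization range, so the bulk of the values share only a few int8 levels, while clipping the range (one common remedy, shown here at a percentile) restores bulk precision at the cost of a large error on the outlier. The data and thresholds are illustrative, not from the paper.

```python
# Synthetic demonstration of range expansion: one outlier stretches the
# absmax scale so the bulk is coarsely quantized; a clipped scale restores
# bulk precision but mangles the outlier.
import numpy as np

def quantize_int8(x, scale):
    q = np.clip(np.round(x / scale), -127, 127)
    return q * scale                            # dequantized values

rng = np.random.default_rng(0)
x = rng.normal(0, 1, 10_000)
x[0] = 80.0                                     # a single extreme outlier

scale_absmax = np.abs(x).max() / 127            # range driven by the outlier
scale_clip = np.percentile(np.abs(x), 99.9) / 127

for name, s in [("absmax", scale_absmax), ("clipped", scale_clip)]:
    err = (x - quantize_int8(x, s)) ** 2
    print(f"{name:8s} scale={s:.4f}  bulk MSE={err[1:].mean():.2e}  "
          f"outlier sq.err={err[0]:.1f}")
```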
AI music generation has made significant progress on short-form instrumental compositions. However, creating full songs with lyrics, vocals, and instrumental accompaniment is still challenging for existing models. Generating a full-length song from lyrics poses several challenges. The music is long, requiring AI models to maintain consistency and coherence over several minutes. The…
What is an Agent? An agent is a Large Language Model (LLM)-powered system that can decide its own workflow. Unlike traditional chatbots, which operate on a fixed path (ask → answer), agents are capable of: choosing between different actions based on context; using external tools such as web search, databases, or APIs; looping between steps…
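A minimal sketch of such a loop is below: the model decides the next step, may call a tool, and loops until it emits a final answer. The `llm_decide` stub and one-tool registry are illustrative placeholders standing in for a real model and toolset.

```python
# Minimal agent loop: decide -> act (maybe via a tool) -> observe -> repeat.
# `llm_decide` and `web_search` are stubs standing in for an LLM and a tool.
import json

def web_search(query: str) -> str:
    return f"(stub) top result for {query!r}"

TOOLS = {"web_search": web_search}

def llm_decide(history: list[str]) -> dict:
    """Stand-in for an LLM call that returns the next action as JSON."""
    if not any("OBSERVATION" in h for h in history):
        return {"action": "web_search", "input": "LLM agent frameworks"}
    return {"action": "final_answer", "input": "Agents loop: decide, act, observe."}

def run_agent(task: str, max_steps: int = 5) -> str:
    history = [f"TASK: {task}"]
    for _ in range(max_steps):          # loop between steps, not a fixed path
        step = llm_decide(history)
        if step["action"] == "final_answer":
            return step["input"]
        tool = TOOLS[step["action"]]    # choose an action based on context
        history.append(f"OBSERVATION: {tool(step['input'])}")
    return "Step limit reached."

print(run_agent("What are LLM agents?"))
```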