Sequences are a universal abstraction for representing and processing information, making sequence modeling central to modern deep learning. This perspective, which frames computational tasks as transformations between sequences, has extended to diverse fields such as NLP, computer vision, time series analysis, and computational biology. It has driven the development of various sequence models, including transformers,…
By combining large language models with reinforcement learning and high-performance computation, newly developed Reasoning Language Models can move beyond the traditional limitations of language systems toward explicit, structured reasoning mechanisms, enabling complex reasoning across diverse domains. This achievement in model development is…
Academic paper search represents a critical yet intricate information retrieval challenge within research ecosystems. Researchers require sophisticated search capabilities that can navigate specialized knowledge domains and address nuanced, fine-grained queries. Current academic search platforms like Google Scholar struggle to handle such research-specific investigations, for example, a specialized query seeking studies on non-stationary reinforcement learning (RL) using…
The advancement of artificial intelligence (AI) and machine learning (ML) has enabled transformative progress across diverse fields. However, the “system domain,” which focuses on optimizing and managing foundational AI infrastructure, remains relatively underexplored. This domain involves critical tasks such as diagnosing hardware issues, optimizing configurations, managing workloads, and evaluating system performance. These tasks often present…
Large language models (LLMs) have demonstrated impressive capabilities, particularly in reasoning tasks. Models like OpenAI’s O1 utilize “long-thought reasoning,” where complex problems are broken into manageable steps and solutions are refined iteratively. While this approach enhances problem-solving, it comes at a cost: extended output sequences lead to increased computational time and energy use. These inefficiencies…
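To make the cost trade-off concrete, here is a minimal sketch of decompose-then-refine prompting under stated assumptions: `generate()` is a hypothetical stand-in for any LLM completion call, and the loop structure is illustrative rather than the specific mechanism used by any particular model. Each refinement round adds another full model call, which is exactly where the extended output sequences and added compute come from.

```python
from typing import Callable


def long_thought_solve(
    problem: str,
    generate: Callable[[str], str],  # hypothetical stand-in for any LLM completion call
    max_rounds: int = 3,
) -> str:
    """Decompose a problem into steps, then iteratively critique and refine the answer."""
    # 1) Break the problem into manageable steps.
    plan = generate(f"Break this problem into numbered steps:\n{problem}")
    # 2) Produce an initial solution that follows the plan.
    answer = generate(f"Problem: {problem}\n\nFollow these steps and solve it:\n{plan}")
    # 3) Refine: every extra round is another full model call, i.e. more tokens and compute.
    for _ in range(max_rounds):
        critique = generate(f"Point out any errors in this solution, or say 'no errors':\n{answer}")
        if "no errors" in critique.lower():
            break  # stop early once the critique finds nothing to fix
        answer = generate(f"Revise the solution to fix these issues:\n{critique}\n\nSolution:\n{answer}")
    return answer
```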
Smartphones are essential tools in daily life. However, the complexity of tasks on mobile devices often leads to frustration and inefficiency. Navigating applications and managing multi-step processes consumes time and effort. Advancements in AI have introduced large multimodal models (LMMs) that enable mobile assistants to perform intricate operations autonomously. While these innovations aim to simplify…
The study of autonomous agents powered by large language models (LLMs) has shown great promise in enhancing human productivity. These agents are designed to assist in various tasks such as coding, data analysis, and web navigation. They allow users to focus on creative and strategic work by automating routine digital tasks. However, despite the advancements,…
The advent of advanced AI models has led to innovations in how machines process information, interact with humans, and execute tasks in real-world settings. Two pioneering approaches have emerged: large concept models (LCMs) and large action models (LAMs). While both extend the foundational capabilities of large language models (LLMs), their objectives and applications diverge. LCMs…
Aligning large language models (LLMs) with human values is essential as these models become central to various societal functions. A significant challenge arises when model parameters cannot be updated directly because the models are fixed or inaccessible. In these cases, the focus is on adjusting the input prompts to make the model’s outputs match the…
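As a loose illustration of prompt-side alignment for a frozen, black-box model (not the paper's actual method), the sketch below searches over candidate instruction prefixes and keeps the one whose outputs score best under a reward function; `generate()`, `reward()`, and the prefix list are hypothetical placeholders.

```python
from typing import Callable, List


def select_aligned_prefix(
    task_inputs: List[str],
    candidate_prefixes: List[str],
    generate: Callable[[str], str],       # black-box model call; its parameters stay fixed
    reward: Callable[[str, str], float],  # scores how well an output aligns for a given input
) -> str:
    """Pick the instruction prefix whose prompted outputs earn the highest average reward."""
    best_prefix, best_score = candidate_prefixes[0], float("-inf")
    for prefix in candidate_prefixes:
        outputs = [generate(f"{prefix}\n\n{x}") for x in task_inputs]
        score = sum(reward(x, y) for x, y in zip(task_inputs, outputs)) / len(task_inputs)
        if score > best_score:
            best_prefix, best_score = prefix, score
    return best_prefix
```

The key design point this illustrates is that all adaptation happens in the input text: only the prompt changes, never the model parameters, which is what makes the approach applicable to fixed or inaccessible models.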
Evaluating conversational AI systems powered by large language models (LLMs) presents a critical challenge in artificial intelligence. These systems must handle multi-turn dialogues, integrate domain-specific tools, and adhere to complex policy constraints—capabilities that traditional evaluation methods struggle to assess. Existing benchmarks rely on small-scale, manually curated datasets with coarse metrics, failing to capture the dynamic…