Current memory systems for large language model (LLM) agents often struggle with rigidity and a lack of dynamic organization. Traditional approaches rely on fixed memory structures—predefined storage points and retrieval patterns that do not easily adapt to new or unexpected information. This rigidity can hinder an agent’s ability to effectively process complex tasks or learn…
Large Language Models (LLMs) have advanced significantly, but a key limitation remains their inability to process long-context sequences effectively. While models like GPT-4o and LLaMA3.1 support context windows up to 128K tokens, maintaining high performance at extended lengths is challenging. Rotary Positional Embeddings (RoPE) encode positional information in LLMs but suffer from out-of-distribution (OOD) issues…
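To make the RoPE mechanism concrete, here is a minimal NumPy sketch (not the code of any model mentioned above) of how rotary embeddings rotate each 2-D pair of embedding dimensions by a position-dependent angle. The OOD problem arises because positions beyond the training length produce rotation angles the model never saw during training.

```python
import numpy as np

def rope_rotate(x, pos, base=10000.0):
    """Apply Rotary Positional Embedding to a single vector (sketch).

    x    : (d,) embedding vector, d even
    pos  : integer token position
    base : frequency base (10000 in the original RoPE formulation)
    """
    d = x.shape[0]
    # One rotation frequency per 2-D pair of dimensions.
    freqs = base ** (-np.arange(0, d, 2) / d)   # (d/2,)
    angles = pos * freqs                        # (d/2,)
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[0::2], x[1::2]
    out = np.empty_like(x)
    out[0::2] = x_even * cos - x_odd * sin
    out[1::2] = x_even * sin + x_odd * cos
    return out

# The key property: query/key dot products depend only on *relative*
# position, so (5, 3) gives the same score as (105, 103).
rng = np.random.default_rng(0)
q, k = rng.normal(size=8), rng.normal(size=8)
a = rope_rotate(q, 5) @ rope_rotate(k, 3)
b = rope_rotate(q, 105) @ rope_rotate(k, 103)
assert np.isclose(a, b)
```

The relative-position invariance shown by the final assertion is what RoPE is designed for; the extrapolation failure comes from absolute angles, not from this property.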
Unleashing a more efficient approach to fine-tuning reasoning in large language models, recent work by researchers at Tencent AI Lab and The Chinese University of Hong Kong introduces Unsupervised Prefix Fine-Tuning (UPFT). This method refines a model’s reasoning abilities by focusing solely on the first 8 to 32 tokens of its generated responses, rather than…
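The core data-preparation idea, training only on a short prefix of each sampled response, can be sketched as follows. This is an illustrative reconstruction, not the authors' released code; the `-100` label convention for masking the prompt out of the loss is the standard one used by common training frameworks.

```python
def make_prefix_example(prompt_tokens, response_tokens, prefix_len=16):
    """Build a prefix fine-tuning example (sketch of the UPFT idea).

    Instead of training on the full sampled response, keep only its
    first `prefix_len` tokens (the paper focuses on 8-32); the loss
    is then computed on those prefix tokens alone.
    """
    prefix = response_tokens[:prefix_len]
    input_ids = prompt_tokens + prefix
    # Mask prompt positions out of the loss with -100; train only on the prefix.
    labels = [-100] * len(prompt_tokens) + prefix
    return input_ids, labels

ids, labels = make_prefix_example(list(range(10)), list(range(100, 140)), prefix_len=8)
assert len(ids) == 18
assert labels[:10] == [-100] * 10      # prompt contributes no loss
assert labels[10:] == list(range(100, 108))
```

Because the model supervises itself on its own early tokens, no gold answers or reward labels are needed, which is what makes the method unsupervised.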
Biomedical researchers face a significant dilemma in their quest for scientific breakthroughs. The increasing complexity of biomedical topics demands deep, specialized expertise, while transformative insights often emerge at the intersection of diverse disciplines. This tension between depth and breadth creates substantial challenges for scientists navigating an exponentially growing volume of publications and specialized high-throughput technologies.…
Multimodal artificial intelligence is evolving rapidly as researchers aim to unify visual generation and understanding into a single framework. Traditionally, these two domains have been treated separately because of their distinct requirements: generative models focus on producing fine-grained image details, while understanding models prioritize high-level semantics. The challenge lies in integrating both capabilities effectively without…

Large language models (LLMs) leverage deep learning techniques to understand and generate human-like text, making them invaluable for various applications such as text generation, question answering, summarization, and retrieval. While early LLMs demonstrated remarkable capabilities, their high computational demands and inefficiencies made them impractical for enterprise-scale deployment. Researchers have developed more optimized and scalable models…
Large Language Models (LLMs) rely on reinforcement learning techniques to enhance response generation capabilities. One critical aspect of their development is reward modeling, which helps in training models to align better with human expectations. Reward models assess responses based on human preferences, but existing approaches often suffer from subjectivity and limitations in factual correctness. This…
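Reward models of this kind are typically trained with a pairwise preference objective, a Bradley-Terry style loss that pushes the preferred response's score above the rejected one's. The sketch below shows that generic loss, not the specific method of the article above.

```python
import math

def pairwise_preference_loss(r_chosen, r_rejected):
    """Bradley-Terry style reward-model loss (sketch):
    -log sigmoid(r_chosen - r_rejected).
    The loss shrinks as the preferred response scores higher than
    the rejected one, and grows when the ranking is inverted."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Correct ranking incurs lower loss than the inverted ranking.
good = pairwise_preference_loss(2.0, 0.0)
bad = pairwise_preference_loss(0.0, 2.0)
assert good < bad
# Ties give exactly log(2), the maximum-uncertainty point.
assert abs(pairwise_preference_loss(1.0, 1.0) - math.log(2.0)) < 1e-12
```

The subjectivity problem the excerpt mentions lives in the preference labels themselves: the loss faithfully fits whatever rankings annotators provide, factually correct or not.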
Large language models have made remarkable strides in natural language processing, yet they still encounter difficulties when addressing complex planning and reasoning tasks. Traditional methods often rely on static templates or single-agent systems that fall short in capturing the subtleties of real-world problems. This shortfall is evident when models must verify generated plans, adapt to…
Large language models (LLMs) have progressed beyond basic natural language processing to tackle complex problem-solving tasks. While scaling model size, data, and compute has enabled the development of richer internal representations and emergent capabilities in larger models, significant challenges remain in their reasoning abilities. Current methodologies struggle to maintain coherence throughout complex problem-solving processes, particularly…
Deep learning models have significantly advanced natural language processing and computer vision by enabling efficient data-driven learning. However, the computational burden of self-attention mechanisms remains a major obstacle, particularly for handling long sequences. Traditional transformers require pairwise comparisons that scale quadratically with sequence length, making them impractical for tasks involving extensive data. Researchers have been…
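The quadratic cost described above comes from materializing an n × n score matrix between all pairs of positions. A minimal NumPy sketch of standard scaled dot-product attention makes the bottleneck visible:

```python
import numpy as np

def naive_attention(Q, K, V):
    """Standard scaled dot-product attention (sketch).

    Materializes an (n, n) score matrix, so time and memory grow
    quadratically with sequence length n.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                       # (n, n): the quadratic term
    # Numerically stable softmax over each row.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

n, d = 1024, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
out = naive_attention(Q, K, V)
assert out.shape == (n, d)
# Doubling the sequence length quadruples the score matrix:
assert (2 * n) ** 2 == 4 * n ** 2
```

Sub-quadratic alternatives (linear attention, sparse patterns, state-space models) all attack the same `(n, n)` term in one way or another.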