AI News — Page 120

Simplifying Self-Supervised Vision: How Coding Rate Regularization Transforms DINO & DINOv2

February 27, 2025

Learning useful features from large amounts of unlabeled images is important, and models like DINO and DINOv2 are designed for this. These models work well for tasks like image classification and segmentation, but their training process is difficult. A key challenge is avoiding representation collapse, where the model produces the same output for different images.…

Meta AI Introduces SWE-RL: An AI Approach to Scale Reinforcement Learning based LLM Reasoning for Real-World Software Engineering

February 27, 2025

Modern software development faces a multitude of challenges that extend beyond simple code generation or bug detection. Developers must navigate complex codebases, manage legacy systems, and address subtle issues that standard automated tools often overlook. Traditional approaches in automated program repair have largely relied on supervised learning techniques or proprietary systems that are not easily…

Monte Carlo Tree Diffusion: A Scalable AI Framework for Long-Horizon Planning

February 27, 2025

Diffusion models show promise for long-horizon planning because they generate complex trajectories through iterative denoising. However, they gain little from additional computation at test time. Unlike Monte Carlo Tree Search, whose strength lies in exploiting additional computational resources, typical diffusion-based planners are likely to suffer from diminishing returns in…

SongGen: A Fully Open-Source Single-Stage Auto-Regressive Transformer Designed for Controllable Song Generation

February 27, 2025

Creating songs from text is difficult because it involves generating vocals and instrumental music together. Songs are unique as they combine lyrics and melodies to express emotions, making the process more complex than generating speech or instrumental music alone. The challenge is intensified by the insufficient availability of quality open-source data, which restrains research and…

Hume Introduces Octave TTS: A New Text-to-Speech Model that Creates Custom AI Voices with Tailored Emotions

February 27, 2025

In the rapidly evolving field of digital communication, traditional text-to-speech (TTS) systems have often struggled to capture the full range of human emotion and nuance. Conventional systems tend to “read” text in a flat, unvarying tone, missing the subtle inflections and emotional cues that make human speech so engaging. This shortfall poses a challenge for…

Allen Institute for AI Released olmOCR: A High-Performance Open Source Toolkit Designed to Convert PDFs and Document Images into Clean and Structured Plain Text

February 26, 2025

Access to high-quality textual data is crucial for advancing language models in the digital age. Modern AI systems rely on vast datasets containing trillions of tokens to improve their accuracy and efficiency. While much of this data comes from the internet, a significant portion exists in formats such as PDFs, which pose unique challenges for content…

How to Compare Two LLMs in Terms of Performance: A Comprehensive Web Guide for Evaluating and Benchmarking Language Models

February 26, 2025

Comparing language models effectively requires a systematic approach that combines standardized benchmarks with use-case-specific testing. This guide walks you through the process of evaluating LLMs to make informed decisions for your projects.

Table of contents: Step 1: Define Your Comparison Goals; Step 2: Choose Appropriate Benchmarks; General Language Understanding; Reasoning & Problem-Solving; Coding &…

LongPO: Enhancing Long-Context Alignment in LLMs Through Self-Optimized Short-to-Long Preference Learning

February 26, 2025

LLMs have exhibited impressive capabilities through extensive pretraining and alignment techniques. However, while they excel in short-context tasks, their performance in long-context scenarios often falls short due to inadequate long-context alignment. This challenge arises from a lack of high-quality, long-context annotated data, as human annotation becomes impractical and unreliable for extended contexts. Additionally, generating synthetic long-context data…

DeepSeek AI Releases DeepGEMM: An FP8 GEMM Library that Supports both Dense and MoE GEMMs Powering V3/R1 Training and Inference

February 26, 2025

Efficient matrix multiplications remain a critical component in modern deep learning and high-performance computing. As models become increasingly complex, conventional approaches to General Matrix Multiplication (GEMM) often face challenges related to memory bandwidth constraints, numerical precision, and suboptimal hardware utilization. These issues are further complicated by the emerging use of mixed-precision formats, such as FP8,…

Optimizing Imitation Learning: How X‑IL is Shaping the Future of Robotics

February 26, 2025

Designing imitation learning (IL) policies involves many choices, such as selecting features, architecture, and policy representation. The field is advancing quickly, introducing many new techniques and increasing complexity, making it difficult to explore all possible designs and understand their impact. IL enables agents to learn through demonstrations rather than reward-based approaches. The increasing number of…
