AI News - Page 97

Unveiling Attention Sinks: The Functional Role of First-Token Focus in Stabilizing Large Language Models…

April 9, 2025

LLMs often show a peculiar behavior where the first token in a sequence draws unusually high attention, known as an "attention sink." Despite seeming unimportant, this token frequently dominates attention across many heads in Transformer models. While prior research has explored when…

TorchSim: A Next-Generation PyTorch-Native Atomistic Simulation Engine for the MLIP Era

April 9, 2025

Radical AI has released TorchSim, a next-generation PyTorch-native atomistic simulation engine for the MLIP era. It accelerates materials simulation by orders of magnitude, transforming traditional scientific approaches. Current materials research requires large teams focused on single problems, resulting in slow progress and high costs.…

OpenAI Introduces the Evals API: Streamlined Model Evaluation for Developers

April 9, 2025

In a significant move to empower developers and teams working with large language models (LLMs), OpenAI has introduced the Evals API, a new toolset that brings programmatic evaluation capabilities to the forefront. While evaluations were previously accessible via the OpenAI dashboard, the new API…

Salesforce AI Released APIGen-MT and xLAM-2-fc-r Model Series: Advancing Multi-Turn Agent Training with Verified Data Pipelines and Scalable LLM Architectures

April 9, 2025

AI agents are quickly becoming core components in handling complex human interactions, particularly in business environments where conversations span multiple turns and involve task execution, information extraction, and adherence to specific procedural rules. Unlike traditional chatbots…

Huawei Noah’s Ark Lab Released Dream 7B: A Powerful Open Diffusion Reasoning Model with Advanced Planning and Flexible Inference Capabilities

April 9, 2025

LLMs have revolutionized artificial intelligence, transforming various applications across industries. Autoregressive (AR) models dominate current text generation, with leading systems like GPT-4, DeepSeek, and Claude all using sequential left-to-right architectures. Despite impressive capabilities, fundamental questions about next-generation architectural…

This AI Paper from ByteDance Introduces MegaScale-Infer: A Disaggregated Expert Parallelism System for Efficient and Scalable MoE-Based LLM Serving

April 9, 2025

Large language models are built on transformer architectures and power applications like chat, code generation, and search, but their growing scale with billions of parameters makes efficient computation increasingly challenging. Scaling such systems while maintaining low…

Sensor-Invariant Tactile Representation for Zero-Shot Transfer Across Vision-Based Tactile Sensors

April 8, 2025

Tactile sensing is a crucial modality for intelligent systems to perceive and interact with the physical world. The GelSight sensor and its variants have emerged as influential tactile technologies, providing detailed information about contact surfaces by transforming tactile data into visual images. However, vision-based tactile…

This AI Paper Introduces an LLM+FOON Framework: A Graph-Validated Approach for Robotic Cooking Task Planning from Video Instructions

April 8, 2025

Robots are increasingly being developed for home environments, specifically to perform daily activities like cooking. These tasks involve a combination of visual interpretation, manipulation, and decision-making across a series of actions. Cooking, in particular,…

A Code Implementation to Use Ollama through Google Colab and Building a Local RAG Pipeline on Using DeepSeek-R1 1.5B through Ollama, LangChain, FAISS, and ChromaDB for Q&A

April 8, 2025

In this tutorial, we’ll build a fully functional Retrieval-Augmented Generation (RAG) pipeline using open-source tools that run seamlessly on Google Colab. First, we will look into how to…

This AI Paper Introduces Inference-Time Scaling Techniques: Microsoft’s Deep Evaluation of Reasoning Models on Complex Tasks

April 8, 2025

Large language models are often praised for their linguistic fluency, but a growing area of focus is enhancing their reasoning ability, especially in contexts where complex problem-solving is required. These include mathematical equations and tasks involving spatial logic, pathfinding, and…