AI News: Page 70

Teaching AI to Say ‘I Don’t Know’: A New Dataset Mitigates Hallucinations from Reinforcement Finetuning

June 6, 2025

Reinforcement finetuning (RFT) employs reward signals to guide large language models (LLMs) toward producing desirable outputs. This method enhances the model’s ability to generate logical and structured responses by reinforcing correct answers. However, a significant challenge remains: ensuring these models…


A Step-by-Step Coding Guide to Building an Iterative AI Workflow Agent Using LangGraph and Gemini

June 5, 2025

In this tutorial, we demonstrate how to build a multi-step, intelligent query-handling agent using LangGraph and Gemini 1.5 Flash. The core idea is to structure AI reasoning as a stateful workflow, where an incoming query is passed through a…


From Clicking to Reasoning: WebChoreArena Benchmark Challenges Agents with Memory-Heavy and Multi-Page Tasks

June 5, 2025

Web automation agents are becoming increasingly important in artificial intelligence due to their ability to perform human-like actions in digital environments. These agents interact with websites through Graphical User Interfaces (GUIs), mimicking human behaviors such as clicking, typing, and navigating across web…


Salesforce AI Introduces CRMArena-Pro: The First Multi-Turn and Enterprise-Grade Benchmark for LLM Agents

June 5, 2025

AI agents powered by large language models (LLMs) demonstrate significant potential for managing complex business tasks, particularly within Customer Relationship Management (CRM). However, evaluating their effectiveness in real-world situations is arduous due to a scarcity of publicly accessible, realistic business data. Past…


NVIDIA AI Introduces ProRL: Extended Reinforcement Learning Training Unlocks New Reasoning Capabilities in Language Models

June 5, 2025

Recent advances in reasoning-focused language models have marked a significant change in AI by scaling test-time computation. Reinforcement learning (RL) plays a crucial role in developing reasoning capabilities and mitigating reward hacking pitfalls. However, a fundamental debate remains: whether RL…


H Company Releases Runner H Public Beta Alongside Holo-1 and Tester H for Developers

June 5, 2025

The concept behind Agentic AI is that multiple small, task-focused agents can collaborate to accomplish real work. H Company, based in Paris, aims to transform this concept into a practical reality. They have announced three significant advancements, starting with the public…


Mistral AI Introduces Mistral Code: A Customizable AI Coding Assistant for Enterprise Workflows

June 4, 2025

Mistral AI has unveiled Mistral Code, an AI-powered coding assistant designed specifically for enterprise software development. This release addresses critical needs in professional development environments: control, security, and model adaptability. Addressing Enterprise-Grade Requirements: Mistral Code aims to overcome limitations found in conventional…


LifelongAgentBench: A Benchmark for Evaluating Continuous Learning in LLM-Based Agents

June 4, 2025

Lifelong learning is essential for intelligent agents operating in dynamic environments. However, current LLM-based agents often lack memory, treating each task as a new challenge. While LLMs have significantly advanced language tasks and inspired agent-based systems, these agents remain stateless, unable to learn from past…


NVIDIA AI Releases Llama Nemotron Nano VL: A Compact Vision-Language Model Optimized for Document Understanding

June 4, 2025

NVIDIA has introduced Llama Nemotron Nano VL, a vision-language model (VLM) designed for efficient and precise document-level understanding tasks. Built on the Llama 3.1 architecture and featuring a lightweight vision encoder, this model is tailored for applications requiring accurate parsing…
