AI News - Page 79

Critical Security Vulnerabilities in the Model Context Protocol (MCP): How Malicious Tools and Deceptive Contexts Exploit AI Agents

May 19, 2025

The Model Context Protocol (MCP) represents a significant shift in how large language models interact with tools, services, and external data sources. Designed to enable dynamic tool invocation, the MCP facilitates a standardized method for describing tool…

Reinforcement Learning Makes LLMs Search-Savvy: Ant Group Researchers Introduce SEM to Optimize Tool Usage and Reasoning Efficiency

May 19, 2025

Recent advancements in large language models (LLMs) demonstrate their capability to perform complex reasoning tasks and efficiently utilize external tools such as search engines. A significant challenge remains in teaching models when to rely on internal knowledge versus…

LLMs Struggle to Act on What They Know: Google DeepMind Researchers Use Reinforcement Learning Fine-Tuning to Bridge the Knowing-Doing Gap

May 19, 2025

Language models trained on vast internet-scale datasets have emerged as essential tools for language understanding and generation. Their potential extends to functioning as decision-making agents in interactive environments. When applied to environments requiring action choices,…

How to Build a Powerful and Intelligent Question-Answering System by Using Tavily Search API, Chroma, Google Gemini LLMs, and the LangChain Framework

May 18, 2025

In this tutorial, we demonstrate how to build a powerful and intelligent question-answering system by combining the strengths of Tavily Search API, Chroma, Google Gemini LLMs, and the LangChain framework. The pipeline leverages real-time web search using…

SWE-Bench Performance Reaches 50.8% Without Tool Use: A Case for Monolithic State-in-Context Agents

May 18, 2025

Recent advancements in language model (LM) agents have demonstrated significant potential for automating complex real-world tasks across various domains, including software engineering, robotics, and scientific experimentation. These agents typically operate by proposing and executing actions through APIs. As tasks grow in complexity,…

AWS Open-Sources Strands Agents SDK to Simplify AI Agent Development

May 17, 2025

Amazon Web Services (AWS) has open-sourced its Strands Agents SDK, aiming to make the development of AI agents more accessible and adaptable across various domains. By following a model-driven approach, the Strands Agents SDK abstracts much of the complexity behind building, orchestrating, and deploying intelligent…

Google Researchers Introduce LightLab: A Diffusion-Based AI Method for Physically Plausible, Fine-Grained Light Control in Single Images

May 17, 2025

Manipulating lighting conditions in images post-capture poses significant challenges. Traditional methods often rely on 3D graphics techniques, which reconstruct scene geometry and properties from multiple images before simulating new lighting using physical illumination models. While these techniques allow…

This AI paper from DeepSeek-AI Explores How DeepSeek-V3 Delivers High-Performance Language Modeling by Minimizing Hardware Overhead and Maximizing Computational Efficiency

May 17, 2025

DeepSeek-AI’s DeepSeek-V3: Optimizing Language Modeling for Efficiency. The development and deployment of large language models (LLMs) have been significantly influenced by architectural innovations, extensive datasets, and hardware advancements. Models such as DeepSeek-V3, GPT-4o, Claude 3.5 Sonnet, and LLaMA-3 have shown how scaling can enhance reasoning and dialogue capabilities. However, as performance improves, so do the…

LLMs Struggle with Real Conversations: Microsoft and Salesforce Researchers Reveal a 39% Performance Drop in Multi-Turn Underspecified Tasks

May 17, 2025

Conversational artificial intelligence focuses on enabling large language models (LLMs) to engage in dynamic interactions where user needs are revealed progressively. These systems are widely deployed in tools that assist with coding, writing, and research by interpreting…

Windsurf Launches SWE-1: A Frontier AI Model Family for End-to-End Software Engineering

May 17, 2025

In a significant step towards integrating AI with software engineering, Windsurf has introduced SWE-1, its first family of AI models tailored for the complete software development lifecycle. This new approach moves beyond traditional code generation to support real-world software engineering workflows, addressing challenges…
