
Mistral AI Releases Devstral 2507 for Code-Centric Language Modeling

Overview of Devstral 2507 Release

Mistral AI, in collaboration with All Hands AI, has released updated versions of its developer-focused large language models under the Devstral 2507 label. This release includes two models—Devstral Small 1.1 and Devstral Medium 2507—optimized for agent-based code reasoning, program synthesis, and structured task execution across large software repositories.

Devstral Small 1.1: Open Model for Local and Embedded Use

Devstral Small 1.1 (devstral-small-2507) is based on the Mistral-Small-3.1 foundation model and contains approximately 24 billion parameters. It supports a 128k token context window, enabling it to handle multi-file code inputs and long prompts typical in software engineering workflows.

The model is fine-tuned for structured outputs, including XML and function-calling formats, making it compatible with agent frameworks such as OpenHands. It is suited for tasks like program navigation, multi-step edits, and code search. The model is licensed under Apache 2.0 and is available for both research and commercial use.
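To make the structured-output support concrete, the sketch below shows the kind of tool call such a model can emit and how an agent loop might parse and dispatch it. The `search_code` tool and the response payload are invented for illustration; real schemas are defined by the agent framework or by the Mistral API's function-calling conventions.

```python
import json

# Illustrative tool schema in the OpenAI-style function-calling format,
# which Mistral's chat API also accepts. The tool name and fields are
# hypothetical, not taken from the Devstral documentation.
search_tool = {
    "type": "function",
    "function": {
        "name": "search_code",
        "description": "Search the repository for a symbol or string.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}

# A model fine-tuned for structured outputs returns tool calls as JSON
# rather than free text; an agent framework parses and dispatches them.
sample_response = '{"name": "search_code", "arguments": {"query": "parse_config"}}'

def dispatch(raw: str) -> str:
    """Parse a JSON tool call and route it to the matching handler."""
    call = json.loads(raw)
    if call["name"] == "search_code":
        return f"searching for {call['arguments']['query']!r}"
    raise ValueError(f"unknown tool: {call['name']}")

print(dispatch(sample_response))
```

Reliable emission of well-formed payloads like this, rather than free-form prose, is what makes the model usable inside frameworks such as OpenHands.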

Performance: SWE-Bench Results

Devstral Small 1.1 achieves 53.6% on SWE-Bench Verified, a benchmark that evaluates a model’s ability to generate correct patches for real GitHub issues. This is an improvement over version 1.0 and positions it ahead of other openly available models of comparable size. The results were obtained using the OpenHands scaffold, which provides a standard test environment for evaluating code agents.

While not matching the largest proprietary models, this version balances size, inference cost, and reasoning performance, making it practical for many coding tasks.

Deployment: Local Inference and Quantization

The model is released in multiple formats, including quantized versions in GGUF for use with llama.cpp, vLLM, and LM Studio. These formats enable local inference on high-memory GPUs (e.g., RTX 4090) or Apple Silicon machines with 32GB RAM or more, benefiting developers or teams that prefer to operate without reliance on hosted APIs.
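For a rough sense of why a ~24B-parameter model fits on a single high-memory GPU or a 32GB Apple Silicon machine, the back-of-envelope estimate below uses approximate bits-per-weight figures for common GGUF quantization levels. These averages are assumptions, and the numbers cover weights only; the KV cache, activations, and long contexts add further memory on top.

```python
# Back-of-envelope weight-memory estimate for a ~24B-parameter model
# under common GGUF quantization levels. Bits-per-weight values are
# approximate averages (assumptions, not official figures); real files
# add overhead, so treat these as lower bounds.
PARAMS = 24e9

QUANT_BITS = {
    "F16": 16.0,    # unquantized half precision
    "Q8_0": 8.5,    # ~8-bit quantization
    "Q4_K_M": 4.8,  # ~4-5 bit, a common quality/size tradeoff
}

def weights_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes."""
    return params * bits_per_weight / 8 / 1e9

for name, bits in QUANT_BITS.items():
    print(f"{name}: ~{weights_gb(PARAMS, bits):.0f} GB")
```

Under these assumptions, a ~4-bit quantization brings the weights to roughly 14-15 GB, which is consistent with running on an RTX 4090 or a 32GB machine as described above.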

Additionally, Mistral offers the model via their inference API, with pricing set at $0.10 per million input tokens and $0.30 per million output tokens, consistent with other models in the Mistral-Small line.

Devstral Medium 2507: Higher Accuracy, API-Only

Devstral Medium 2507 is not open-sourced and is available only through the Mistral API or enterprise deployment agreements. It offers the same 128k token context length as the Small version but with higher performance.

This model scores 61.6% on SWE-Bench Verified, outperforming several commercial models, including Gemini 2.5 Pro and GPT-4.1, in the same evaluation framework. Its enhanced reasoning capacity over long contexts makes it suitable for code agents operating across large monorepos or repositories with cross-file dependencies.

API pricing is set at $0.40 per million input tokens and $2 per million output tokens, with fine-tuning available for enterprise users via the Mistral platform.
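A short sketch makes the cost tradeoff between the two tiers concrete. The per-million-token prices are those quoted in this post; the `devstral-medium-2507` identifier is assumed here by analogy with the Small model's naming and may differ from the actual API model ID.

```python
# Sketch: estimate per-request API cost from the prices quoted in this
# post (USD per million tokens). The medium model identifier is an
# assumption based on the Small naming convention.
PRICING = {
    "devstral-small-2507":  {"input": 0.10, "output": 0.30},
    "devstral-medium-2507": {"input": 0.40, "output": 2.00},
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request given token counts for each direction."""
    p = PRICING[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1e6

# Example: a 100k-token repository context producing a 2k-token patch.
small = cost_usd("devstral-small-2507", 100_000, 2_000)
medium = cost_usd("devstral-medium-2507", 100_000, 2_000)
print(f"Small: ${small:.4f}  Medium: ${medium:.4f}")
```

For this example workload, a single long-context patch request costs about a cent on Small versus a few cents on Medium, which is the scale of the tradeoff discussed below.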

Comparison and Use Case Fit

| Model | SWE-Bench Verified | Open Source | Input Cost | Output Cost | Context Length |
|---|---|---|---|---|---|
| Devstral Small 1.1 | 53.6% | Yes | $0.10/M | $0.30/M | 128k tokens |
| Devstral Medium 2507 | 61.6% | No | $0.40/M | $2.00/M | 128k tokens |

Devstral Small is more suitable for local development, experimentation, or integration into client-side developer tools where control and efficiency are paramount. In contrast, Devstral Medium provides stronger accuracy and consistency in structured code-editing tasks, intended for production services that benefit from higher performance despite increased costs.

Integration with Tooling and Agents

Both models support integration with code agent frameworks such as OpenHands. Their compatibility with structured function calls and XML output formats facilitates integration into automated workflows for test generation, refactoring, and bug fixing. This makes it easier to connect Devstral models to IDE plugins, version control bots, and internal CI/CD pipelines.
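To illustrate the XML side of this, here is a minimal sketch of how an agent might apply a structured edit emitted by the model. The `<edit>` schema is hypothetical, invented for this example; actual formats are defined by the agent framework in use.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured edit, as a code agent might emit it. The
# <edit> schema is invented for illustration; real agent frameworks
# define their own action formats.
raw = """\
<edit file="src/config.py">
  <find>TIMEOUT = 30</find>
  <replace>TIMEOUT = 60</replace>
</edit>"""

def apply_edit(xml_text: str, source: str) -> str:
    """Parse an <edit> element and apply its find/replace to source."""
    edit = ET.fromstring(xml_text)
    find = edit.findtext("find")
    replace = edit.findtext("replace")
    return source.replace(find, replace)

patched = apply_edit(raw, "TIMEOUT = 30\nRETRIES = 3\n")
print(patched)
```

Because the edit arrives as machine-readable markup rather than prose, the same pattern can back an IDE plugin, a version-control bot, or a CI/CD step that validates the patch before committing it.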

For instance, developers can use Devstral Small for prototyping local workflows, while Devstral Medium can be employed in production services that apply patches or triage pull requests based on model suggestions.

Conclusion

The Devstral 2507 release represents a targeted update to Mistral’s code-oriented LLM stack, offering users a clearer tradeoff between inference cost and task accuracy. Devstral Small provides an accessible, open model with sufficient performance for many use cases, while Devstral Medium caters to applications where correctness and reliability are critical.

The availability of both models under different deployment options makes them relevant across various stages of the software engineering workflow—from experimental agent development to deployment in commercial environments.

Further Information

For technical details, the Devstral Small model weights are available on Hugging Face. Devstral Medium is available to enterprise customers through the Mistral API, with fine-tuning offered via the Mistral platform.
