ServiceNow AI Releases Apriel-1.5-15B-Thinker: An Open-Weights Multimodal Reasoning Model that Hits Frontier-Level Performance on a Single-GPU Budget
Understanding the Target Audience
This release is aimed at AI researchers, data scientists, business managers, and IT decision-makers evaluating advanced AI deployments. Their common pain points are the cost and complexity of serving large models and the need to operate within existing infrastructure constraints. Their goals include improving operational efficiency and decision-making and using AI for competitive advantage, and they favor clear, concise communication centered on technical specifications, real-world applications, and measurable outcomes.
Overview of Apriel-1.5-15B-Thinker
ServiceNow AI Research Lab has released Apriel-1.5-15B-Thinker, a 15-billion-parameter open-weights multimodal reasoning model. It is trained with a data-centric mid-training recipe (continual pretraining followed by supervised fine-tuning) and uses no reinforcement learning or preference optimization. The model scores 52 on the Artificial Analysis Intelligence Index (AAI) at roughly an 8x cost advantage over comparable state-of-the-art (SOTA) models, and is available under the MIT license on Hugging Face.
Key Features
- Frontier-level composite score: The model reports an AAI of 52, matching the performance of DeepSeek-R1-0528 while being significantly smaller.
- Single-GPU deployability: The model is designed to fit on a single GPU, making it suitable for on-premises and air-gapped deployments.
- Open weights and reproducible pipeline: The weights, training recipe, and evaluation protocol are publicly available for independent verification.
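To make the single-GPU claim concrete, here is a back-of-envelope VRAM estimate. The assumptions (bf16 weights at 2 bytes per parameter, a ~20% allowance for KV cache and activations) are illustrative and not taken from the release notes:

```python
# Rough serving-memory estimate for a 15B-parameter model.
# Assumptions (not from the release): bf16 weights (2 bytes/param)
# plus a crude ~20% overhead allowance for KV cache and activations.

def weight_memory_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to hold the weights, in GiB."""
    return n_params * bytes_per_param / 1024**3

params = 15e9
weights = weight_memory_gib(params)      # ~27.9 GiB in bf16
serving = weights * 1.2                  # with the overhead allowance

print(f"bf16 weights: {weights:.1f} GiB")
print(f"with ~20% overhead: {serving:.1f} GiB")
```

Under these assumptions the model fits comfortably on a single 40 GiB or 80 GiB data-center GPU, which is what makes on-premises and air-gapped deployment practical.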
Training Mechanism
The training recipe consists of three components:
- Base and upscaling: The model is built on Mistral’s Pixtral-12B-Base-2409 multimodal decoder-vision stack, with depth upscaling from 40 to 48 decoder layers.
- Continual Pretraining (CPT): Involves mixed text and image data to develop foundational reasoning and understanding, followed by targeted synthetic visual tasks to enhance spatial and compositional reasoning.
- Supervised Fine-Tuning (SFT): Utilizes high-quality instruction data across various domains, merging two additional SFT runs to create the final checkpoint.
Approximately 25% of the depth-upscaling text mix is derived from NVIDIA’s Nemotron collection.
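The depth upscaling accounts for most of the jump from Pixtral's 12B parameters to Apriel's 15B. As a rough illustration (not an official breakdown), if we crudely attribute all ~12B decoder parameters to the repeated layers and ignore embeddings and the vision tower, scaling depth from 40 to 48 layers gives:

```python
# Illustrative parameter estimate after depth upscaling.
# Assumption (not official): layer-resident parameters scale
# linearly with the number of decoder layers.

def upscaled_params(base_params: float, old_layers: int, new_layers: int) -> float:
    """Scale layer-resident parameters linearly with depth."""
    return base_params * new_layers / old_layers

est = upscaled_params(12e9, 40, 48)
print(f"~{est / 1e9:.1f}B decoder parameters")  # ~14.4B
```

The ~14.4B estimate lands close to the model's 15B total once embeddings and the vision encoder are added back in.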
Results and Performance Metrics
The model has demonstrated competitive performance across several key benchmarks:
- AIME 2025: 87.5–88%
- GPQA Diamond: ≈71%
- IFBench: ~62%
- τ²-Bench Telecom: ~68%
- LiveCodeBench: ~72.8%
Evaluated with VLMEvalKit for reproducibility, Apriel scores competitively across multimodal benchmarks including MMMU, LogicVista, and MathVision, with particularly strong results on document and diagram understanding and on text-dominant math imagery.
Conclusion
Apriel-1.5-15B-Thinker demonstrates that careful mid-training alone can yield a competitive AAI score of 52 while preserving single-GPU deployability. Its combination of open weights, a reproducible training recipe, and cost-effective performance positions it as a practical option for enterprises evaluating advanced AI solutions before investing in larger, closed systems.
The model is available for further exploration at Hugging Face.