Salesforce AI Research Releases CoDA-1.7B: A Discrete-Diffusion Code Model with Bidirectional, Parallel Token Generation

Understanding the Target Audience

The target audience for the CoDA-1.7B release primarily includes:

  • Data scientists and machine learning engineers looking for advanced code generation tools.
  • Business managers and decision-makers in tech companies interested in leveraging AI for software development.
  • Researchers and academics focused on AI and natural language processing.

Common pain points for this audience include:

  • Challenges in generating high-quality code efficiently.
  • Need for models that balance performance with resource consumption.
  • Desire for reproducible research and deployment pipelines.

Goals include:

  • Improving code generation accuracy and speed.
  • Integrating AI models into existing workflows seamlessly.
  • Staying updated with the latest advancements in AI technology.

Interests often revolve around:

  • Innovative AI applications in software development.
  • Benchmarking and comparing model performance.
  • Open-source tools and collaborative research.

Preferred communication methods include technical documentation, research papers, and community forums.

Overview of CoDA-1.7B

Salesforce AI Research has introduced CoDA-1.7B, a diffusion-based language model designed for code generation. This model employs denoising techniques to generate code sequences with bidirectional context, allowing for parallel token updates rather than traditional left-to-right predictions. The research team has made available both Base and Instruct checkpoints, along with a comprehensive training, evaluation, and serving stack.

Architecture and Training

CoDA utilizes a 1.7B-parameter backbone adapted for discrete diffusion in text. The model employs a three-stage pipeline:

  • Pre-training with bidirectional masking
  • Supervised post-training
  • Progressive denoising during inference

This architecture enables native infilling and non-autoregressive decoding: because context flows in both directions, a masked span can be filled using the tokens on either side of it, and multiple positions can be updated in a single step.
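The pipeline culminates in progressive denoising at inference: start from a fully masked sequence and repeatedly commit the model's most confident predictions in parallel. A minimal sketch of that loop, with a random toy scorer standing in for the real bidirectional model (everything here is illustrative, not CoDA's actual implementation):

```python
import math
import random

MASK = "<mask>"

def toy_model(tokens):
    """Stand-in for the denoiser: for each masked position, return a
    candidate token and a confidence score. A real model would run a
    bidirectional transformer over the full sequence."""
    vocab = ["def", "f", "(", "x", ")", ":", "return", "x"]
    preds = {}
    for i, t in enumerate(tokens):
        if t == MASK:
            preds[i] = (random.choice(vocab), random.random())
    return preds

def denoise(length=8, steps=4, seed=0):
    """Progressive denoising: begin fully masked and, at each step,
    commit the highest-confidence predictions in parallel."""
    random.seed(seed)
    tokens = [MASK] * length
    per_step = math.ceil(length / steps)
    for _ in range(steps):
        preds = toy_model(tokens)
        if not preds:
            break
        # Unmask the most confident positions this step (parallel update).
        best = sorted(preds.items(), key=lambda kv: kv[1][1], reverse=True)
        for pos, (tok, _) in best[:per_step]:
            tokens[pos] = tok
    return tokens
```

Note how the number of model calls equals the number of denoising steps, not the sequence length, which is the source of the parallelism.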

Key Features

  • Bidirectional context through diffusion denoising, eliminating fixed generation order.
  • Confidence-guided sampling (entropy-style decoding) to balance quality and speed.
  • An open training pipeline with deployment scripts and command-line interface (CLI).
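Confidence-guided, entropy-style decoding can be illustrated with Shannon entropy: positions whose predicted token distributions have low entropy are committed first, while uncertain positions stay masked for a later step. A small sketch (the scoring and selection here are illustrative assumptions, not CoDA's exact decoder):

```python
import math

def entropy(probs):
    """Shannon entropy of a token distribution; low entropy means the
    model is confident about this position."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def pick_positions(dists, k):
    """Confidence-guided selection: commit the k positions whose
    predicted distributions have the lowest entropy."""
    scored = sorted(range(len(dists)), key=lambda i: entropy(dists[i]))
    return scored[:k]
```

For example, given a sharply peaked distribution, a uniform one, and something in between, the peaked position is selected first.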

Benchmark Performance

CoDA-1.7B-Instruct has demonstrated competitive performance on standard code generation benchmarks:

  • HumanEval: 54.3%
  • HumanEval+: 47.6%
  • MBPP: 47.2%
  • MBPP+: 63.2%
  • EvalPlus aggregate: 55.4% (pass@1)

These results indicate that CoDA’s performance is comparable to some 7B diffusion models, such as Dream-7B, while utilizing fewer parameters.
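The pass@1 figures above follow the standard unbiased pass@k estimator used by the HumanEval/EvalPlus evaluation protocol (Chen et al., 2021); for reference:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k: probability that at least one of k samples drawn
    from n generations (of which c are correct) passes the tests."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

With k=1 this reduces to the simple fraction of correct samples, c/n.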

Inference Behavior

Generation cost in CoDA scales with the number of diffusion steps rather than with sequence length alone. Latency/quality trade-offs can be tuned through parameters such as STEPS, ALG="entropy", ALG_TEMP, and block length. This design aims to achieve lower wall-clock latency at smaller scales compared to larger diffusion models.
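Because each denoising step is one forward pass over the whole block, the step count directly sets the compute budget: fewer steps means fewer forward passes but more tokens committed per pass. A toy accounting of that trade-off (illustrative only; the actual scheduler may commit tokens adaptively):

```python
import math

def diffusion_cost(seq_len, steps):
    """Each denoising step is one bidirectional forward pass over the
    sequence, committing roughly seq_len/steps tokens in parallel."""
    return {
        "forward_passes": steps,
        "tokens_per_step": math.ceil(seq_len / steps),
    }
```

By contrast, a left-to-right autoregressive decoder would need seq_len forward passes for the same output length.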

Deployment and Licensing

The release includes a FastAPI server with OpenAI-compatible APIs and an interactive CLI for local inference. Comprehensive instructions for environment setup and server launch are provided. The model checkpoints are published under CC BY-NC 4.0 on Hugging Face, which permits research and other non-commercial use.
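Since the server exposes OpenAI-compatible APIs, any OpenAI-style client can talk to it with a plain HTTP payload. A hedged sketch, where the host, port, route, and model id are assumptions to be checked against the release's serving docs:

```python
def build_completion_request(prompt):
    """Build an OpenAI-style chat-completion payload for a locally served
    CoDA instance. URL and model id below are illustrative guesses."""
    return {
        "url": "http://localhost:8000/v1/chat/completions",
        "payload": {
            "model": "CoDA-1.7B-Instruct",
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 256,
        },
    }
```

The resulting dictionary could then be POSTed with any HTTP client, or used via an OpenAI SDK pointed at the local base URL.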

Conclusion

CoDA-1.7B serves as a robust reference for discrete-diffusion code generation at a smaller scale, featuring 1.7B parameters, bidirectional denoising, and parallel token updates. The reported benchmark results position it competitively against larger models while maintaining operational efficiency. The release includes essential resources for deployment and further exploration.

For more information, check out the model on Hugging Face.