Mistral AI Releases First Mamba Model, Codestral Mamba, for Code Generation
Mistral AI has announced the release of its latest model, Codestral Mamba 7B. The new model is based on the Mamba 2 architecture, was trained with a context length of 256k tokens, and is built for code generation tasks.
Unlike traditional Transformer models, Codestral Mamba offers linear-time inference and the theoretical ability to handle sequences of unbounded length. This efficiency enables rapid interaction with the model and quick responses regardless of input size, a significant advantage for code productivity.
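The linear-time claim can be illustrated with a toy state-space recurrence. This is a deliberate simplification for intuition, not Mistral's actual Mamba 2 implementation: each new token updates a fixed-size hidden state, so per-token cost stays constant however long the input grows, whereas attention compares each token against all previous ones.

```python
def ssm_decode(tokens, a=0.9, b=0.1):
    """Toy scalar state-space recurrence: h_t = a*h_{t-1} + b*x_t.

    The hidden state h has a fixed size, so each step costs O(1) and
    decoding n tokens costs O(n) total -- unlike attention, where each
    token attends to all previous tokens, giving O(n^2) total work.
    """
    h = 0.0
    outputs = []
    for x in tokens:
        h = a * h + b * x  # constant-time state update per token
        outputs.append(h)
    return outputs

print(ssm_decode([1.0, 2.0, 3.0]))
```

Because the state never grows, memory use during generation is also constant in sequence length, which is what makes very long contexts practical.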
It supports a wide array of programming languages, including popular ones like Python, Java, C, C++, JavaScript, and Bash, as well as specialized languages such as Swift and Fortran. This extensive language support ensures that Codestral Mamba can be utilized across diverse coding environments and projects.
Mistral AI has benchmarked Codestral Mamba, demonstrating robust in-context retrieval up to 256k tokens, which positions it as a promising tool for local code assistance.
Developers can deploy Codestral Mamba using the mistral-inference SDK, leveraging reference implementations from its GitHub repository. Additionally, deployment through TensorRT-LLM is supported, with plans for local inference capabilities through llama.cpp underway. For accessibility, raw weights of Codestral Mamba can be downloaded from HuggingFace.
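A rough sketch of the local setup might look like the following. The `mistral-inference` package name comes from the announcement, but the exact HuggingFace repository id below is an assumption to verify on the model page before running.

```shell
# Install Mistral's reference inference SDK.
pip install mistral-inference

# Download the raw weights from HuggingFace.
# NOTE: the repo id is illustrative -- confirm the exact id on
# huggingface.co before running.
huggingface-cli download mistralai/Mamba-Codestral-7B-v0.1 \
    --local-dir ./codestral-mamba-7b
```
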
Codestral Mamba is now available on la Plateforme (codestral-mamba-2407), alongside its counterpart Codestral 22B. While Codestral Mamba is licensed under Apache 2.0, Codestral 22B is available under a commercial license for self-deployment or a community license for testing purposes.


Codestral Mamba 7B is a code LLM based on the Mamba 2 architecture. It is released under the Apache 2.0 license and achieves 75% on HumanEval for Python coding.