Mistral AI Releases First Mamba Model, Codestral Mamba, for Code Generation
Mistral AI has announced the release of its latest model, Codestral Mamba 7B. The new model is based on the Mamba 2 architecture, was trained with a context length of 256k tokens, and is built for code generation tasks.
Unlike traditional Transformer models, Codestral Mamba offers linear-time inference and the theoretical ability to handle sequences of unbounded length. This efficiency enables rapid interaction with the model and quick responses regardless of input size, a significant advantage for code productivity.
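The linear-time claim can be illustrated with a toy state-space recurrence. This is a deliberate simplification for intuition, not Mistral's actual Mamba 2 implementation: each new token updates a fixed-size hidden state, so per-token cost stays constant however long the input grows, whereas attention compares each token against all previous ones.

```python
def ssm_decode(tokens, a=0.9, b=0.1):
    """Toy scalar state-space recurrence: h_t = a*h_{t-1} + b*x_t.

    The hidden state h has a fixed size, so each step costs O(1) and
    decoding n tokens costs O(n) total -- unlike attention, where each
    token attends to all previous tokens, giving O(n^2) total work.
    """
    h = 0.0
    outputs = []
    for x in tokens:
        h = a * h + b * x  # constant-time state update per token
        outputs.append(h)
    return outputs

print(ssm_decode([1.0, 2.0, 3.0]))
```

Because the state never grows, memory use during generation is also constant in sequence length, which is what makes very long contexts practical.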
It supports a wide array of programming languages, including popular ones like Python, Java, C, C++, JavaScript, and Bash, as well as specialized languages such as Swift and Fortran. This extensive language support ensures that Codestral Mamba can be utilized across diverse coding environments and projects.
Mistral AI has benchmarked Codestral Mamba, demonstrating robust in-context retrieval up to 256k tokens, which positions it as a promising tool for local code assistance.
Developers can deploy Codestral Mamba using the mistral-inference SDK, leveraging reference implementations from its GitHub repository. Additionally, deployment through TensorRT-LLM is supported, with plans for local inference capabilities through llama.cpp underway. For accessibility, raw weights of Codestral Mamba can be downloaded from HuggingFace.
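A rough sketch of the local setup might look like the following. The `mistral-inference` package name comes from the announcement, but the exact HuggingFace repository id below is an assumption to verify on the model page before running.

```shell
# Install Mistral's reference inference SDK.
pip install mistral-inference

# Download the raw weights from HuggingFace.
# NOTE: the repo id is illustrative -- confirm the exact id on
# huggingface.co before running.
huggingface-cli download mistralai/Mamba-Codestral-7B-v0.1 \
    --local-dir ./codestral-mamba-7b
```
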
Codestral Mamba is now available on la Plateforme (codestral-mamba-2407), alongside its counterpart Codestral 22B. While Codestral Mamba is licensed under Apache 2.0, Codestral 22B is available under a commercial license for self-deployment or a community license for testing purposes.


Codestral Mamba 7B is a code LLM based on the Mamba 2 architecture. It is released under the Apache 2.0 license and achieves 75% on HumanEval for Python coding.