Apple has released a new open-source LLM, DCLM-Baseline 7B, featuring 7 billion parameters. The release includes the weights, training code, and dataset; the model is trained on 2.5 trillion tokens from open datasets, primarily in English, and has a 2048-token context window.

The new model is trained on a combination of data from DCLM-BASELINE, StarCoder, and ProofPile2, and achieves an MMLU score of 0.6372, placing it between Mistral and Llama 3 on that benchmark. It is licensed under the Apple Sample Code License and is available on Hugging Face with Transformers support. The model was trained using PyTorch with the OpenLM framework and matches the performance of models trained on closed datasets, such as Mistral.
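For developers, the checkpoint can be loaded directly through Hugging Face Transformers. The sketch below shows one way to do this; the repo id apple/DCLM-7B, the bfloat16 dtype, and the generation settings are assumptions rather than details from Apple's announcement, and the model card may additionally require installing the OpenLM package.

```python
# Minimal sketch of loading DCLM-Baseline 7B with Hugging Face Transformers.
# The repo id "apple/DCLM-7B" is an assumption; verify it on the model card,
# which may also require the OpenLM package to be installed.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "apple/DCLM-7B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # half precision keeps a 7B model on one GPU
    device_map="auto",           # requires the accelerate package
)

# The model has a 2048-token context window, so prompts must stay within it.
inputs = tokenizer("Open datasets matter because", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```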

This comes after Apple introduced Apple Intelligence at WWDC 2024 to enhance Siri’s capabilities with generative AI. For this, Apple built a 3-billion-parameter on-device language model and a larger server-based model accessible via Private Cloud Compute on Apple silicon servers. These models were developed using Apple’s AXLearn framework, an open-source project built on JAX and XLA.

Earlier this year, Apple also open-sourced the MM1 series of multimodal AI models with up to 30 billion parameters, as well as ReALM, which resolves references to on-screen and conversational entities to improve interactions.

The company also released ‘Ferret-UI,’ a multimodal AI model designed for precise task execution on user interfaces and for handling open-ended language instructions. Ferret-UI’s core strength is its multimodal capability: it combines advanced language understanding with visual comprehension tailored specifically to mobile UI screens, incorporating referring, grounding, and reasoning.