AI startup Cohere announced Aya, a massively multilingual generative language model that follows instructions in 101 languages, of which over 50% are considered lower-resourced.

Aya supports several Indian languages, including Hindi, Marathi, Malayalam, Gujarati, and Telugu.

“Developed using a diverse mix of instructions from the Aya dataset and collection among others, it achieves state-of-the-art performance across numerous multilingual benchmarks,” Cohere said in a blog post. 

According to Cohere, Aya outperforms mT0 and BLOOMZ on the majority of tasks while covering twice as many languages.

“We introduce extensive new evaluation suites that broaden the state-of-art for multilingual eval across 99 languages — including discriminative and generative tasks, human evaluation, and simulated win rates that cover both held-out tasks and in-distribution performance,” researchers from Cohere said in the research paper.

The Aya project, spearheaded by Cohere for AI, involves more than 3,000 independent researchers across 119 countries. Cohere’s decision to open-source both the model and the dataset is significant given the scarcity of AI training data for many vernacular languages.

Cohere describes Aya as one of the most extensive open science initiatives in machine learning, one that reshapes how research is done by partnering with independent researchers around the world.

The post Cohere Unveils Aya: Open-Source, Multilingual Model in 101 Languages appeared first on Analytics India Magazine.