Meta has considered abandoning the Llama 4 Behemoth, according to a July 14 report by the New York Times. The report indicates that a small group of senior staff at Meta’s newly announced superintelligence lab is believed to be developing a closed-source model instead. 

The Llama 4 Behemoth is currently the biggest and most powerful AI model the company has announced.  

According to the NYT report, Meta has completed training the Behemoth model but delayed its release due to “poor internal performance”. After Meta announced the superintelligence lab last month, teams working on the model stopped running tests on it. 

Earlier, the Wall Street Journal reported, citing people familiar with the matter, that the tech giant had delayed the rollout of the model, “prompting concerns about the direction of its multi-billion-dollar AI investments”. These individuals also told the publication that Meta’s engineers and researchers were concerned that the model’s performance would not match the public statements made about its capabilities. 

The model was expected to be released later this year. However, recent updates suggest that a release is now unlikely anytime soon. 

If Meta develops a closed-source model, it would represent a significant departure from the company’s long-standing approach of creating open-source AI models. 

In April, the company introduced the Llama 4 family of AI models, which includes three variants: Behemoth, Maverick, and Scout. Behemoth is the largest, with a total of 2 trillion parameters. Meta dubbed the model one of the most innovative AI models in the world. 

More recently, Meta intensified its focus on building a team for its superintelligence initiatives. Besides assembling a team led by former Scale AI CEO Alexandr Wang and recruiting several others from OpenAI, CEO Mark Zuckerberg stated that the company would spend hundreds of billions of dollars on AI data centres for these efforts. 

Despite these efforts to stay ahead of competitors, why is the company planning to abandon the Behemoth model? 

SemiAnalysis, one of the world’s leading AI and semiconductor research companies, outlined some of the reasons in its latest report.

The State of Llama 4 Behemoth

SemiAnalysis suggests that Meta’s decision to use the chunked attention technique for memory efficiency may have been a mistake. 

Standard attention allows every token to access all previous tokens, forming a complete context. Chunked attention splits tokens into fixed blocks, limiting each token’s attention to only its current block.

“Behemoth’s implementation of chunked attention chasing efficiency created blind spots, especially at block boundaries,” read the report from SemiAnalysis. 

Tokens within a chunk can attend to other tokens in the same block, but not to those in preceding blocks. As a result, if a logical argument or chain of thought extends from one chunk to the next, the model loses the connection.

This weakened the model’s ability to follow and reason across long chains of thought.
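To make the blind spot concrete, here is a minimal sketch in PyTorch of how a chunked attention mask differs from a standard causal mask. The block size and sequence length are illustrative assumptions, not Behemoth’s actual configuration; the point is simply that the first token of a new block cannot attend to anything in the block before it.

```python
import torch

def causal_mask(seq_len: int) -> torch.Tensor:
    # Standard causal attention: token i can attend to every token j <= i.
    return torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

def chunked_causal_mask(seq_len: int, block_size: int) -> torch.Tensor:
    # Chunked attention: token i can attend only to tokens j <= i that sit
    # in the same fixed-size block, i.e. i // block_size == j // block_size.
    block_id = torch.arange(seq_len) // block_size
    same_block = block_id.unsqueeze(1) == block_id.unsqueeze(0)
    return causal_mask(seq_len) & same_block

# With a block size of 4, token 4 (the first token of the second block)
# sees nothing from tokens 0-3, so a chain of thought that crosses the
# boundary is cut off.
print(chunked_causal_mask(8, 4).int())
```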

“We believe part of the problem was that Meta didn’t even have the proper long context evaluations or testing infrastructure set up to determine that chunked attention would not work for developing a reasoning model,” added the report. 

SemiAnalysis also said that Meta is “very far behind on RL and internal evaluations” and that the new superintelligence team is set to close the gap. In addition, the report noted that Meta’s Behemoth model switched its Mixture of Experts routing method midway through training, disrupting how its expert networks specialised. This led to instabilities that ultimately limited the model’s overall effectiveness.
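For context on what a Mixture-of-Experts router does, below is a minimal sketch of standard top-k routing: a small gating network scores every expert per token, keeps the top-k, and mixes their outputs. The layer sizes, expert count, and top-k value are illustrative assumptions; this is not Meta’s implementation nor the routing method it reportedly switched to mid-run.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoERouter(nn.Module):
    """Minimal top-k Mixture-of-Experts routing (illustrative only)."""
    def __init__(self, d_model: int = 64, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts)  # router scores each expert
        self.experts = nn.ModuleList(
            [nn.Linear(d_model, d_model) for _ in range(num_experts)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Keep the top-k experts per token and combine
        # their outputs using the normalised gate weights.
        scores = self.gate(x)
        weights, indices = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(16, 64)
print(TopKMoERouter()(tokens).shape)  # torch.Size([16, 64])
```

Changing how this routing works partway through training alters which experts each token is sent to, which is why specialisation can be disrupted.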

Among other reasons, SemiAnalysis states that Llama 4 Behemoth faced bottlenecks with its training data. 

“Prior to Llama 4 Behemoth, Meta had been using public data (like Common Crawl), but switched mid-run to an internal web crawler they built. While this is generally superior, it also backfired,” read the report, stating that Meta struggled to clean and deduplicate the new data stream. 

“The processes hadn’t been stress-tested at scale.”

The report further notes that Meta also struggled to scale research experiments into full-fledged training runs. The company had to deal with competing research directions and a lack of leadership to decide the most productive path for the model. 

“Certain model architecture choices did not have proper ablations but were thrown into the model. This led to poorly managed scaling ladders,” said the report. 
