05 Apr
How to train compute optimal large language models?
Shraddha Goled
AI Cloud DeepMind Google

New research from DeepMind investigates the optimal model size and number of training tokens for a transformer language model under a given compute budget.