Archives for NVIDIA Megatron
10 Feb
Microsoft, NVIDIA test waters for a large-scale generative language model with promising results


We believe that our results and findings can help, shape, and facilitate future research in foundational, large-scale pretraining.
The key is that 1T was never ‘trained to convergence.’

