Following the open-source release of the DeepSpeed library and the Zero Redundancy Optimiser (ZeRO), Microsoft announced an upgrade, ZeRO-2, in the middle of this year to enable training of large neural networks. Training large-scale models often comes with several challenges, such as hardware limitations and trade-offs between computation and efficiency. Thus, to overcome…
