
06 May

How To Take Full Advantage Of GPUs In Large Language Models


“Training GPT-3 with 175 billion parameters would require approximately 36 years with 8 V100 GPUs.” Training large machine learning models calls for huge compute power (hundreds of exaFLOPs), efficient memory management to reduce the memory footprint, and other tweaks. But language models have grown at a great pace. In a span of two years,…
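As a rough sanity check on the quoted figure, here is a minimal back-of-envelope sketch in Python. It assumes the common C ≈ 6·N·D estimate for training FLOPs (Kaplan et al., 2020), GPT-3's ~175B parameters and ~300B training tokens, a V100 FP16 tensor-core peak of ~125 TFLOPS, and a hypothetical ~30% sustained utilization; none of these numbers come from the article itself.

```python
# Back-of-envelope estimate of GPT-3 training time on 8 V100 GPUs.
# All constants below are assumptions for illustration, not values
# taken from the article.

PARAMS = 175e9             # GPT-3 parameter count
TOKENS = 300e9             # approximate training tokens for GPT-3
V100_PEAK_FLOPS = 125e12   # V100 FP16 tensor-core peak, FLOP/s
UTILIZATION = 0.30         # assumed fraction of peak actually sustained
NUM_GPUS = 8

total_flops = 6 * PARAMS * TOKENS                      # ~3.15e23 FLOPs
sustained_flops = NUM_GPUS * V100_PEAK_FLOPS * UTILIZATION  # FLOP/s
seconds = total_flops / sustained_flops
years = seconds / (365 * 24 * 3600)
print(f"Estimated training time: {years:.0f} years")   # ~33 years
```

Under these assumptions the estimate lands near 33 years, in the same ballpark as the quoted 36; the exact figure depends heavily on the utilization assumed.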
