GLaM is a sparse language model, which means it activates only a subset of its parameters for any given input rather than running the full network.
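To make the idea of sparse activation concrete, here is a minimal sketch of mixture-of-experts routing in plain Python. It is an illustration under stated assumptions, not GLaM's actual implementation: the names `sparse_moe`, `make_expert`, and `gate_weights`, the toy "experts" that merely scale their input, and the choice of `k=2` active experts out of 8 are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical toy expert: multiplies every feature by `scale`."""
    return lambda x: [scale * v for v in x]


def sparse_moe(x, gate_weights, experts, k=2):
    """Route input `x` to the top-k experts and mix their outputs.

    Only k experts are evaluated; the rest are skipped entirely,
    which is what makes the layer "sparse".
    """
    # One gating score per expert: dot product of its gate row with x.
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    probs = softmax(scores)

    # Indices of the k highest-probability experts.
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]

    # Renormalise the selected probabilities so the mixing weights sum to 1.
    norm = sum(probs[i] for i in top)
    out = [0.0] * len(x)
    for i in top:
        y = experts[i](x)  # only the selected experts run
        weight = probs[i] / norm
        out = [o + weight * v for o, v in zip(out, y)]
    return out, top


# Assumed toy setup: 8 experts, 4-dimensional input, random gate weights.
random.seed(0)
dim, num_experts = 4, 8
experts = [make_expert(float(i + 1)) for i in range(num_experts)]
gate_weights = [
    [random.uniform(-1, 1) for _ in range(dim)] for _ in range(num_experts)
]

x = [0.5, -1.0, 0.25, 2.0]
output, used = sparse_moe(x, gate_weights, experts, k=2)
print("experts used:", used)
print("output:", output)
```

In this sketch, six of the eight experts are never called for the input `x`, so most of the layer's parameters stay idle for that input. A different input can select a different pair of experts.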