03 Jan
Will we see GPT-3 moment for computer vision?
Shraddha Goled
CLIP

Just as a large transformer model can be trained on language, similar models can be trained on pixel sequences to generate coherent image completions and samples.
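
As a rough illustration of what training on pixel sequences could look like at the data level, the Python sketch below flattens a tiny, made-up grayscale image in raster order and builds next-pixel prediction pairs, the same shape of task a language model sees with tokens. The function names `flatten_raster` and `next_pixel_pairs` and the sample image are assumptions for illustration only, not taken from any particular model's code.

```python
def flatten_raster(image):
    """Flatten a 2D grid of pixel values into a 1D sequence in raster (row-major) order."""
    return [pixel for row in image for pixel in row]


def next_pixel_pairs(sequence):
    """Build (context, target) pairs: each pixel is predicted from all pixels before it."""
    return [(sequence[:i], sequence[i]) for i in range(1, len(sequence))]


# Hypothetical 2x3 grayscale image with values in 0..255 (illustrative data only).
image = [
    [0, 128, 255],
    [64, 32, 16],
]

sequence = flatten_raster(image)
pairs = next_pixel_pairs(sequence)

for context, target in pairs:
    print(f"context={context} -> target={target}")
```

An autoregressive model trained on such pairs can then complete an image by repeatedly predicting the next pixel and appending it to the context, which is what makes image completions and samples possible.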