Archives for computer vision neural network

29 Jan

Hands-on guide to using Vision transformer for Image classification

Vision Transformer (ViT) is a transformer architecture for computer vision that follows the same principles as the transformers used in natural language processing. Internally, a transformer learns by measuring the relationships between pairs of input tokens. In computer vision, patches of an image can serve as the tokens.
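To make the idea of image patches as tokens concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the code of any particular ViT library: the function names `image_to_patch_tokens` and `pairwise_attention`, the 224x224 input, and the 16-pixel patch size are all chosen for the example. A real ViT would also pass the tokens through learned projections before comparing them; this sketch compares the raw patch vectors directly.

```python
import numpy as np


def image_to_patch_tokens(image, patch_size):
    """Split an (H, W, C) image into non-overlapping square patches
    and flatten each patch into one token vector."""
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    rows, cols = h // patch_size, w // patch_size
    # (rows, patch, cols, patch, C) -> (rows, cols, patch, patch, C)
    patches = image.reshape(rows, patch_size, cols, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    # One row per patch, in left-to-right, top-to-bottom order.
    return patches.reshape(rows * cols, patch_size * patch_size * c)


def pairwise_attention(tokens):
    """Scaled dot-product scores between every pair of tokens,
    normalised with a softmax over each row.

    Simplification (assumption): the raw tokens are compared directly,
    without the learned query/key projections a real transformer uses.
    """
    d = tokens.shape[1]
    scores = tokens @ tokens.T / np.sqrt(d)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)


# Usage: a random stand-in for a 224x224 RGB image, cut into 16x16 patches.
image = np.random.default_rng(0).random((224, 224, 3))
tokens = image_to_patch_tokens(image, patch_size=16)
weights = pairwise_attention(tokens)
print(tokens.shape)   # (196, 768)
print(weights.shape)  # (196, 196)
```

Each row of `weights` describes how strongly one patch relates to every other patch, which is the pairwise relationship the transformer works with.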