The Transformer architecture implements an encoder-decoder structure that uses neither recurrence nor convolutions, relying on attention mechanisms instead.
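
To make the data flow concrete, here is a minimal sketch in pure Python of an encoder-decoder pass built only from attention. It is an illustration under strong simplifying assumptions, not the full architecture: it omits learned projection matrices, multiple heads, feed-forward sublayers, residual connections, layer normalization, and positional encodings. The helper names (`softmax`, `attention`, `encode`, `decode`) and the toy input vectors are hypothetical and chosen only for this example.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, causal=False):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    width = len(values[0])
    output = []
    for i, q in enumerate(queries):
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        if causal:
            # Position i may only attend to positions <= i.
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        weights = softmax(scores)
        # Each output vector is a weighted average of the value vectors.
        output.append(
            [sum(w * v[c] for w, v in zip(weights, values)) for c in range(width)]
        )
    return output


def encode(source):
    """Encoder (simplified): self-attention over the source sequence."""
    return attention(source, source, source)


def decode(target, memory):
    """Decoder (simplified): masked self-attention, then attention over the encoder output."""
    hidden = attention(target, target, target, causal=True)
    return attention(hidden, memory, memory)


# Toy inputs: three source positions and two target positions, each a 2-d vector.
source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [[0.5, 0.5], [1.0, 0.0]]

memory = encode(source)
decoded = decode(target, memory)
```

Every position is computed from weighted averages over other positions, with no step-by-step hidden state and no sliding filter, which is the sense in which the structure avoids recurrence and convolutions. The `causal` flag stands in for the decoder's masking, which keeps a target position from attending to later positions.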