In many of the cases, we see that the traditional neural networks are not capable of holding and working on long and large information. attention layer can help a neural network in memorizing the large sequences of data.