When the elements of the gradient become exponentially small, so that updating the parameters with the gradient changes them by an almost insignificant amount.
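
A minimal sketch of this effect, assuming a hypothetical chain of scalar sigmoid units (the text above names no architecture or activation). The names `sigmoid` and `first_layer_weight_gradient`, the weight 1.0, the input 0.5, and the learning rate 0.1 are all illustrative choices. The sigmoid's derivative is at most 0.25, so each extra layer multiplies the gradient of the output with respect to the first weight by a factor of at most 0.25, and the gradient shrinks geometrically with depth:

```python
import math


def sigmoid(x):
    """Logistic sigmoid; its derivative s * (1 - s) never exceeds 0.25."""
    return 1.0 / (1.0 + math.exp(-x))


def first_layer_weight_gradient(depth, weight=1.0, x=0.5):
    """Gradient of the final output w.r.t. the first layer's weight.

    Assumed toy model: a_1 = sigmoid(w_1 * x), a_k = sigmoid(w_k * a_{k-1}),
    with every w_k set to the same value `weight`.
    """
    a = sigmoid(weight * x)
    grad = x * a * (1.0 - a)  # d a_1 / d w_1
    for _ in range(depth - 1):
        a = sigmoid(weight * a)
        grad *= weight * a * (1.0 - a)  # chain rule: d a_k / d a_{k-1}
    return grad


def sgd_step(param, grad, learning_rate=0.1):
    """Plain gradient-descent update of a single parameter."""
    return param - learning_rate * grad


# Print the gradient and the resulting update for a few depths.
for depth in (1, 5, 10, 30):
    g = first_layer_weight_gradient(depth)
    print(depth, g, sgd_step(1.0, g))
```

In this toy setup, at 30 layers the update is smaller than the spacing between adjacent floating-point numbers near 1.0, so the first weight does not change at all.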