MoViNets are a family of CNNs that efficiently process video streams and output accurate predictions at a fraction of the latency of other CNN video classifiers.