TensorRT 8, the latest version of NVIDIA's inference software, brings BERT-Large inference latency down to 1.2 milliseconds.

The post NVIDIA Releases Eighth Generation Of Its Popular Conversational AI Software TensorRT appeared first on Analytics India Magazine.