NVIDIA is offering a four-hour, self-paced course on MLOps
NVIDIA Deep Learning Institute (DLI) is offering a four-hour, self-paced course titled “Deploying a Model for Inference at Production Scale” that introduces MLOps, coupled with hands-on practice on a live NVIDIA Triton Inference Server.
The course’s learning objectives include:
- Deploying neural networks from a variety of frameworks onto a live NVIDIA Triton Inference Server.
- Measuring GPU usage and other metrics with Prometheus.
- Sending asynchronous requests to maximise throughput.
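The asynchronous-request pattern from the last objective can be sketched with Python's standard library. This is an illustration of the idea only, not the Triton client API: the `fake_infer` stub stands in for a real inference call, and the worker count and request count are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def fake_infer(request_id: int) -> str:
    """Stand-in for a real inference call; sleeps to mimic server latency."""
    time.sleep(0.01)
    return f"result-{request_id}"

def run_async(num_requests: int = 8) -> list[str]:
    """Submit all requests up front instead of waiting on each response
    in turn, so requests overlap and throughput rises."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(fake_infer, i) for i in range(num_requests)]
        # Collect results as they complete, in whatever order they finish.
        return [f.result() for f in as_completed(futures)]
```

The same overlap is what Triton's asynchronous client calls buy you: the client keeps the server's queue full rather than serialising one request per round trip.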
Upon completion, developers will be able to deploy their own models on an NVIDIA Triton Inference Server.
What is Triton Server?
NVIDIA Triton Inference Server helps data scientists and system administrators turn the same machines used to train models into a web server for model prediction. Though a GPU is not required, the server can take advantage of multiple installed GPUs to process large batches of requests quickly.
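Triton serves models from a model repository on disk. As a rough sketch, the model name, tensor names, and shapes below are illustrative placeholders, not values from the course:

```text
model_repository/
└── densenet_onnx/
    ├── 1/                 # numeric directory = model version 1
    │   └── model.onnx
    └── config.pbtxt
```

with a `config.pbtxt` along these lines:

```text
name: "densenet_onnx"
platform: "onnxruntime_onnx"
max_batch_size: 8
input [
  { name: "data_0", data_type: TYPE_FP32, dims: [ 3, 224, 224 ] }
]
output [
  { name: "fc6_1", data_type: TYPE_FP32, dims: [ 1000 ] }
]
```

Pointing the server at the repository root is enough for it to load every model it finds there.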
NVIDIA Triton was created with Machine Learning Operations (MLOps) in mind, a relatively new field that evolved from Developer Operations (DevOps) to focus on scaling and maintaining ML models in a production environment.
NVIDIA Triton is equipped with features such as model versioning for easy rollbacks. Triton is also compatible with Prometheus to track and manage server metrics such as latency and request count.
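Triton exposes those metrics in the Prometheus text format, so they can be read without any special tooling. A minimal sketch of parsing that format, using a hard-coded sample (the label values and numbers below are made up; the metric names follow Triton's `nv_` prefix convention):

```python
SAMPLE = """\
# HELP nv_inference_request_success Number of successful inference requests
nv_inference_request_success{model="simple",version="1"} 104
nv_gpu_utilization{gpu_uuid="GPU-0"} 0.45
"""

def parse_metrics(text: str) -> dict[str, float]:
    """Parse Prometheus exposition-format lines into a {metric: value} map."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE metadata lines
        # Each sample line is "<name>{labels} <value>"; split on the last space.
        key, _, value = line.rpartition(" ")
        metrics[key] = float(value)
    return metrics
```

In practice Prometheus scrapes this endpoint on a schedule, which is how latency and request counts end up on a dashboard.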
You can enrol for the course here.