Microsoft's Azure and Research teams are working to build a new AI infrastructure service called ‘Singularity.’ The group working on the project has published a paper detailing the new AI platform. Microsoft has described Singularity as “a new AI platform service ground-up from scratch that will become a major driver for AI, both inside Microsoft and outside.”

Essentially, Singularity allows hundreds of thousands of GPUs and other AI accelerators to work together. All devices in the infrastructure service are treated as a single cluster, which helps realise the full potential of the hardware and keeps resources from sitting idle.

In the paper titled “Singularity: Planet-Scale, Preemptible and Elastic Scheduling of AI Workloads,” the researchers said, “At the heart of Singularity is a novel, workload-aware scheduler that can transparently preempt and elastically scale deep learning workloads to drive high utilisation without impacting their correctness or performance, across a global fleet of accelerators (e.g., GPUs, FPGAs).”

The platform has the ability to prioritise between different workloads. “While opportunistically using spare capacity, Singularity simultaneously provides isolation by respecting job-level SLAs,” says Microsoft. “For example, Singularity adapts to increasing load on an inference job, freeing up capacity by elastically scaling down or preempting training jobs.”
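The behaviour Microsoft describes here, shrinking or preempting training jobs so that an inference job can grow, can be pictured with a toy scheduler. The sketch below is purely illustrative and is not Singularity's scheduler: the `Job` class, the `min_gpus` floor standing in for a job-level guarantee, and the `scale_up_inference` function are all hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    kind: str      # "inference" or "training"
    gpus: int      # GPUs the job currently holds
    min_gpus: int  # floor the job may shrink to (0 means fully preemptible)


def free_capacity(jobs, total_gpus):
    """GPUs in the cluster that no job currently holds."""
    return total_gpus - sum(job.gpus for job in jobs)


def scale_up_inference(jobs, total_gpus, target, extra):
    """Grant `extra` GPUs to `target`, reclaiming from training jobs if needed."""
    needed = max(0, extra - free_capacity(jobs, total_gpus))
    reclaimable = sum(j.gpus - j.min_gpus for j in jobs if j.kind == "training")
    if needed > reclaimable:
        # Refuse up front so no job is shrunk for a request that cannot be met.
        raise RuntimeError("not enough reclaimable capacity")
    for job in jobs:
        if needed == 0:
            break
        if job.kind != "training":
            continue
        # Shrink the job elastically, but never below its floor.
        reclaim = min(job.gpus - job.min_gpus, needed)
        job.gpus -= reclaim
        needed -= reclaim
    target.gpus += extra


# A 16-GPU cluster that is fully allocated before the inference load rises.
serving = Job("serving", "inference", gpus=4, min_gpus=4)
train_a = Job("train-a", "training", gpus=8, min_gpus=4)
train_b = Job("train-b", "training", gpus=4, min_gpus=0)
cluster = [serving, train_a, train_b]

scale_up_inference(cluster, total_gpus=16, target=serving, extra=6)
for job in cluster:
    print(job.name, job.gpus)
```

In this toy run the inference job grows from 4 to 10 GPUs, while the two training jobs shrink to 4 and 2 GPUs respectively. A real scheduler would also have to decide which training jobs to shrink first and how to grow them back once the load subsides.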

Singularity can also resume a job from where it left off after it is preempted or fails, unlike many other systems that restart the job from the beginning.
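The difference between resuming and restarting can be shown with a small checkpointing sketch. It is a simplified stand-in: the code saves its own state explicitly at the application level, whereas the paper describes preemption as transparent to the workload. The `run_job` function, its `fail_at` parameter and the JSON checkpoint file are all assumptions made for this example.

```python
import json
import os
import tempfile


def run_job(total_steps, ckpt_path, fail_at=None):
    """Run a toy training loop, resuming from `ckpt_path` if it exists."""
    state = {"step": 0, "total": 0}
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            state = json.load(f)  # pick up where the last run stopped
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated preemption")
        state["total"] += state["step"]  # stand-in for one unit of real work
        state["step"] += 1
        # Write to a temporary file and rename it, so an interruption
        # mid-write cannot leave a half-written checkpoint behind.
        tmp_path = ckpt_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, ckpt_path)
    return state


with tempfile.TemporaryDirectory() as workdir:
    ckpt = os.path.join(workdir, "job.ckpt")
    try:
        run_job(5, ckpt, fail_at=3)  # interrupted after three steps
    except RuntimeError:
        pass
    final = run_job(5, ckpt)  # second run continues from step 3
    print(final)
```

The second call does not repeat the first three steps; it loads the saved state and finishes the remaining two, ending with the same result an uninterrupted run would have produced.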

Microsoft has made several recent investments in AI, such as the USD 1 billion it poured into OpenAI in 2019.