The Curious Case of Generative AI
FastSaaS is a growing trend, with companies leveraging generative AI and no-code capabilities to build their products. VCs have been pouring money into generative AI startups: according to data from PitchBook, total VC funding amounted to $1,130 million in 2021 and $1,300 million in 2022, up from merely $230 million in 2020. But there are looming concerns that every company is rushing to become the next super app.
Companies are looking to host as many AI services as possible behind a single API. For instance, last month, Notion released its AI platform housing AI writing services, including grammar and spell checking, paraphrasing, and translation. The influx of super apps has threatened existing companies focused on a single use case.
As a result, there are questions about what sets these ‘all-in-one’ companies apart beyond design, marketing, and use cases. As Chris Frantz, co-founder of Loops, puts it, this leads one to believe “there is almost no moat in generative AI.”
Read: The Birth of AI-powered FastSaaS
However, this seems to be changing. Recently, Jasper, the AI content platform, announced a partnership with the American AI startup Cerebras Systems. Jasper will use Cerebras’ Andromeda AI supercomputer to train GPT networks, creating outputs at varying levels of end-user complexity. The supercomputer is also said to improve the contextual accuracy of the generative model while delivering personalised content to different users.
Commenting on the partnership, venture capitalist Nathan Benaich says Jasper may go beyond training GPT-3 on Cerebras systems, reducing its reliance on OpenAI’s API by building its own models and training them on Cerebras hardware.
The two AI platforms, Jasper and Notion, have taken different approaches to AI integration. While Jasper is using the AI-accelerated computing power of Cerebras, Notion is supported by Google Cloud, which will use Cloud TPUs to train its models. Although Notion has not confirmed it, the kind of output its platform generates is widely believed to suggest that it is using GPT-3 through OpenAI’s API.
Therefore, in the era of GPT-3 companies, Jasper looks set to establish a new benchmark for what the moat in generative AI can be. The API used and the infrastructure chosen to train the model will be the defining factors separating companies. This also reinforces the view that the present and future of software rest on cloud and supercomputing services.
Read: India’s Answer to Moore’s Law Death
The following are some of the approaches and the differences between them.
CS-2 versus Cloud versus GPU
The Andromeda AI supercomputer is built by linking 16 Cerebras CS-2 systems, each powered by the largest AI chip, the Wafer-Scale Engine 2 (WSE-2). Cerebras’ ‘weight streaming’ technology provides immense flexibility, allowing the model size and training speed to be scaled independently. In addition, the cluster of CS-2 machines has training and inference acceleration that can support even trillion-parameter models. Cerebras also claims that its CS-2 machines can form a cluster of up to 192 systems with near-linear performance scaling to speed up training.
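To make the idea concrete, here is a toy sketch of the weight-streaming pattern in plain Python/NumPy: instead of keeping every layer’s weights resident on the device, weights are fetched from a large external store one layer at a time while activations stay local. This is only a conceptual illustration of the scaling argument, not Cerebras’ actual software stack; all names and sizes are hypothetical.

```python
import numpy as np

# Hypothetical external weight store standing in for the off-device
# memory that holds the full model; only one layer's weights are
# "on device" at any moment.
rng = np.random.default_rng(0)
layer_shapes = [(512, 512)] * 8  # eight toy dense layers
weight_store = [0.05 * rng.standard_normal(s).astype(np.float32)
                for s in layer_shapes]

def stream_forward(x):
    """Forward pass that streams weights layer by layer.

    The model can grow (more or bigger entries in weight_store) without
    increasing on-device memory, since only one weight matrix is resident
    per step -- the property weight streaming is built around.
    """
    for w in weight_store:          # "stream in" this layer's weights
        x = np.maximum(x @ w, 0.0)  # compute against resident activations
        # this layer's weights can be evicted before the next fetch
    return x

out = stream_forward(rng.standard_normal((32, 512)).astype(np.float32))
print(out.shape)  # (32, 512)
```

Because compute scales with the number of CS-2 systems while the weight store scales independently, model size and training speed can, in principle, be dialled up separately.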
Further, a single CS-2 system can deliver the compute performance of tens to hundreds of graphics processing units (GPUs), producing in a fraction of the time output that would normally take days or weeks to generate on general-purpose processors.
In contrast, cloud providers use custom silicon chips to accelerate AI workloads. For example, Google Cloud employs its in-house chip, the Tensor Processing Unit (TPU), to train large, complex neural networks using Google’s own TensorFlow software.
Cloud TPUs are accessed as ‘virtual machines’ that offload computation onto network-attached accelerator hardware. The model parameters are kept in on-chip, high-bandwidth memory, while the TensorFlow server fetches input training data and pre-processes it before streaming it into an ‘infeed’ queue on the Cloud TPU hardware.
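For illustration, here is a minimal sketch of what TPU-backed training looks like with TensorFlow’s TPUStrategy on a Cloud TPU VM; the tiny model, the random features and labels, and the hyperparameters are placeholders, not Notion’s actual setup.

```python
import tensorflow as tf

# Locate and initialise the network-attached TPU (the address is read
# from the environment on a Cloud TPU VM).
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)

# Build the model inside the strategy scope so its parameters live in
# the TPU's on-chip, high-bandwidth memory.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )

# Placeholder training data; tf.data does the host-side pre-processing
# and streams batches to the device, the role the 'infeed' queue plays
# in the description above.
features = tf.random.normal((1024, 64))
labels = tf.random.uniform((1024,), maxval=10, dtype=tf.int32)
dataset = (
    tf.data.Dataset.from_tensor_slices((features, labels))
    .shuffle(1024)
    .batch(128, drop_remainder=True)  # TPUs want static batch shapes
    .prefetch(tf.data.AUTOTUNE)
)

model.fit(dataset, epochs=3)
```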
Additionally, cloud providers have been expanding their GPU offerings. For instance, AWS’s latest P4d instances are powered by NVIDIA A100 Tensor Core GPUs, while its G4dn instances use NVIDIA T4 GPUs. Earlier this year, Microsoft Azure also announced the adoption of NVIDIA’s Quantum-2 InfiniBand platform to power next-generation HPC needs. These cloud instances are widely used because they come fully configured for deep learning, with accelerated libraries such as CUDA and cuDNN, and well-known deep learning frameworks such as TensorFlow, pre-installed.
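As a quick sanity check on such an instance, here is a short sketch of how one might confirm that the pre-installed stack actually exposes the GPUs to a framework (TensorFlow here; the instance itself is assumed, not shown):

```python
import tensorflow as tf

# List the GPUs that the pre-installed CUDA/cuDNN stack exposes.
gpus = tf.config.list_physical_devices("GPU")
print(f"Visible GPUs: {len(gpus)}")
for gpu in gpus:
    print(gpu)

# Run a small matrix multiply on the first GPU to confirm the
# accelerated libraries are wired up end to end.
if gpus:
    with tf.device("/GPU:0"):
        a = tf.random.normal((1024, 1024))
        b = tf.random.normal((1024, 1024))
        c = tf.matmul(a, b)
    print("Matmul ran on:", c.device)
```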
Andrew Feldman, CEO and co-founder of Cerebras Systems, explained that the variable latency between large numbers of GPUs at traditional cloud providers creates difficult, time-consuming problems when distributing a large AI model among them, leading to “large swings in time to train.”
According to ZDNET, Cerebras’ ‘pay-per-model’ AI cloud service ranges from $2,500 to train a GPT-3 model with 1.3 billion parameters in 10 hours, up to $2.5 million to train one with 70 billion parameters in 85 days, on average about half of what customers would pay to rent cloud capacity or lease machines for years to do the same task.
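Putting those two reported price points side by side gives a rough feel for how the cost scales; the dollar figures below are the ones reported by ZDNET, while the derived per-parameter and per-day ratios are back-of-the-envelope illustrations only.

```python
# The two price points reported by ZDNET for Cerebras' pay-per-model service.
tiers = [
    {"params_b": 1.3, "price_usd": 2_500, "days": 10 / 24},   # 10 hours
    {"params_b": 70.0, "price_usd": 2_500_000, "days": 85.0},
]

for t in tiers:
    per_billion = t["price_usd"] / t["params_b"]   # dollars per billion params
    per_day = t["price_usd"] / t["days"]           # dollars per training day
    print(f"{t['params_b']:5.1f}B params: ${t['price_usd']:>9,.0f} "
          f"(~${per_billion:,.0f}/B params, ~${per_day:,.0f}/day)")
```

The ratios make the point that cost grows much faster than linearly with model size at the top of the quoted range.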
The same CS-2 clusters are also said to train models eight times faster than clusters of NVIDIA A100 machines in the cloud. Meanwhile, MLPerf results show that when similar batches are run on TPUs and GPUs with the same number of chips, they exhibit almost the same training performance on the SSD and Transformer benchmarks.
But, as Mahmoud Khairy points out in his blog, performance depends on various metrics beyond cost and training speed, so the answer to which approach is best also depends on the kind of computation that needs to be done. At the same time, the Cerebras CS-2 system is emerging as one of the most powerful tools for training vast neural networks.
Read: This Large Language Model Predicts COVID Variants
The AI supercomputing provider is also extending into the cloud by partnering with Cirrascale Cloud Services to democratise access, giving users the ability to train GPT models at much lower costs than existing cloud providers, with only a few lines of code.