Alibaba’s Cloud Business Gets Qwen-ched!

In a bid to lower the barrier to 3D digital human creation, researchers from Alibaba recently unveiled a text-to-3D model, Make-A-Character (aka Mach). This new tool leverages large language and vision foundation models to generate detailed, lifelike 3D avatars from natural-language text descriptions.

Check out the GitHub repository here

The researchers said that the current version focuses on generating visually appealing 3D avatars of Asian ethnicity, as the selected Stable Diffusion model is primarily trained on Asian facial images. The team plans to expand support for other ethnicities and styles in the coming months.

Further, the researchers noted that the de-lighting datasets consist only of clean face textures, so the generated avatars may weaken non-natural facial patterns such as scribbles or stickers. “Currently, our garments and body parts are pre-produced and matched based on textual similarity. However, we are actively working on developing cloth, expression, and motion generation techniques driven by text prompts,” shared the researchers.

How does it work?

Alibaba’s Mach seamlessly converts textual descriptors into visual avatars, providing users with a simple way to create custom avatars that resonate with their intended personas. 

First, the text prompt is parsed into semantic attributes. These attributes are then mapped to corresponding visual clues, which in turn guide the generation of reference portrait images using Stable Diffusion along with ControlNet.

Next, a series of 2D face parsing and 3D generation modules produce the mesh and textures of the target face, which are assembled together with matched accessories. Finally, the parameterised representation enables easy animation of the generated 3D avatar.
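The stages above can be sketched in pseudocode-style Python. Everything here is illustrative: the function names, attribute keys, and accessory catalogue are assumptions for the sketch, not Alibaba's actual API, and the naive keyword matching stands in for the large language and vision models the real pipeline uses.

```python
# Hypothetical sketch of Mach's prompt-to-avatar pipeline.
# All names and data shapes are assumptions, not the real system's API.

ACCESSORY_CATALOG = ["round glasses", "baseball cap", "pearl earrings"]

def extract_attributes(prompt: str) -> dict:
    """Stage 1: map a text prompt to coarse semantic attributes.
    (The real system reportedly uses large language/vision models;
    naive keyword matching is used here as a stand-in.)"""
    attrs = {"hairstyle": "short", "accessories": []}
    if "long hair" in prompt.lower():
        attrs["hairstyle"] = "long"
    for item in ACCESSORY_CATALOG:
        # Crude 'textual similarity': any word shared between prompt and item.
        if set(prompt.lower().split()) & set(item.split()):
            attrs["accessories"].append(item)
    return attrs

def generate_reference_portrait(attrs: dict) -> str:
    """Stage 2 stand-in: Stable Diffusion + ControlNet would render a
    reference portrait guided by the visual clues derived from attrs."""
    return f"portrait[{attrs['hairstyle']}]"

def build_face_mesh(portrait: str) -> dict:
    """Stage 3 stand-in: 2D face parsing and 3D generation modules
    produce the target face's mesh and textures from the portrait."""
    return {"mesh": f"mesh_from_{portrait}", "texture": f"tex_from_{portrait}"}

def assemble_avatar(prompt: str) -> dict:
    """Run all stages and attach accessories matched by textual similarity."""
    attrs = extract_attributes(prompt)
    portrait = generate_reference_portrait(attrs)
    face = build_face_mesh(portrait)
    return {**face, "accessories": attrs["accessories"]}

avatar = assemble_avatar("a woman with long hair wearing round glasses")
print(avatar["accessories"])
```

The accessory-matching step mirrors the researchers' statement that garments and body parts are pre-produced and matched on textual similarity; a production system would use embedding similarity rather than word overlap.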

Other AI models 

A few days ago, Alibaba also addressed the challenge of 2D-to-3D generation by unveiling RichDreamer, a normal-depth diffusion model. Additionally, Alibaba introduced ‘Animate Anyone,’ an advanced character animation technology that uses diffusion models to transform static images into dynamic character videos.

Building on this momentum, Alibaba recently launched Qwen-72B, a language model with increased parameters and enhanced customization, following the earlier release of Qwen-7B in October. Moreover, it presented a smaller language model, Qwen-1.8B, as a gift to the research community, featuring a 2K context length and a modest 3GB GPU memory requirement. 

The post Alibaba Makes AI Agents Come to Life with Make-A-Character appeared first on Analytics India Magazine.