Hugging Face on Tuesday launched Open Computer Agent, a free, cloud-hosted AI agent that can operate a virtual computer using text instructions. The tool lets users remotely control a Linux-based machine through a browser.

Under the hood, the agent is built on smolagents, Qwen2-VL-72B (a vision-language model), and E2B Desktop.

As reported by TechCrunch, the agent comes with common applications like Firefox and responds to plain-English prompts, such as opening a website or searching for directions. While it handles basic commands effectively, it often struggles with more complex tasks.

According to the report, early testing also revealed slow response times, inconsistent performance, and trouble handling CAPTCHAs.

Open Computer Agent is currently accessible to the public, though users may have to wait in a virtual queue to get a chance to see it in action.

The report states that the experiment isn’t aimed at delivering a flawless product. Instead, Hugging Face’s goal is to show that open models are becoming increasingly competent and more affordable to run in the cloud. 

Aymeric Roucher, the project lead for building Agents at Hugging Face, announced the agent on X and shared an example of its capabilities: “I asked it how long the soldiers from Alexander had walked from their departure in Macedonia, all the way to India, when they decided they were too tired to continue. It turns out, they had walked quite a bit!”

Open Computer Agent can be described as an alternative to OpenAI's Operator, though the two differ in some respects. Operator interacts with websites the way a human would: it navigates pages, fills out forms, and makes purchases. Unlike API-driven automation tools, Operator relies on visual processing, controlling a virtual mouse and typing within a browser.
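
For readers curious what "visual processing" means in practice, agents of this kind generally run a loop: look at the screen, decide on one mouse or keyboard action, perform it, and look again. The sketch below is a toy illustration of that loop only. It is not the code behind Operator or Open Computer Agent, and every name in it (`FakeDesktop`, `stub_policy`, `run_agent`) is hypothetical. A real agent would send actual screenshots to a vision-language model; here a hand-written rule stands in for the model and a small in-memory object stands in for the virtual machine.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str  # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class FakeDesktop:
    """Hypothetical stand-in for a sandboxed desktop; it records inputs instead of performing them."""
    focused: bool = False
    typed: str = ""
    log: list = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real agent would receive pixels; this toy returns a description of the state.
        return {"search_box_focused": self.focused, "search_text": self.typed}

    def click(self, x: int, y: int) -> None:
        self.log.append(("click", x, y))
        self.focused = True

    def type_text(self, text: str) -> None:
        self.log.append(("type", text))
        if self.focused:
            self.typed += text


def stub_policy(task: str, screen: dict) -> Action:
    """Hypothetical placeholder for the vision-language model's decision."""
    if not screen["search_box_focused"]:
        return Action("click", x=400, y=300)  # coordinates are made up
    if screen["search_text"] != task:
        return Action("type", text=task)
    return Action("done")


def run_agent(task: str, desktop: FakeDesktop, policy, max_steps: int = 10) -> int:
    """Observe, decide, act, repeat; returns the number of actions performed."""
    for step in range(max_steps):
        action = policy(task, desktop.screenshot())
        if action.kind == "done":
            return step
        if action.kind == "click":
            desktop.click(action.x, action.y)
        elif action.kind == "type":
            desktop.type_text(action.text)
    raise RuntimeError("step budget exhausted before the task finished")


desktop = FakeDesktop()
steps = run_agent("directions to Paris", desktop, stub_policy)
print(steps, desktop.log)
```

The step budget in `run_agent` is a design choice worth noting: because each decision depends on a fresh look at the screen, a confused model could otherwise loop indefinitely, which is one plausible reason such agents can feel slow or inconsistent on longer tasks.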

The post Hugging Face Launches Web-based AI Agent Similar to OpenAI’s Operator appeared first on Analytics India Magazine.