OpenAI's Operator can surf the internet for you

OpenAI has begun previewing a new tool called Operator that can navigate a web browser on your behalf. According to a blog post published Thursday, the software is powered by a so-called Computer-Using Agent (CUA) model. “CUA is trained to interact with graphical user interfaces (GUIs) – the buttons, menus and text boxes that people see on a screen – just like people do,” OpenAI says of the model. “This gives it the flexibility to perform digital tasks without using operating system or web-specific APIs.”

The current version of Operator is based on OpenAI’s GPT-4o model. It combines the vision capabilities of that model with “advanced reasoning” trained through reinforcement learning. Operator has the ability to “break tasks into multi-step plans and make adaptive self-corrections as challenges arise.” According to OpenAI, this capability represents the next stage in AI development.

Operator can interact with a variety of websites, including Instacart's ordering platform.

Instacart

As with previous research previews, OpenAI warns that Operator is “still in its early stages and has limitations” and that it “will not yet work reliably in all scenarios.” For example, depending on the complexity of the task and the interface involved, the agent benefits significantly when the user takes a few extra moments to compose a more detailed prompt. Per The Verge, Operator hands control back to the user if it ever gets stuck on a task. Control is also handed over whenever a website asks for sensitive information, such as login credentials. The company says it developed the tool to “reject malicious requests and block objectionable content.”

OpenAI is making Operator available first to users of its $200-per-month ChatGPT Pro subscription. It is also working with companies like Instacart to offer the agent on their platforms, though you will still need a ChatGPT Pro subscription to test those integrations.

Operator joins a growing list of AI agents that can navigate either a web browser or an entire operating system. Anthropic was the first company to offer this feature, with the release of its Claude 3.5 Sonnet model in October. More recently, Google followed suit with its Gemini 2.0 model and Project Mariner.

