OpenAI is letting some users try out a new ChatGPT feature that lets its artificial intelligence use a web browser to book travel, shop for groceries, search for bargains, and perform many other online tasks.
The new tool, called Operator, is an AI agent: it is based on an AI model that is trained on both text and images to interpret commands and figure out how to execute them using a web browser. OpenAI claims that it has the potential to automate many mundane everyday tasks and workday errands.
OpenAI’s Operator follows competing releases from both Google and Anthropic, which have demonstrated agents able to use the web. AI agents are widely viewed as the next stage of evolution for AI following chatbots, and many companies have jumped on the hype bandwagon by promoting them. In most cases, these agents are very limited in their capabilities and simply use a language model to automate things that are normally done with regular software.
“AI is evolving from a tool that could answer your questions to a tool that is also capable of taking action in the world and executing complex, multi-step workflows,” says Peter Welinder, VP of Product at OpenAI. “We will see a big impact on people’s productivity – but also on the quality of work people can do.”
OpenAI admits that granting ChatGPT access to a web browser introduces new risks, saying that Operator can sometimes misbehave. It says it has implemented various new security measures and plans to expand Operator’s capabilities gradually.
Welinder and Yash Kumar, product and development lead for OpenAI’s Computer-Using Agent, say the plan is to learn from how people use the tool. They acknowledge that the tool could make unwanted bookings or purchases, but add that a lot of work has gone into making sure it asks before doing anything risky. “It will come back to me and seek confirmations before taking steps that may be irreversible,” Kumar says.
OpenAI also released a new “system card” today that describes the issues that may arise with Operator. These include the possibility that it could misunderstand commands or deviate from what the user wants; be misused by users; or be targeted by cybercriminals.
“It also presents an incredible number of security challenges,” says Kumar. “Because your attack vector area and your risk vector area increase significantly.”
Operator will initially be available as a “research preview” to ChatGPT users with a Pro account, which costs a hefty $200 per month. The company plans to expand access and roll out the tool slowly, as it will inevitably make some mistakes along the way.
In several demonstrations, Operator showed the potential for AI to take on a more active role as a web helper. The tool has a remote web browser and a chat window for communicating with a user.
At WIRED’s request, Operator was asked to book an Amtrak train ride from New Haven, Connecticut, to Washington, DC. It went to the correct website, entered the necessary information to view the train schedule, and then requested further instructions. If a user were logged in to Amtrak’s website, or had a browser profile with credit card information saved, Operator would be able to book a ticket – although it is designed to ask for permission first.
Kumar asked Operator to reserve a table at Beretta, a restaurant in San Francisco. The program went to the OpenTable website, found the right restaurant, and checked for availability before asking what to do next. OpenAI says it has worked with a number of popular websites, including OpenTable, to ensure Operator works smoothly on them.
The new tool is based on OpenAI’s GPT-4o AI model, which can perceive a browser and a web page and communicate in typed text. The tool includes additional training to help it understand how to complete tasks online. OpenAI will also make its Computer-Using Agent available via its API.