AI startup OpenAI has launched an AI agent called ‘Operator’, which is designed to automate everyday web tasks for users.

At the launch event, CEO Sam Altman said that ‘Operator’ uses its own web browser to accomplish the tasks a user gives it. This could remove the need for users to do their own online shopping, book a holiday or restaurant reservation, or fill out forms, for example.

Operator went live for Pro users in the United States on Thursday at operator.chatgpt.com, and will be available in other countries “soon”.

However, its arrival in Europe “will take a while.”

Operator agent

Altman also said that it is early days for ‘Operator’: it is an ‘early research preview’, meaning it still makes mistakes, and it will be improved over time. He added that OpenAI will launch more agents in the coming months.

“Operator is one of our first agents, which are AIs capable of doing work for you independently – you give it a task and it will execute it,” said OpenAI.

“Operator can be asked to handle a wide variety of repetitive browser tasks such as filling out forms, ordering groceries, and even creating memes,” the firm stated. “The ability to use the same interfaces and tools that humans interact with on a daily basis broadens the utility of AI, helping people save time on everyday tasks while opening up new engagement opportunities for businesses.”

Operator is powered by a new model called Computer-Using Agent (CUA), which combines GPT-4o’s vision capabilities with advanced reasoning through reinforcement learning. CUA is trained to interact with graphical user interfaces (GUIs) – the buttons, menus, and text fields people see on a screen.

Operator can “see” (via screenshots) and “interact” (using all the actions a mouse and keyboard allow) with a browser, enabling it to take action on the web without requiring custom API integrations.
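The see-then-act cycle described above can be illustrated with a minimal sketch. The article does not describe OpenAI's implementation, so everything here is an assumption for illustration: `FakeBrowser`, `Action`, `choose_action` and `run_agent` are hypothetical names, the "screenshot" is a plain dictionary rather than an image, and a hand-written rule stands in for the CUA model.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str          # "type", "click", or "done"
    target: str = ""   # which on-screen element the action applies to
    text: str = ""     # text to enter for "type" actions


@dataclass
class FakeBrowser:
    """Toy stand-in for a remote browser (hypothetical, not OpenAI's)."""
    fields: dict = field(default_factory=lambda: {"name": ""})
    submitted: bool = False

    def screenshot(self) -> dict:
        # A real agent would receive pixels; this toy returns visible state.
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        # Simulate the keyboard and mouse input the agent would send.
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def choose_action(observation: dict, goal_name: str) -> Action:
    """Placeholder policy standing in for the model's decision step."""
    if observation["submitted"]:
        return Action("done")
    if observation["fields"]["name"] != goal_name:
        return Action("type", target="name", text=goal_name)
    return Action("click", target="submit")


def run_agent(browser: FakeBrowser, goal_name: str, max_steps: int = 10) -> list:
    """Loop: observe the screen, pick an action, apply it, repeat."""
    history = []
    for _ in range(max_steps):
        observation = browser.screenshot()
        action = choose_action(observation, goal_name)
        history.append(action.kind)
        if action.kind == "done":
            break
        browser.apply(action)
    return history


if __name__ == "__main__":
    print(run_agent(FakeBrowser(), "Ada"))
```

Because the loop only reads what is on screen and emits ordinary input events, nothing in it depends on a site-specific API, which is the property the article attributes to Operator.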

If it encounters challenges or makes mistakes, Operator can utilise its reasoning capabilities to self-correct. When it gets stuck and needs assistance, it simply hands control back to the user.

Users can choose to take over control of the remote browser at any point, and Operator is trained to proactively ask the user to take over for tasks that require logging in, entering payment details, or solving CAPTCHAs.
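The handover behaviour can be pictured as a simple rule, sketched below. This is only an illustration under stated assumptions: the step labels, `SENSITIVE_STEPS`, `next_controller` and `plan_handoffs` are hypothetical, and OpenAI says this behaviour is trained into Operator rather than written as explicit rules like these.

```python
# Step types the article says are handed back to the user (labels are hypothetical).
SENSITIVE_STEPS = {"login", "payment", "captcha"}


def next_controller(step_kind: str, agent_stuck: bool = False) -> str:
    """Return who should drive the browser for this step: "user" or "agent"."""
    if agent_stuck or step_kind in SENSITIVE_STEPS:
        return "user"
    return "agent"


def plan_handoffs(steps: list) -> list:
    """Pair each step of a task with the party expected to perform it."""
    return [(step, next_controller(step)) for step in steps]


if __name__ == "__main__":
    for step, who in plan_handoffs(["search", "login", "add_to_cart", "payment"]):
        print(f"{step}: {who}")
```

In this sketch the user handles `login` and `payment`, while the agent handles the routine steps in between.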

Users can personalize their workflows in Operator by adding custom instructions.

Operator safety

OpenAI said it is collaborating with companies like DoorDash, Instacart, OpenTable, Priceline, StubHub, Thumbtack, Uber, and others to ensure Operator addresses real-world needs while respecting established norms.

OpenAI stressed that ensuring Operator is safe to use is a top priority, with three layers of safeguards to prevent abuse and ensure users are firmly in control.

First, Operator is trained to ensure that the person using it is always in control and asks for input at critical points.

Secondly, OpenAI has made it easy to manage data privacy in Operator.

Thirdly, OpenAI said it has “built defences against adversarial websites that may try to mislead Operator through hidden prompts, malicious code, or phishing attempts.”