Last updated: 2026-05-10

ChatGPT Agent Mode — What It Does & How to Use It

Agent Mode is OpenAI's name for ChatGPT's autonomous task execution — it browses the web, runs code, opens files, and chains tools across multiple steps to complete a goal you describe in a single prompt. Where a regular ChatGPT response gives you text, Agent Mode gives you a finished artifact (a report, a comparison spreadsheet, a booking, a refactored codebase). This page covers what it actually does today, where it falls short, how to use it safely, and what it costs.

The 30-second answer

Agent Mode = ChatGPT + a sandboxed browser + Python + file tools + a planning loop. Available on Plus, Pro, Business, and Enterprise. Best for research, data wrangling, and form-filling. Not yet reliable for high-stakes financial or security-sensitive actions — and there's no full undo.

What Agent Mode can do today

Browse the live web: open URLs, click links, fill forms, screenshot pages, extract structured data. Reads JavaScript-rendered content (unlike API web-search, which is text-only).
Run code: Python in a sandboxed Code Interpreter environment. It reads your uploaded files, processes them, and returns results.
Use multiple tools in sequence: search → click → extract → process → output. The planner decides which tool to invoke for each sub-step.
Run long tasks unattended: Agent Mode runs can span 5–20 minutes of autonomous work. You can leave the tab and return to a finished result.
Hand off to the user when stuck: when it hits a login wall, CAPTCHA, or ambiguous instruction, it pauses and asks you a clarifying question before continuing.
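
To make the "tools in sequence, hand off when stuck" behavior concrete, here is a minimal sketch of such a loop. It is not OpenAI's implementation: the `run_plan` function, the `RunResult` type, and the toy `search` and `click` tools are all hypothetical, and a `PermissionError` stands in for a login wall or CAPTCHA.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical tool type: takes one string argument, returns a string result.
Tool = Callable[[str], str]


@dataclass
class RunResult:
    outputs: list = field(default_factory=list)
    paused_on: Optional[str] = None  # set when the loop hands off to the user


def run_plan(steps, tools) -> RunResult:
    """Execute (tool_name, argument) steps in order, pausing on a blocker."""
    result = RunResult()
    for tool_name, arg in steps:
        tool = tools.get(tool_name)
        if tool is None:
            result.paused_on = f"unknown tool: {tool_name}"
            break
        try:
            result.outputs.append(tool(arg))
        except PermissionError as exc:  # stand-in for a login wall or CAPTCHA
            result.paused_on = str(exc)
            break
    return result


# Toy tools for illustration only.
def search(query: str) -> str:
    return f"results for {query}"


def click(url: str) -> str:
    if "login" in url:
        raise PermissionError(f"login wall at {url}")
    return f"page at {url}"


demo = run_plan(
    [
        ("search", "rag papers"),
        ("click", "https://example.com/login"),
        ("search", "never reached"),
    ],
    {"search": search, "click": click},
)
print(demo.outputs)    # ['results for rag papers']
print(demo.paused_on)  # login wall at https://example.com/login
```

The loop stops at the first blocker instead of skipping it, which mirrors the pause-and-ask behavior described in the list above.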

What Agent Mode can't reliably do (yet)

Make purchases or financial transactions: Agent Mode will plan a checkout flow but stops at the "confirm purchase" step and asks you to complete it manually. This is a deliberate safety boundary, not a bug.
Access local files outside the chat: Agent Mode runs in OpenAI's sandbox, not on your machine. It can only see files you've explicitly uploaded to the conversation.
Use your existing browser sessions or cookies: every Agent Mode run starts with a clean browser. It cannot act as you on sites where you're already logged in.
Handle sites with strong bot detection: sites with aggressive anti-bot measures (Cloudflare Turnstile, hCaptcha) often block Agent Mode mid-task.
Maintain state across separate runs: each Agent Mode session is independent. If you want persistent memory across runs, see the Memory feature guide.

How to start an Agent Mode run

Open ChatGPT (web app or desktop).
Below the input box, click the tools menu (icon next to attachments).
Select Agent Mode. The input area expands and shows "Agent Mode active."
Describe the task in one prompt. The more concrete, the better — Agent Mode is bad at "make a website" but good at "find the 5 most-recent papers on retrieval-augmented generation, summarize each in two sentences, and put them in a table I can copy into Notion."
Watch the planning panel on the right — it shows each sub-step (browse, click, extract, run code) as Agent Mode executes. You can stop at any time.
When Agent Mode completes, it returns a structured summary plus any artifacts (markdown tables, downloadable files, screenshots).

Where Agent Mode genuinely earns its keep

From real-world testing in April 2026, these are the workflows where Agent Mode reliably beats both doing the work manually and using regular ChatGPT:

Competitive research: "Compare the pricing pages of these 8 SaaS companies on these 4 features. Output a markdown table."
Data collection from public sources: "Pull the last 30 days of release notes from these 5 GitHub repos and group them by category."
Form-filling and intake: "Read this PDF, extract these 12 fields, and fill out the corresponding fields in this Google Form. Show me a screenshot before submitting."
Spreadsheet wrangling: "Read these 3 uploaded CSVs, find the rows where customer_id matches across all three, output a combined sheet."
Booking research (not booking itself): "Find me 5 round-trip flights from JFK to LHR departing next Friday, under $700, with no overnight layovers." Agent Mode does the research; you do the booking.
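
For the spreadsheet-wrangling case, the work Agent Mode does in its sandbox is roughly a key intersection followed by a merge. The sketch below shows one way that could look using only the Python standard library. The function names, the sample columns (`name`, `plan`, `region`), and the inline CSV strings are assumptions for illustration; a real run would read your uploaded files and may well use a different approach.

```python
import csv
import io


def read_rows(text: str) -> dict:
    """Index CSV rows by customer_id (the last row wins on duplicate ids)."""
    return {row["customer_id"]: row for row in csv.DictReader(io.StringIO(text))}


def combine_on_customer_id(*csv_texts: str) -> list:
    """Keep only customer_ids present in every CSV and merge their columns."""
    tables = [read_rows(text) for text in csv_texts]
    shared = set(tables[0]).intersection(*tables[1:])
    combined = []
    for customer_id in sorted(shared):
        merged = {}
        for table in tables:
            merged.update(table[customer_id])
        combined.append(merged)
    return combined


# Hypothetical sample data standing in for three uploaded files.
customers = "customer_id,name\n1,Ada\n2,Bo\n3,Cy\n"
plans = "customer_id,plan\n2,pro\n3,free\n4,pro\n"
regions = "customer_id,region\n3,EU\n2,US\n5,US\n"

combined = combine_on_customer_id(customers, plans, regions)
for row in combined:
    print(row)
# {'customer_id': '2', 'name': 'Bo', 'plan': 'pro', 'region': 'US'}
# {'customer_id': '3', 'name': 'Cy', 'plan': 'free', 'region': 'EU'}
```

Running a small version like this on a few sample rows is a quick way to check what "matches across all three" should mean before you hand the full files to Agent Mode.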

Safety boundaries — what OpenAI built in

Agent Mode has explicit guardrails that stop it from completing certain actions on its own, even if you ask:

Action category	Behavior
Purchases / payments	Plans the flow, stops at confirmation, asks user to complete manually
Posting public content (social media, forums)	Drafts the post in the chat, never submits without explicit confirmation
Email sending	Drafts the email, never sends — you copy and send yourself
Account creation / signup	Refuses; tells you to sign up yourself
Submitting forms with sensitive data	Halts at the field requesting SSN, payment, or ID and asks for confirmation
Following links from untrusted observed content	Refuses by default; treats observed instructions as untrusted

These boundaries follow the same logic as our cross-platform prompt-injection guidance. They limit Agent Mode's usefulness for some workflows, but they're the right defaults given current LLM reliability.
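
One way to reason about the table above is as a lookup from action category to a rule. The sketch below is a hypothetical encoding, not how OpenAI implements its guardrails: the category keys, the `POLICY` mapping, and `may_proceed` are invented for illustration, and "refuses by default" is treated here as "never proceeds".

```python
# Hypothetical encoding of the safety table; keys and rule names are illustrative.
NEVER = "never"                          # the agent never completes this itself
WITH_CONFIRMATION = "with_confirmation"  # completes only after explicit user approval

POLICY = {
    "purchase": NEVER,
    "public_post": WITH_CONFIRMATION,
    "email_send": NEVER,
    "account_signup": NEVER,
    "sensitive_form_field": WITH_CONFIRMATION,
    "untrusted_link": NEVER,
}


def may_proceed(category: str, user_confirmed: bool = False) -> bool:
    """Return True if the agent may complete the action on its own."""
    rule = POLICY.get(category)
    if rule is None:
        return True  # assumption: categories not in the table are unguarded
    return rule == WITH_CONFIRMATION and user_confirmed


print(may_proceed("purchase", user_confirmed=True))     # False
print(may_proceed("public_post"))                       # False
print(may_proceed("public_post", user_confirmed=True))  # True
```

Note the asymmetry: confirmation unlocks posting and sensitive form fields, but no amount of confirmation unlocks purchases, email sending, or signups in this model.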

How much it costs

Agent Mode usage counts against your ChatGPT tier's monthly cap. On Plus and Pro, tool calls are included in that cap rather than billed separately; Business pools them across seats:

Tier	Agent Mode runs/month	Per-tool-call
Free	Not available	—
Plus ($20/mo)	~50 runs	Included
Pro (~$200/mo)	Unlimited "fair use"	Included
Business (per seat)	200+/seat/month	Pooled across seats
Enterprise	Custom	Custom

A typical Agent Mode run uses 8–15 tool calls. Browsing-heavy runs (research, comparison) burn more; code-heavy runs (file processing) burn fewer. See the pricing deep-dive for worked examples.
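
As a rough worked example, the figures above imply a monthly tool-call volume for a Plus subscriber. This is back-of-the-envelope arithmetic under the stated approximations (about 50 runs, 8 to 15 tool calls per run); the constant names and the `runs_left` helper are illustrative only.

```python
PLUS_RUNS_PER_MONTH = 50      # approximate cap from the table above
TOOL_CALLS_PER_RUN = (8, 15)  # typical range quoted above

low, high = tuple(PLUS_RUNS_PER_MONTH * calls for calls in TOOL_CALLS_PER_RUN)
print(f"{low}-{high} tool calls per month")  # 400-750 tool calls per month


def runs_left(cap: int, used: int) -> int:
    """Remaining runs this month, never below zero."""
    return max(cap - used, 0)


print(runs_left(PLUS_RUNS_PER_MONTH, 38))  # 12
```

At roughly 50 runs a month, a few runs per working day exhausts the cap, which is one reason to batch related questions into a single well-specified run.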

When to use Agent Mode vs a Custom GPT vs the API

Use case	Best fit	Why
One-off multi-step research task	Agent Mode	No setup, full browsing, completes autonomously
Repeated task with a stable system prompt	Custom GPT	Save the prompt + tools, run it daily with one click
Programmatic / scripted task	OpenAI API	Lower per-token cost at volume, full control over the loop
Long-running scheduled task	Hermes or OpenClaw	Agent Mode is interactive; long-horizon agents are a different category
Coding inside an IDE	Kilo Code or Claude Code	Agent Mode is not IDE-integrated; dedicated coding agents have file-tree context

Common gotchas

Agent Mode "drifts" on long runs. Beyond 15–20 minutes of autonomous work, output quality drops sharply. Break large tasks into focused sub-tasks.
Login walls stop it cold. Many target sites require authentication. Either feed Agent Mode the data directly (uploaded CSV, PDF) or pick public-data tasks.
The "browse the web" panel sometimes hangs. If a run shows no activity for more than 2 minutes, click "Stop" and restart. It usually completes on the second try.
Output formatting drift. If you ask for a markdown table and Agent Mode returns prose, add "Output only a markdown table with these columns: X, Y, Z" to the prompt.
Privacy: Agent Mode interactions are logged in your ChatGPT history. For sensitive research, use a Project (Pro+) so the conversation is contained.
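
If you reuse the same prompts often, two of the gotchas above are easy to encode in small helpers. The sketch below is illustrative only: `with_table_format` appends the formatting instruction suggested above, and `is_stalled` applies the 2-minute rule of thumb. Both names are hypothetical, and nothing here calls ChatGPT.

```python
def with_table_format(task: str, columns: list) -> str:
    """Append an explicit markdown-table instruction to a task prompt."""
    if not columns:
        raise ValueError("at least one column is required")
    column_list = ", ".join(columns)
    return f"{task.rstrip()} Output only a markdown table with these columns: {column_list}."


def is_stalled(last_activity_s: float, now_s: float, limit_s: float = 120.0) -> bool:
    """True when more than limit_s seconds have passed without activity."""
    return (now_s - last_activity_s) > limit_s


prompt = with_table_format(
    "Compare the pricing pages of these 8 SaaS companies.",
    ["Vendor", "Plan", "Price"],
)
print(prompt)
print(is_stalled(last_activity_s=0, now_s=121))  # True
```

Putting the format instruction at the end of the prompt keeps the task description readable while still stating the output contract explicitly.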
