Last updated: 2026-05-10

ChatGPT Agent Mode — What It Does & How to Use It

Agent Mode is OpenAI's name for ChatGPT's autonomous task execution — it browses the web, runs code, opens files, and chains tools across multiple steps to complete a goal you describe in a single prompt. Where a regular ChatGPT response gives you text, Agent Mode gives you a finished artifact (a report, a comparison spreadsheet, a booking, a refactored codebase). This page covers what it actually does today, where it falls short, how to use it safely, and what it costs.

The 30-second answer

Agent Mode = ChatGPT + a sandboxed browser + Python + file tools + a planning loop. Available on Plus, Pro, Business, and Enterprise. Best for research, data wrangling, and form-filling. Not yet reliable for high-stakes financial or security-sensitive actions — and there's no full undo.

What Agent Mode can do today

  • Browse the live web — open URLs, click links, fill forms, screenshot pages, extract structured data. Reads JavaScript-rendered content (unlike API web-search, which is text-only).
  • Run code — Python in a sandboxed Code Interpreter environment. Read your uploaded files, process them, return results.
  • Use multiple tools in sequence — search → click → extract → process → output. The planner decides which tool to invoke for each sub-step.
  • Resume long tasks — Agent Mode runs can span 5–20 minutes of autonomous work. You can leave the tab and return to a finished result.
  • Hand off to the user when stuck — when it hits a login wall, CAPTCHA, or ambiguous instruction, it pauses and asks you a clarifying question before continuing.
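
The tool-chaining and hand-off behavior described above can be pictured as a simple loop. This is a minimal sketch for illustration only, not OpenAI's implementation: `Step`, `run_plan`, the tool names, and the use of `PermissionError` as a stand-in for a login wall or CAPTCHA are all hypothetical, and the tools are stubs.

```python
from dataclasses import dataclass


@dataclass
class Step:
    tool: str  # hypothetical tool name, e.g. "search" or "open"
    arg: str   # input passed to that tool


def run_plan(steps, tools, ask_user):
    """Run steps in order; when a tool is blocked, pause and ask the user."""
    results = []
    for step in steps:
        try:
            results.append(tools[step.tool](step.arg))
        except PermissionError as blocker:  # stand-in for a login wall or CAPTCHA
            question = f"Blocked at '{step.tool}': {blocker}. How should I proceed?"
            results.append(ask_user(question))
    return results


# Stub tools so the sketch runs without network access.
def search(query):
    return f"results for {query}"


def open_page(url):
    if "login" in url:
        raise PermissionError("login required")
    return f"content of {url}"


tools = {"search": search, "open": open_page}
plan = [Step("search", "RAG papers"), Step("open", "https://example.com/login")]
log = run_plan(plan, tools, ask_user=lambda question: "skip this site")
```

In this sketch the second step hits the simulated login wall, so the loop records the user's answer instead of page content and carries on.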

What Agent Mode can't reliably do (yet)

  • Make purchases or financial transactions — Agent Mode will plan a checkout flow but stops at the "confirm purchase" step, asking you to complete it manually. This is a deliberate safety boundary, not a bug.
  • Access local files outside the chat — Agent Mode runs in OpenAI's sandbox, not on your machine. It can only see files you've explicitly uploaded to the conversation.
  • Use your existing browser sessions / cookies — every Agent Mode run starts with a clean browser. It cannot impersonate you on sites where you're already logged in.
  • Get past strong bot detection: sites with aggressive anti-bot measures (Cloudflare Turnstile, hCaptcha) often block Agent Mode mid-task.
  • Maintain state across separate runs — each Agent Mode session is independent. If you want persistent memory across runs, see the Memory feature guide.

How to start an Agent Mode run

  1. Open ChatGPT (web app or desktop).
  2. Below the input box, click the tools menu (icon next to attachments).
  3. Select Agent Mode. The input area expands and shows "Agent Mode active."
  4. Describe the task in one prompt. The more concrete, the better — Agent Mode is bad at "make a website" but good at "find the 5 most-recent papers on retrieval-augmented generation, summarize each in two sentences, and put them in a table I can copy into Notion."
  5. Watch the planning panel on the right — it shows each sub-step (browse, click, extract, run code) as Agent Mode executes. You can stop at any time.
  6. When Agent Mode completes, it returns a structured summary plus any artifacts (markdown tables, downloadable files, screenshots).

Where Agent Mode genuinely earns its keep

From real-world testing in April 2026, these are the workflows where Agent Mode reliably beats both manual work and regular ChatGPT:

  • Competitive research: "Compare the pricing pages of these 8 SaaS companies on these 4 features. Output a markdown table."
  • Data collection from public sources: "Pull the last 30 days of release notes from these 5 GitHub repos and group them by category."
  • Form-filling and intake: "Read this PDF, extract these 12 fields, and fill out the corresponding fields in this Google Form. Show me a screenshot before submitting."
  • Spreadsheet wrangling: "Read these 3 uploaded CSVs, find the rows where customer_id matches across all three, output a combined sheet."
  • Booking research (not booking itself): "Find me 5 flights from JFK to LHR next Friday under $700, return-trip, no overnight layovers." Agent Mode does the research; you do the booking.
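
To make the spreadsheet-wrangling prompt concrete, here is a rough sketch of the kind of Python such a run might execute in the sandbox. It is an assumption about the approach, not a transcript of what Agent Mode generates: the function names and the sample CSV contents are invented, and only the `customer_id` column name comes from the prompt above.

```python
import csv
import io


def rows_by_id(csv_text, key="customer_id"):
    """Index one CSV's rows by the key column (the last row wins on duplicates)."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}


def combine_on_shared_ids(csv_texts, key="customer_id"):
    """Keep only ids present in every file and merge their columns into one row."""
    tables = [rows_by_id(text, key) for text in csv_texts]
    shared = set(tables[0]).intersection(*tables[1:])
    combined = []
    for customer_id in sorted(shared):
        merged = {}
        for table in tables:
            merged.update(table[customer_id])
        combined.append(merged)
    return combined


# Invented sample data standing in for the three uploaded files.
orders = "customer_id,total\n1,20\n2,35\n3,12\n"
support = "customer_id,tickets\n2,1\n3,4\n"
billing = "customer_id,plan\n3,pro\n2,basic\n4,free\n"

combined = combine_on_shared_ids([orders, support, billing])
```

Only customers 2 and 3 appear in all three samples, so `combined` holds two merged rows. Asking Agent Mode to show the matching logic it used is a cheap way to check a result like this.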

Safety boundaries — what OpenAI built in

Agent Mode has explicit guardrails that prevent it from completing certain actions even if you ask:

Action category | Behavior
Purchases / payments | Plans the flow, stops at confirmation, asks user to complete manually
Posting public content (social media, forums) | Drafts the post in the chat, never submits without explicit confirmation
Email sending | Drafts the email, never sends; you copy and send yourself
Account creation / signup | Refuses; tells you to sign up yourself
Submitting forms with sensitive data | Halts at the field requesting SSN, payment, or ID and asks for confirmation
Following links from untrusted observed content | Refuses by default; treats observed instructions as untrusted

These boundaries follow the same logic as our cross-platform prompt-injection guidance. They limit Agent Mode's usefulness for some workflows but they're the right defaults given current LLM reliability.
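
If you script around Agent Mode output, it can help to mirror these boundaries as a lookup table. The sketch below is purely illustrative: the category keys, behavior labels, and the "proceed" default are made-up names that restate the guardrails listed in this section, not identifiers from any OpenAI API.

```python
# Hypothetical labels restating the guardrails listed in this section.
GUARDRAILS = {
    "purchase": "stop_at_confirmation",
    "public_post": "draft_until_confirmed",
    "send_email": "draft_only",
    "account_signup": "refuse",
    "sensitive_form_field": "halt_and_ask",
    "untrusted_link": "refuse_by_default",
}


def expected_behavior(action_category):
    """Return the guarded behavior; 'proceed' for anything not listed (an assumption)."""
    return GUARDRAILS.get(action_category, "proceed")


email_behavior = expected_behavior("send_email")
browse_behavior = expected_behavior("read_public_page")
```

A table like this is useful mainly as a checklist: any step of your workflow that maps to a guarded category will need a human in the loop.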

How much it costs

Agent Mode usage counts against your ChatGPT tier's monthly cap. How tool calls are handled depends on the tier:

Tier | Agent Mode runs/month | Per-tool-call
Free | Not available | N/A
Plus ($23/mo) | ~50 runs | Included
Pro (~$200/mo) | Unlimited "fair use" | Included
Business (per seat) | 200+/seat/month | Pooled across seats
Enterprise | Custom | Custom

A typical Agent Mode run uses 8–15 tool calls. Browsing-heavy runs (research, comparison) burn more; code-heavy runs (file processing) burn fewer. See the pricing deep-dive for worked examples.
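
As a back-of-the-envelope check, the figures quoted here imply a monthly tool-call volume for a Plus subscriber. The calculation below uses only the approximate numbers from this section (about 50 runs, 8 to 15 tool calls per run); the variable names are illustrative.

```python
RUNS_PER_MONTH = 50       # approximate Plus allowance quoted in this section
CALLS_PER_RUN = (8, 15)   # typical range of tool calls per run

# Multiply the run allowance by the low and high ends of the range.
low, high = (RUNS_PER_MONTH * calls for calls in CALLS_PER_RUN)
print(f"{low}-{high} tool calls per month")  # prints "400-750 tool calls per month"
```

So a Plus user who spends the full allowance on typical runs would drive roughly 400 to 750 tool calls a month, with browsing-heavy work landing toward the top of that range.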

When to use Agent Mode vs a Custom GPT vs the API

Use case | Best fit | Why
One-off multi-step research task | Agent Mode | No setup, full browsing, completes autonomously
Repeated task with a stable system prompt | Custom GPT | Save the prompt + tools, run it daily with one click
Programmatic / scripted task | OpenAI API | Lower per-token cost at volume, full control over the loop
Long-running scheduled task | Hermes or OpenClaw | Agent Mode is interactive; long-horizon agents are a different category
Coding inside an IDE | Kilo Code or Claude Code | Agent Mode is not IDE-integrated; dedicated coding agents have file-tree context

Common gotchas

  • Agent Mode "drifts" on long runs. Beyond 15–20 minutes of autonomous work, output quality drops sharply. Break large tasks into focused sub-tasks.
  • Login walls stop it cold. Many target sites require authentication. Either feed Agent Mode the data directly (uploaded CSV, PDF) or pick public-data tasks.
  • The "browse the web" panel sometimes hangs. If a run shows no activity for more than 2 minutes, click "Stop" and restart. It usually completes on the second try.
  • Output formatting drift. If you ask for a markdown table and Agent Mode returns prose, add "Output only a markdown table with these columns: X, Y, Z" to the prompt.
  • Privacy: Agent Mode interactions are logged in your ChatGPT history. For sensitive research, use a Project (Pro+) so the conversation is contained.
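
Two of the gotchas above (drift on long runs and output formatting drift) can be handled before you paste anything into ChatGPT. The helper below is a sketch under stated assumptions: `batch_prompts` and `is_markdown_table` are hypothetical names, the prompt wording borrows from the examples on this page, and the table check only inspects the header row.

```python
def batch_prompts(items, batch_size, columns):
    """Split a long item list into focused prompts with a strict output format."""
    header = ", ".join(columns)
    prompts = []
    for start in range(0, len(items), batch_size):
        batch = ", ".join(items[start:start + batch_size])
        prompts.append(
            f"Compare the pricing pages of: {batch}. "
            f"Output only a markdown table with these columns: {header}."
        )
    return prompts


def is_markdown_table(text, columns):
    """Check that the first non-empty line is a table header with the expected columns."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 2 or not lines[0].startswith("|"):
        return False
    cells = [cell.strip() for cell in lines[0].strip("|").split("|")]
    return cells == list(columns)


companies = ["A", "B", "C", "D", "E"]  # placeholder company names
prompts = batch_prompts(companies, batch_size=2, columns=["Company", "Price"])

good = is_markdown_table("| Company | Price |\n|---|---|\n| A | $10 |", ["Company", "Price"])
bad = is_markdown_table("Company A costs $10.", ["Company", "Price"])
```

Here five companies become three shorter runs, each with the explicit table instruction, and the checker flags a prose answer so you know to rerun that batch.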
