Incident Response — what to do when the agent goes wrong
Playbook for the inevitable day your agent does something it shouldn't. Speed matters — the first hour is everything.
The threat
You notice an email went out that shouldn't have. A file was deleted. A commit was pushed. The longer it takes to contain, the worse the blast radius gets. Most incidents compound because people freeze instead of executing a pre-written playbook.
What to do about it
1. Kill the agent first, investigate second
Stop the routine, kill the process, revoke the OAuth token. You can restart a killed agent in 10 seconds; you can't un-send emails.
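A minimal sketch of the "kill the process" part, assuming a POSIX host and that you already know the agent's PID (for example from `ps` or a pidfile). The names `is_running` and `stop_agent` are illustrative, not part of any agent's tooling. Revoking the OAuth token is provider-specific and is not covered here.

```python
import os
import signal
import time


def is_running(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence without delivering a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but is owned by another user
    return True


def stop_agent(pid: int, grace_seconds: float = 5.0) -> bool:
    """Send SIGTERM, wait, then escalate to SIGKILL if the process survives.

    Returns True once the process is gone, False if it is still present
    after both signals and both waiting periods.
    """
    for sig in (signal.SIGTERM, signal.SIGKILL):
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            return True  # already gone
        deadline = time.monotonic() + grace_seconds
        while time.monotonic() < deadline:
            if not is_running(pid):
                return True
            time.sleep(0.1)
    return not is_running(pid)
```

If the agent spawns child processes, they may outlive the parent; check for stragglers before treating the incident as contained.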
2. Rotate every credential the agent had
API keys, OAuth tokens, session cookies. Assume they're all burned.
3. Pull the transcript/log immediately
Before you do anything else destructive (uninstalling, reinstalling), export the logs. They're your only forensic record of what happened.
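One way to export the logs is to archive the whole log directory and record a checksum. This is a sketch only: where your agent keeps its transcripts is an assumption you need to fill in, and `snapshot_logs` is a hypothetical helper name.

```python
import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path


def snapshot_logs(log_dir: Path, evidence_dir: Path) -> tuple[Path, str]:
    """Archive log_dir into evidence_dir and return (archive path, SHA-256).

    evidence_dir should live outside log_dir so the archive does not
    end up inside the directory being archived.
    """
    evidence_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    base = evidence_dir.resolve() / f"agent-logs-{stamp}"
    archive = Path(shutil.make_archive(str(base), "gztar", root_dir=str(log_dir)))
    digest = hashlib.sha256(archive.read_bytes()).hexdigest()
    # Keep the hash next to the archive so the copy can be checked later.
    checksum_file = archive.with_name(archive.name + ".sha256")
    checksum_file.write_text(f"{digest}  {archive.name}\n")
    return archive, digest
```

Copy the archive off the affected machine before you uninstall or reinstall anything.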
4. Identify the initial injection/trigger
If you can't find the root cause, you'll reintroduce it. Look at what the agent was reading right before the bad action.
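To look at what the agent was reading right before the bad action, you can scan the transcript backwards from that action. The sketch below assumes a JSON Lines transcript in which each event has a `kind` field set to `"input"` (content the agent read) or `"action"` (something it did). That schema is hypothetical, so adapt the field names to whatever your agent actually logs.

```python
import json
from typing import Callable, Iterable


def inputs_before_action(
    lines: Iterable[str],
    is_bad_action: Callable[[dict], bool],
    lookback: int = 5,
) -> list[dict]:
    """Return up to `lookback` input events preceding the first bad action."""
    recent_inputs: list[dict] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if is_bad_action(event):
            return recent_inputs[-lookback:] if lookback > 0 else []
        # Hypothetical schema: "input" marks content the agent read
        # (web pages, emails, tool results); anything else is ignored here.
        if event.get("kind") == "input":
            recent_inputs.append(event)
    return []  # no matching action found in the transcript
```

For example, passing `lambda e: e.get("tool") == "send_email"` as `is_bad_action` returns the last few things the agent read before its first send. Those are the first places to look for injected instructions.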
5. Document and share
Your incident becomes someone else's prevention. Write it up (redacted) and post it to community forums. The ecosystem needs this data.
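For the redaction in step 5, a first pass can be automated. The sketch below masks email addresses and long key-like strings. The patterns are rough heuristics chosen for illustration and will miss things (names, hostnames, short secrets), so read the write-up by hand before posting it.

```python
import re

# Rough pattern for email addresses.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Heuristic: runs of 24+ key-like characters, e.g. API keys or bearer tokens.
TOKEN = re.compile(r"\b[A-Za-z0-9_\-]{24,}\b")


def redact(text: str) -> str:
    """Mask email addresses and token-like strings in an incident write-up."""
    text = EMAIL.sub("[EMAIL]", text)
    return TOKEN.sub("[TOKEN]", text)
```

The token pattern also catches long hashes and IDs, which is usually acceptable for a public write-up.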
Real-world examples
- A user's agent began sending 40 stale draft emails when a memory refresh triggered an unexpected send action. Containment took 3 minutes; 3 emails went out and the other 37 drafts were caught.
- A compromised MCP server was in use for 6 days before detection. Full credential rotation and a repo audit took a weekend.
Examples are illustrative, composited from public incident reports and community posts.
Applies to
OpenClaw · NemoClaw · IronClaw · Hermes · Claude Cowork · ChatGPT