# OpenClaw Cost Optimisation — Keep Your Bill Under $10/Month
OpenClaw doesn't ship with API rate limiting, token budgets, or spend caps. The agent will happily call your API as many times as it needs. Without intervention, a default Opus setup with heartbeats and automation can reach $100+/month fast. This guide shows you exactly how to fix that — most users land under $10/month with these changes.
**Step 0:** Set a hard monthly spending limit in your provider console right now, before reading further. Anthropic: Console → Settings → Plans & Billing → Monthly spending limit. OpenAI: Platform → Settings → Limits. Set it to 150% of what you're comfortable spending. This is your safety net while you tune everything else.
## Why OpenClaw Bills Get Out of Control
Three culprits account for nearly all runaway costs:
- Wrong default model. If your default is Claude Opus (≈$15/M input, $75/M output), every heartbeat, every cron job, every quick question costs 60× more than it needs to: Claude Haiku is $0.25/M input. That's driving a Ferrari to buy milk.
- Context accumulation. Every conversation turn sends the entire session history to the model. A session that's been running for a week might add 50,000 tokens of context to every new message. This is the #1 cost driver — responsible for 40–50% of typical bills.
- Uncontrolled heartbeats. The heartbeat runs on a schedule and touches the API every cycle. If you're running Opus heartbeats every 30 minutes, that's 48 API calls per day before you've typed a single message.
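The heartbeat arithmetic is easy to sanity-check. A minimal sketch, using the input prices from the model table below; the per-call token counts and the output prices are illustrative assumptions, not measurements:

```python
# Back-of-envelope heartbeat cost. Input prices match the model table;
# output prices and token counts per call are assumptions for illustration.
OPUS_IN, OPUS_OUT = 15.00, 75.00    # $ per million tokens
HAIKU_IN, HAIKU_OUT = 0.25, 1.25    # output price assumed

def monthly_cost(calls_per_day, in_tokens, out_tokens, price_in, price_out, days=30):
    """Dollar cost of a recurring API call over a month."""
    per_call = in_tokens / 1e6 * price_in + out_tokens / 1e6 * price_out
    return per_call * calls_per_day * days

# A heartbeat shipping ~2,000 input tokens and getting ~100 back, every 30 minutes:
opus = monthly_cost(48, 2_000, 100, OPUS_IN, OPUS_OUT)
haiku = monthly_cost(48, 2_000, 100, HAIKU_IN, HAIKU_OUT)
print(f"Opus:  ${opus:.2f}/month")   # Opus:  $54.00/month
print(f"Haiku: ${haiku:.2f}/month")  # Haiku: $0.90/month
```

Same schedule, same payload: the only variable is the model, and it moves the bill by a factor of 60.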
## Step 1 — Model Tiering (saves 50–80%)
The single most impactful change: stop using the same model for everything. Route cheap tasks to cheap models, reserve expensive models for tasks that actually need them.
### Model Cost Reference (2026-04-06)
| Tier | Model | Input cost/M tokens | Best for |
|---|---|---|---|
| Free | Local Ollama (Qwen 2.5, Llama 3.2) | $0 | Anything that can run locally; full privacy |
| Budget | Claude Haiku 4.5 | ~$0.25 | Heartbeats, status checks, quick questions |
| Budget | Gemini Flash | ~$0.30 | Same as Haiku; good for high-volume automation |
| Mid | Claude Sonnet 4.6 | ~$3 | Complex reasoning, code review, multi-step tasks |
| Mid | GPT-4.1 | ~$2 | General use; good balance of cost and quality |
| Premium | Claude Opus 4.6 | ~$15 | Hardest reasoning tasks only; use sparingly |
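The tiering principle reduces to a small dispatch table: classify the task, pick the cheapest adequate model. A sketch only; the Anthropic model IDs follow the config below, while the Gemini ID and the task categories are assumptions:

```python
# Illustrative tier router: map task kinds to the cheapest adequate model.
# Task category names and the "gemini/flash" ID are assumptions.
TIERS = {
    "heartbeat":      "anthropic/claude-haiku-4-5",
    "quick_qa":       "anthropic/claude-haiku-4-5",
    "automation":     "gemini/flash",
    "code_review":    "anthropic/claude-sonnet-4-6",
    "deep_reasoning": "anthropic/claude-opus-4-6",
}

def pick_model(task_kind: str) -> str:
    # Unknown tasks fall back to the cheap default, never the premium tier
    return TIERS.get(task_kind, "anthropic/claude-haiku-4-5")

print(pick_model("heartbeat"))       # anthropic/claude-haiku-4-5
print(pick_model("deep_reasoning"))  # anthropic/claude-opus-4-6
```

The key design choice is the fallback direction: anything unclassified lands on the budget tier, so a new task type can never silently burn Opus tokens.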
### Config: Set a Cheap Default with Escalation
```json5
{
  agents: {
    defaults: {
      // Haiku handles daily conversation; escalate to Sonnet only when needed
      model: {
        primary: "anthropic/claude-haiku-4-5",
        fallbacks: ["anthropic/claude-sonnet-4-6"]
      },
      // The models list is the allowlist for /model switching.
      // Users (or the agent itself) can escalate but must return explicitly.
      models: {
        "anthropic/claude-haiku-4-5": { alias: "Haiku (default)" },
        "anthropic/claude-sonnet-4-6": { alias: "Sonnet" },
        "anthropic/claude-opus-4-6": { alias: "Opus (expensive!)" }
      }
    }
  }
}
```
With this config in place, add an instruction like this to your `SOUL.md`: "You default to Haiku. For tasks requiring deep reasoning or long-context analysis, you may request Sonnet. You never switch to Opus without asking me first."
## Step 2 — Fix the Heartbeat (saves 20–40% for automation users)
The heartbeat pattern that works: use the cheapest model to do the status check, escalate to a real model only when the check finds something that needs attention.
```json5
{
  agents: {
    defaults: {
      heartbeat: {
        every: "30m",   // how often to check in
        target: "last"  // send to most recent session
      }
    }
  }
}
```
Then in your `HEARTBEAT.md` workspace file, write the heartbeat instruction to use a cheap model:

```markdown
# HEARTBEAT INSTRUCTIONS
You are running a lightweight status check, not a full session.
Use the cheapest available model (Haiku or equivalent).

Check these things:
1. Is disk usage below 80% on all partitions? (run: df -h)
2. Are there any ERROR lines in the last 50 lines of /var/log/app.log?
3. Is the gateway process still running?

If ALL checks pass: respond only with "HEARTBEAT_OK" — no other output.
If ANY check fails: escalate to Sonnet, alert me via Telegram with details.

Do NOT summarize, do NOT elaborate when everything is fine.
The goal is near-zero tokens when nothing is wrong.
```
A Haiku heartbeat that returns "HEARTBEAT_OK" costs a fraction of a cent; even at 48 runs/day that's still essentially free. An Opus heartbeat on the same schedule can cost $1+/day without you ever sending a message.
To disable heartbeats entirely for development:

```json5
{ agents: { defaults: { heartbeat: { every: "0" } } } }
```
## Step 3 — Manage Context Accumulation (saves 40–50%)
Every message in a session gets sent to the model on the next turn. A week-old session might have 50,000+ tokens of history arriving with every new request. Three fixes:
### 3a. Session resets on a schedule
```json5
{
  session: {
    reset: {
      mode: "daily",    // wipe history at a set time each day
      atHour: 4,        // 4 AM — when you're asleep anyway
      idleMinutes: 120  // also reset after 2h of inactivity
    }
  }
}
```
### 3b. Isolated sessions for cron jobs

Cron jobs that run on a schedule should never accumulate history. Use `--session isolated` in your heartbeat/cron task config so each run starts fresh:
```bash
# In HEARTBEAT.md or cron task definition:
# Use --session isolated so this task doesn't grow a long conversation history
openclaw run --session isolated "Check disk usage and return status"
```
### 3c. Shorten the system prompt

Your system prompt (`SOUL.md` + `AGENTS.md` contents) is injected on every message. A 5,000-word `SOUL.md` adds tokens to every single API call. Keep `SOUL.md` under 800 words, `AGENTS.md` under 1,200. Use `MEMORY.md` for facts that only need to load occasionally.
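To see why the word counts matter, here is the overhead arithmetic as a sketch. The ~1.3 tokens-per-English-word ratio is a common rule of thumb (an assumption, not an OpenClaw figure), and prices come from the model table in Step 1:

```python
# Rough monthly cost of a system prompt that is re-sent on every turn.
# ~1.3 tokens per English word is a rule-of-thumb assumption.
def prompt_overhead_usd(words, turns_per_month, price_in_per_m):
    tokens = words * 1.3
    return tokens / 1e6 * price_in_per_m * turns_per_month

# 5,000-word SOUL.md vs the recommended 800 words,
# at 600 turns/month on Sonnet ($3/M input):
big = prompt_overhead_usd(5_000, 600, 3.00)
small = prompt_overhead_usd(800, 600, 3.00)
print(f"5,000 words: ${big:.2f}/month")  # 5,000 words: $11.70/month
print(f"800 words:   ${small:.2f}/month")  # 800 words:   $1.87/month
```

The prompt itself does nothing new on turn 600; you are simply paying to re-send it, which is why trimming it compounds across every call.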
## Step 4 — Limit Tool Definitions
Every skill enabled for an agent adds its tool definition to the input tokens on every API call — even if you never use that skill in a given session. Each tool definition costs roughly 200–500 tokens.
```json5
{
  agents: {
    list: [
      {
        id: "main",
        // Only enable skills you actually use in daily conversation
        skills: ["weather", "daily-brief", "notes"]
        // Don't add github, himalaya, discord unless you need them every day
      },
      {
        id: "dev",
        skills: ["github", "shell", "skill-creator"]
        // Keep the heavy dev tools in a separate agent
      }
    ]
  }
}
```
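The same per-call overhead logic applies to skills. A sketch, assuming ~400 tokens per tool definition (within the 200–500 range above) and an illustrative call volume:

```python
# Each enabled skill adds its tool definition to the input of every API call.
# 400 tokens/skill is an assumed midpoint of the 200-500 range.
def skill_overhead_usd(n_skills, calls_per_month, price_in_per_m, tokens_per_skill=400):
    return n_skills * tokens_per_skill / 1e6 * price_in_per_m * calls_per_month

# 12 skills vs a trimmed 3, at 1,500 calls/month on Haiku ($0.25/M input):
print(f"12 skills: ${skill_overhead_usd(12, 1_500, 0.25):.2f}/month")  # $1.80/month
print(f" 3 skills: ${skill_overhead_usd(3, 1_500, 0.25):.2f}/month")   # $0.45/month
```

On Haiku the difference is small in absolute terms; on Sonnet or Opus the same unused tool definitions cost 12x to 60x more, which is why the heavy dev skills live in a separate agent.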
## Step 5 — Local Models via Ollama (cost = $0)
For heartbeats, status checks, and routine tasks that don't need cloud-model quality, a local Ollama model costs nothing beyond electricity.
```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a small, fast model
ollama pull llama3.2:3b  # 3B params — fast even on CPU
ollama pull qwen2.5:7b   # 7B params — better quality, needs 8GB RAM
```
```json5
// In openclaw.json — add Ollama as a provider
{
  agents: {
    defaults: {
      models: {
        "ollama/llama3.2:3b": { alias: "Local (fast)" },
        "ollama/qwen2.5:7b": { alias: "Local (quality)" },
        "anthropic/claude-haiku-4-5": { alias: "Haiku" }
      },
      model: {
        primary: "ollama/llama3.2:3b" // Use local for all routine tasks
      }
    }
  }
}
```
A 3B or 7B local model is good for: status checks, simple Q&A, reading files, running commands. It struggles with: complex multi-step reasoning, long-context analysis, nuanced writing. Route those to Haiku or Sonnet. The pattern: local model as gatekeeper, cloud model as specialist.
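The gatekeeper/specialist pattern looks roughly like this in code. Everything here is a stub for illustration: `ask_local` and `ask_cloud` are hypothetical stand-ins for real client calls, and the escalation criterion is deliberately simplistic:

```python
# Gatekeeper pattern: a free local model triages every task;
# a paid cloud model is called only when triage escalates.
# ask_local / ask_cloud are hypothetical stand-ins for real API clients.

def ask_local(prompt: str) -> str:
    # In practice: POST to a local Ollama server. Stubbed for illustration:
    # routine status prompts are handled locally, everything else escalates.
    return "HEARTBEAT_OK" if "status" in prompt else "ESCALATE"

def ask_cloud(prompt: str) -> str:
    # In practice: call the cloud provider (e.g. Sonnet). Stubbed here.
    return f"[sonnet] handling: {prompt}"

def run_task(prompt: str) -> str:
    verdict = ask_local(prompt)     # $0 path: local model tries first
    if verdict == "ESCALATE":
        return ask_cloud(prompt)    # paid call only when triage gives up
    return verdict

print(run_task("routine status check"))
print(run_task("analyze this 200-line diff"))
```

The money is saved in `run_task`: the cloud model never sees prompts the local model could answer, so the paid call count tracks hard tasks rather than total tasks.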
## What Savings Look Like in Practice
| Setup | Typical monthly cost |
|---|---|
| Default install, Opus everywhere, heartbeats enabled | $60–$200+ |
| After Step 1 only (Haiku default) | $8–$30 |
| After Steps 1–3 (tiering + context + heartbeat) | $3–$12 |
| With local Ollama for routine tasks | $1–$5 |
| Full local (Ollama only, no cloud) | $0 (electricity only) |
Numbers based on typical personal-assistant usage: ~20 conversations/day, heartbeats every 30 minutes, one daily cron digest. Heavy users will be higher; light users lower.
## Tracking Your Usage
- Run `session_status` to see tokens and model used per session.
- Check your provider console weekly — Anthropic and OpenAI both show per-day spend graphs.
- Add a daily cost check to your `HEARTBEAT.md`: ask the agent to check your spend via the provider API and alert if it exceeds a threshold.
- Set the hard provider limit (Step 0 above) as your backstop. Even if everything else fails, this cap saves you.
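The daily cost check boils down to a threshold comparison. A sketch only: `get_month_to_date_spend` is a placeholder to wire up to your provider's usage reporting (or a manual export), and the stubbed value is invented for illustration:

```python
# Daily spend check: alert when month-to-date spend crosses a threshold.
# get_month_to_date_spend is a hypothetical placeholder; connect it to
# your provider's usage reporting. The $7.42 stub is invented.
THRESHOLD_USD = 10.00

def get_month_to_date_spend() -> float:
    return 7.42  # stub value for illustration

def check_spend(threshold: float = THRESHOLD_USD) -> str:
    spend = get_month_to_date_spend()
    if spend >= threshold:
        return f"ALERT: ${spend:.2f} spent this month (threshold ${threshold:.2f})"
    return f"OK: ${spend:.2f} of ${threshold:.2f} budget used"

print(check_spend())  # OK: $7.42 of $10.00 budget used
```

Have the heartbeat run this once a day and stay silent on "OK", so the check itself costs near-zero tokens, consistent with the HEARTBEAT.md rules in Step 2.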