AI Agent Glossary
Plain-English definitions for every term you'll hit while working with AI agents — protocols (MCP), files (SOUL.md, HEARTBEAT.md), patterns (RAG, batching), and the security concepts you need to set up safely. 48 entries, cross-linked to the guides where each term appears.
Core concepts
AI agent
An AI system that runs autonomously, uses tools (shell, APIs, files), and pursues multi-step goals without a human in the loop for each action. Distinct from a chatbot, which only replies.
Autonomy
The degree to which an agent acts without human approval. Ranges from assistive (proposes every action) to fully autonomous (runs 24/7 via cron and heartbeats). Higher autonomy = more utility but more security risk.
Batching
Processing N items in one LLM call instead of N separate calls. The fixed per-call overhead (system prompt, instructions, tool list) is paid once per batch rather than once per item, and batching pairs well with prompt caching. Typical batches: 10 text items or 5 long transcripts per call.
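A minimal sketch of batching, assuming a made-up prompt layout. The helper names `chunk` and `build_batch_prompt` are hypothetical and not part of any agent framework. It groups items into fixed-size batches and builds one numbered prompt per batch, so the instruction text is sent once per batch rather than once per item.

```python
from typing import Iterator, List


def chunk(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batch_prompt(instruction: str, batch: List[str]) -> str:
    """Combine one instruction with a numbered list of items (hypothetical layout)."""
    numbered = "\n".join(f"{i}. {text}" for i, text in enumerate(batch, start=1))
    return f"{instruction}\n\nItems:\n{numbered}\n\nAnswer once per item, by number."


headlines = [f"headline {n}" for n in range(1, 24)]  # 23 toy items
prompts = [
    build_batch_prompt("Summarize each item in one line.", batch)
    for batch in chunk(headlines, 10)
]
print(len(prompts))  # 23 items at 10 per batch -> 3 calls instead of 23
```

Numbering the items gives the model a stable way to map each answer back to its input when the reply is parsed.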
Benchmark
A standardized test comparing agent or LLM performance — SWE-bench for coding, GAIA for tool use, HumanEval for code generation. Always check methodology before trusting a ranking; benchmarks are often gamed or overfit.
Claude Haiku
Anthropic's small, fast, cheap Claude model, several times cheaper per token than Sonnet (the exact ratio depends on the model generation). Ideal for batched routine work (summarization, classification, scraping) where speed and cost matter more than peak reasoning.
Claude Opus
Anthropic's flagship Claude model — best reasoning, highest cost. Reserved for hard tasks: complex refactors, long-context analysis, multi-step planning. Overkill for routine agent loops.
Claude Sonnet
Anthropic's mid-tier Claude model — strong reasoning at reasonable cost. The default model for most agent frameworks when quality matters but Opus is overkill.
Context window
The amount of text an LLM can process in one pass, measured in tokens. Larger windows (200k+) let agents keep more of a project in memory; smaller windows force summarization. Every token placed in the window is billed as input, so how much of it you fill directly affects cost per call.
Cron (scheduled task)
A job that fires on a time-based schedule (e.g. every day at 9am). Agent platforms use cron to run recurring skills — daily news digests, weekly benchmark sweeps, overnight cleanups — without human prompting.
Embedding
A fixed-length vector representing the meaning of a piece of text. Similar meanings → close vectors. Generated by a separate embedding model (e.g. text-embedding-3-small) and stored in a vector store for fast semantic search.
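To make "similar meanings → close vectors" concrete, here is a small sketch of cosine similarity, one common way to measure closeness. The three-dimensional vectors are invented for illustration; real embedding models return hundreds or thousands of dimensions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Toy vectors, made up for this example (not output from a real model).
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))  # True
```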
Fine-tuning
Continuing the training of a base LLM on your own labeled examples to specialize its behavior — different from prompt engineering (changing instructions) or RAG (providing context at inference time). Useful when you need a specific tone, format, or domain vocabulary the base model can't reliably hit via prompting alone. Available through the OpenAI API and some open-weight models; not available in ChatGPT or Claude Cowork. Costs 100×-1000× a regular inference call but is a one-time cost.
Gateway
A local proxy between your agent and one or more LLM providers. Used for request logging, provider failover, rate limiting, and prompt-cache sharing across multiple projects.
Hallucination
When an LLM generates confident but false content — made-up API signatures, fake citations, invented file paths. RAG, tool use (verify by running), and "read the source" patterns all reduce but don't eliminate hallucinations.
LLM (Large Language Model)
The underlying neural network an agent uses to reason and generate text — e.g. Claude, GPT-4o, Qwen. The agent framework (OpenClaw, Hermes, etc.) is the scaffolding around an LLM that gives it tools, memory, and goals.
Local model
An LLM running on your own GPU instead of a hosted API — via Ollama, vLLM, or llama.cpp. Eliminates per-token cost and keeps data private, at the price of GPU hardware and slower inference.
Memory
Persistent state an agent carries across sessions — user preferences, past decisions, project history. Stored in files (Hermes `memory/`, OpenClaw workspace) or a database. Distinct from context window, which resets each session.
Model pinning
Specifying an exact model version (e.g. `claude-haiku-4-5-20251001`) instead of a floating alias. Prevents silent behavior changes when the provider updates the alias. Pin anything running in production.
Prompt caching
Reusing a previously sent prompt prefix (system message, tool list, large context) at a fraction of the normal token cost. Anthropic's ephemeral cache has a 5-minute TTL by default. Caching a stable system prompt can cut routine-job cost by 60–90%.
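A rough cost sketch shows where a saving in the 60–90% range can come from. Every number here is an assumption for illustration: a price of $1 per million input tokens, cache writes billed at 1.25× and cache reads at 0.10× the normal input price, and all calls landing inside the cache TTL. Check your provider's current pricing before relying on any of them.

```python
def uncached_job_cost(prefix_tokens: int, fresh_tokens: int, calls: int,
                      price_per_mtok: float) -> float:
    """Input cost when the full prompt is billed at the normal rate on every call."""
    return (prefix_tokens + fresh_tokens) * calls * price_per_mtok / 1_000_000


def cached_job_cost(prefix_tokens: int, fresh_tokens: int, calls: int,
                    price_per_mtok: float, write_mult: float = 1.25,
                    read_mult: float = 0.10) -> float:
    """Input cost when the shared prefix is written once and read on later calls.

    The multipliers are assumed values, not quoted from any price list.
    """
    if calls < 1:
        raise ValueError("calls must be at least 1")
    per_token = price_per_mtok / 1_000_000
    first_write = prefix_tokens * per_token * write_mult
    later_reads = prefix_tokens * per_token * read_mult * (calls - 1)
    fresh = fresh_tokens * per_token * calls  # uncached part, billed normally
    return first_write + later_reads + fresh


# Hypothetical job: 8,000-token system prompt, 500 new tokens per call, 20 calls.
base = uncached_job_cost(8000, 500, 20, price_per_mtok=1.0)
cached = cached_job_cost(8000, 500, 20, price_per_mtok=1.0)
savings = 1 - cached / base
print(f"{savings:.0%}")  # 79%
```

The saving grows with the share of each prompt that is a stable prefix, and shrinks if calls are spaced so far apart that the cache expires between them.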
Provider
The company hosting the LLM an agent talks to — Anthropic, OpenAI, Google, Ollama Cloud, or a self-hosted local server. Most agent platforms let you swap providers without rewriting skills.
RAG (Retrieval-Augmented Generation)
Technique where the agent fetches relevant documents from a vector store or search index before answering, then grounds its response in them. Reduces hallucination and lets agents work over large corpora the LLM never saw during training.
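The retrieve-then-ground flow can be sketched in a few lines. This toy version ranks documents by shared words instead of embeddings, purely to stay self-contained; a real setup would query a vector store. The function names and the prompt wording are hypothetical.

```python
import string
from typing import List, Set


def words(text: str) -> Set[str]:
    """Lowercase word set with surrounding punctuation stripped."""
    return {w.strip(string.punctuation).lower() for w in text.split()} - {""}


def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    """Return the k documents sharing the most words with the query."""
    query_words = words(query)
    ranked = sorted(corpus, key=lambda doc: len(query_words & words(doc)), reverse=True)
    return ranked[:k]


def build_grounded_prompt(question: str, passages: List[str]) -> str:
    """Place retrieved passages ahead of the question so the answer can cite them."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the sources below. Say so if they are insufficient.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )


docs = [
    "The heartbeat interval is set in HEARTBEAT.md.",
    "API keys belong in a .env file.",
    "Git worktrees give one repo several working directories.",
]
question = "Where is the heartbeat interval set?"
top = retrieve(question, docs, k=1)
print(build_grounded_prompt(question, top))
```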
Rate limit (HTTP 429)
A cap on how many requests or tokens you can send in a given window (per minute, per day). When exceeded, the provider returns HTTP 429 'Too Many Requests', usually with a `Retry-After` header or a provider-specific reset header (for example `x-ratelimit-reset-requests`) telling you when to try again; exact header names vary by provider. Common causes: heartbeat firing too often, retry loops after errors, free-tier daily cap, or provider-wide throttling. Mitigations: provider fallback in config, exponential backoff on retries, cheaper models for routine tasks, or upgrading your plan tier.
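The exponential-backoff mitigation can be sketched as follows. `RateLimitError` is a hypothetical stand-in for whatever exception your client library raises on HTTP 429, and the base delay, cap, and jitter size are arbitrary choices. If the server supplies a wait time, the sketch prefers it over the computed delay.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a client library's HTTP 429 error."""

    def __init__(self, retry_after: Optional[float] = None) -> None:
        super().__init__("HTTP 429 Too Many Requests")
        self.retry_after = retry_after  # seconds suggested by the server, if any


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay doubles with each attempt (1s, 2s, 4s, ...) up to `cap`."""
    return min(cap, base * 2 ** attempt)


def call_with_backoff(fn: Callable[[], T], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `fn`, retrying on RateLimitError with exponential backoff plus jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError as err:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the error to the caller
            delay = err.retry_after if err.retry_after is not None else backoff_delay(attempt)
            sleep(delay + random.uniform(0, 0.1 * delay))  # jitter spreads out retries
    raise AssertionError("unreachable")


attempts = {"count": 0}


def flaky() -> str:
    """Fails twice with a rate-limit error, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError()
    return "ok"


# A no-op sleep keeps the demo instant; real code would use the default.
result = call_with_backoff(flaky, sleep=lambda seconds: None)
print(result, attempts["count"])  # ok 3
```

Capping the number of attempts matters as much as the delay: an unbounded retry loop is itself one of the common causes of rate limiting listed above.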
Skill
A scoped capability an agent can invoke — typically a folder containing a SKILL.md describing when to use it plus optional scripts and reference docs. Skills are the preferred modular unit in OpenClaw and Claude Cowork.
Subagent
A specialized agent spawned by a parent agent to handle a focused subtask with its own context window. Common pattern for long-running work: the parent orchestrates, subagents do focused reads/writes without bloating the parent's context.
SWE-bench
Benchmark testing whether an agent can resolve real GitHub issues by reading the repo, writing a patch, and passing the project's tests. The closest thing to a "can it ship code?" score. Agents are ranked by pass rate on 2,294 issues.
System prompt
The hidden instruction block sent at the top of every LLM call that sets persona, rules, and available tools. Agent frameworks auto-assemble it from files like SOUL.md. Caching it is the single biggest cost saver.
Token
The unit LLMs read and bill in — roughly 3/4 of a word. A 1000-word page is ~1300 tokens. Agent platforms charge per input + output token, so token discipline (batching, caching, targeted reads) is the main cost lever.
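The arithmetic above can be written out as a quick estimator. The 0.75 words-per-token ratio is the rule of thumb from this entry, and the prices in the example are placeholders rather than any provider's real rates; exact counts require the provider's own tokenizer.

```python
def estimate_tokens(word_count: int, words_per_token: float = 0.75) -> int:
    """Rough token estimate from a word count (rule of thumb, not a tokenizer)."""
    return round(word_count / words_per_token)


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_mtok: float, output_price_per_mtok: float) -> float:
    """Cost in dollars, given prices per million tokens."""
    return (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000


page_tokens = estimate_tokens(1000)
print(page_tokens)  # 1333, in line with the "~1300 tokens" figure

# Placeholder prices: $1 per million input tokens, $5 per million output tokens.
print(round(estimate_cost(page_tokens, 200, 1.0, 5.0), 6))  # 0.002333
```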
Tool use
The mechanism by which an LLM invokes external functions — shell commands, HTTP calls, file edits — described to it in a structured schema. Every modern agent platform is built on tool use.
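A minimal sketch of the tool-use loop's core step: the model emits a structured call, and the framework validates it and runs the matching function. The schema shape below is loosely modeled on the JSON Schema style that providers accept, but the exact field names (`parameters`, `arguments`, `handler`) are assumptions for this example, not any provider's wire format.

```python
import json
from typing import Any, Dict


def word_count(text: str) -> int:
    """Example tool: count whitespace-separated words."""
    return len(text.split())


# Hypothetical registry: the description and parameters would be shown to the
# model; the handler stays on the agent's side and is never sent anywhere.
TOOLS: Dict[str, Dict[str, Any]] = {
    "word_count": {
        "description": "Count the words in a piece of text.",
        "parameters": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
        "handler": word_count,
    }
}


def dispatch(tool_call_json: str) -> Any:
    """Validate a model-emitted tool call and run the matching handler."""
    call = json.loads(tool_call_json)
    tool = TOOLS.get(call["name"])
    if tool is None:
        raise KeyError(f"unknown tool: {call['name']}")
    args = call.get("arguments", {})
    missing = [key for key in tool["parameters"]["required"] if key not in args]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    return tool["handler"](**args)


result = dispatch('{"name": "word_count", "arguments": {"text": "agents call tools"}}')
print(result)  # 3
```

Rejecting unknown tool names and missing arguments before execution is a cheap guard against hallucinated calls.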
Vector store
Database optimized for similarity search over embeddings — the storage layer under most RAG setups. Examples: Chroma, LanceDB, Qdrant, Pinecone. Agents query it to find context relevant to the current task.
Protocols
MCP (Model Context Protocol)
Open standard that lets AI agents talk to external tools and data sources through a uniform server interface. Built by Anthropic, now supported by OpenClaw, Hermes, Claude Cowork, and IronClaw. Replaces per-tool integrations with one protocol.
Security
.env file
Plain-text config file that holds secrets (API keys, tokens, passwords) as KEY=value pairs. It is loaded at agent startup and should always be listed in .gitignore. If a .env is ever committed, rotate every key in it immediately.
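For illustration, a minimal parser for the KEY=value format might look like the sketch below. It handles only the simple cases (comments, blank lines, optional quotes); the key names and values are fake placeholders, and real projects usually rely on an existing loader rather than hand-rolled parsing.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if not separator:
            continue  # not a KEY=value line; ignore it
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Fake placeholder secrets, for illustration only.
sample = """
# LLM provider credentials
EXAMPLE_API_KEY=sk-test-123
EXAMPLE_BOT_TOKEN="abc=def"
"""
config = parse_env(sample)
print(sorted(config))  # ['EXAMPLE_API_KEY', 'EXAMPLE_BOT_TOKEN']
```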
API key
Secret credential that authenticates your agent to an LLM provider. Store in a .env file (never in code), rotate every 30 days, and never commit to a public repo. Keys are the #1 accidental leak vector in agent setups.
DM policy
Setting that controls who can send an agent direct messages (Telegram, Slack, etc.). Values are typically `open` (anyone), `allowlist` (only listed contacts), or `closed`. Always set to `allowlist` for agents with tool access.
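The three policy values map onto a simple check. This is a sketch of the decision logic only; the function name and the way senders are identified are assumptions, and each platform stores the setting in its own config format.

```python
from typing import Set


def accepts_dm(policy: str, sender: str, allowlist: Set[str]) -> bool:
    """Decide whether a direct message from `sender` should reach the agent."""
    if policy == "open":
        return True
    if policy == "allowlist":
        return sender in allowlist
    if policy == "closed":
        return False
    # Fail loudly on typos instead of silently falling back to a default.
    raise ValueError(f"unknown DM policy: {policy!r}")


trusted = {"alice", "bob"}  # hypothetical contact handles
print(accepts_dm("allowlist", "alice", trusted))    # True
print(accepts_dm("allowlist", "mallory", trusted))  # False
```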
Prompt injection
Attack where malicious instructions hidden in external content (a web page, email, file) get treated by the agent as user commands. The #1 security risk for any agent that reads untrusted input. Mitigations: allowlists, user confirmation for sensitive actions, sandboxed tool scopes.
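One of the mitigations above, user confirmation for sensitive actions, can be sketched as a gate that tracks where an instruction came from. The action names, the `"user"` / `"content"` labels, and the `confirm` callback are all hypothetical; this reduces the blast radius of an injection but does not detect one.

```python
from typing import Callable

# Hypothetical set of actions considered risky enough to gate.
SENSITIVE_ACTIONS = {"send_email", "delete_file", "run_shell"}


def authorize(action: str, requested_by: str, confirm: Callable[[str], bool]) -> bool:
    """Allow an action unless it is sensitive and came from untrusted content.

    requested_by is "user" for a direct instruction, or "content" for text the
    agent read from a web page, email, or file.
    """
    if action not in SENSITIVE_ACTIONS:
        return True
    if requested_by == "user":
        return True
    # Sensitive action triggered by external content: a human must approve it.
    return confirm(f"Content the agent read is asking to {action}. Allow?")


def deny_all(message: str) -> bool:
    """Stand-in for a real confirmation prompt; always refuses."""
    return False


print(authorize("send_email", "content", deny_all))  # False
print(authorize("send_email", "user", deny_all))     # True
print(authorize("read_file", "content", deny_all))   # True
```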
Sandbox
An isolated environment where an agent can run code, install packages, or execute commands without affecting the host system. Docker containers, Firejail, or a separate VM. Essential for agents that auto-execute shell commands.
Skill allowlist
A configuration file that limits which skills an agent may load — only entries on the list are permitted to run. Critical for security: a compromised skill from a public repo can't execute if it isn't allowlisted.
Compliance & privacy
DPA (Data Processing Agreement)
A contract between a customer and an AI vendor governing how the vendor processes customer data — what's collected, where it's stored, who can access it, retention policies, sub-processors, and breach notification. Required by GDPR for any EU customer; commonly required by US enterprise procurement too. ChatGPT Enterprise and Claude Cowork Enterprise both offer custom DPAs; Plus and Pro tiers do not. HIPAA-regulated organizations also need a BAA on top of the DPA.
SCIM (System for Cross-domain Identity Management)
Open protocol for automatically syncing user accounts between your identity provider (Okta, Azure AD, Google Workspace) and a SaaS tool. When an employee joins, SCIM auto-creates their account; when they leave, it auto-deprovisions. The #1 way to prevent the 'ex-employee still has access' security failure. ChatGPT Enterprise supports SCIM; Business tier doesn't.
OpenClaw-specific
Agent workspace
The directory an agent treats as its writable scratch space — typically `~/.openclaw/workspace/` or similar. Holds drafts, downloaded files, skill state. Clean it periodically; agents accumulate junk.
Heartbeat
A recurring self-trigger that keeps an agent alive between external events — the agent wakes on a schedule (e.g. every 15 minutes), checks its inbox/calendar/queues, and acts if anything changed. In OpenClaw this is configured via HEARTBEAT.md.
openclaw doctor
Built-in OpenClaw diagnostic command. Checks config, connected providers, skill directory health, and permission setup in one pass. Run it first when anything breaks.
SOUL.md
OpenClaw's top-level personality and policy file. Defines the agent's name, tone, defaults, and hard rules (e.g. "never send email after 10pm"). Loaded at every session start. The single most important file to back up before upgrades.
Platform features
Agent Mode (ChatGPT)
ChatGPT's autonomous task execution capability — it browses the live web, runs Python in a sandbox, fills forms, and chains tools across multiple steps to complete a goal from a single prompt. Available on Plus, Pro, Business, and Enterprise. Stops at irreversible actions (purchases, posting, sending) and asks for human confirmation. See our /chatgpt/agent-mode/ guide for full capabilities, limits, safety boundaries, and cost.
Custom GPT
A saveable, shareable specialized version of ChatGPT: you set the system prompt, attach knowledge files, configure tools and actions, and either keep it private or publish it to your workspace or the public GPT store. Custom GPTs are the closest ChatGPT analog to a 'skill' on OpenClaw. They don't get personal Memory entries (Memory is account-scoped, not GPT-scoped).
Temporary Chat (ChatGPT)
A ChatGPT conversation that isn't saved to history, doesn't load or write Memory entries, and isn't used for training. Useful for one-off sensitive tasks, testing 'no-memory' behavior, or anything you don't want shaping future responses. Set via the conversation menu in the ChatGPT UI. The conversation persists only as long as the tab is open.
Tools & runtimes
Git worktree
A Git feature that lets one repo have multiple working directories on different branches simultaneously. Agents use worktrees for parallel experiments — try a refactor in an isolated tree without polluting main.
Ollama
Popular runtime for local LLMs. One-command install, pulls open-weight models (Llama, Qwen, Mistral), exposes a local HTTP API at localhost:11434 that agent platforms can target as if it were a cloud provider.
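A short sketch of calling the local HTTP API using only the standard library. It targets Ollama's `/api/generate` endpoint with streaming disabled; the model name `llama3.2` is an assumption and must already be pulled on your machine. Building the request is separated from sending it so the payload can be inspected without a running server.

```python
import json
import urllib.request


def build_generate_request(prompt: str, model: str = "llama3.2",
                           host: str = "http://localhost:11434") -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint (not sent yet)."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{host}/api/generate",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "llama3.2") -> str:
    """Send the request and return the generated text (needs Ollama running)."""
    request = build_generate_request(prompt, model=model)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())["response"]


request = build_generate_request("Say hello in five words.")
print(request.full_url)  # http://localhost:11434/api/generate
```

Calling `generate("Say hello in five words.")` performs the actual round trip once the Ollama server is up.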
Ollama Cloud
Hosted version of Ollama — same open-weight models, rented GPU capacity, one API key. Cloud Pro ($20/mo) is a popular mid-tier option for agent workloads that want open models without buying hardware.
Services & providers
OpenRouter
A unified API gateway that routes requests to 500+ LLMs (Claude, GPT, Gemini, Kimi, Qwen, DeepSeek, and many open-weight models) through a single endpoint and credential. No per-model markup — a dollar of OpenRouter credit equals a dollar of provider usage. Used by Kilo Code natively, supported as a provider by OpenClaw and most other agent platforms. OpenRouter's coding-app leaderboard is the closest thing the industry has to a real-world agent usage ranking; we publish a monthly analysis at /news/openrouter-monthly/.