Last updated: 2026-04-18

AI Agent Glossary

Plain-English definitions for every term you'll hit while working with AI agents — protocols (MCP), files (SOUL.md, HEARTBEAT.md), patterns (RAG, batching), and the security concepts you need to set up safely. 40 entries, cross-linked to the guides where each term appears.

Core concepts

AI agent

An AI system that runs autonomously, uses tools (shell, APIs, files), and pursues multi-step goals without a human in the loop for each action. Distinct from a chatbot, which only replies.

Autonomy

The degree to which an agent acts without human approval. Ranges from assistive (proposes every action) to fully autonomous (runs 24/7 via cron and heartbeats). Higher autonomy = more utility but more security risk.

Batching

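Processing N items in one LLM call instead of N separate calls. Spreads the fixed per-call overhead (system prompt, tool definitions) across the whole batch, so the saving grows with batch size, and combines well with prompt caching. Typical batches: 10 text items or 5 long transcripts per call.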
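A minimal sketch of the idea, assuming a summarization job; the helper names `chunked` and `build_batch_prompt` and the prompt wording are illustrative, not part of any framework.

```python
from typing import Iterator


def chunked(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batch_prompt(batch: list[str]) -> str:
    """Number each item so the model's answers can be matched back to inputs."""
    numbered = "\n".join(f"{i}. {text}" for i, text in enumerate(batch, start=1))
    return f"Summarize each entry in one sentence, keeping the numbering.\n\n{numbered}"


items = [f"article {n}" for n in range(1, 24)]  # 23 placeholder items
prompts = [build_batch_prompt(batch) for batch in chunked(items, size=10)]
print(len(prompts))  # 3 calls instead of 23
```

Numbering the items is a design choice: it gives the caller a stable way to split one response back into per-item results.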

Benchmark

A standardized test comparing agent or LLM performance — SWE-bench for coding, GAIA for tool use, HumanEval for code generation. Always check methodology before trusting a ranking; benchmarks are often gamed or overfit.

Claude Haiku

Anthropic's small, fast, cheap Claude model, several times cheaper than Sonnet (the exact ratio depends on the model generation; about 3× at Haiku 4.5 list prices). Ideal for batched routine work (summarization, classification, scraping) where speed and cost matter more than peak reasoning.

Claude Opus

Anthropic's flagship Claude model — best reasoning, highest cost. Reserved for hard tasks: complex refactors, long-context analysis, multi-step planning. Overkill for routine agent loops.

Claude Sonnet

Anthropic's mid-tier Claude model — strong reasoning at reasonable cost. The default model for most agent frameworks when quality matters but Opus is overkill.

Context window

The amount of text an LLM can process in one pass, measured in tokens. Larger windows (200k+) let agents keep more of a project in view at once; smaller windows force summarization. Every token placed in the window is billed as input, so window usage directly drives cost per call.

Cron (scheduled task)

A job that fires on a time-based schedule (e.g. every day at 9am). Agent platforms use cron to run recurring skills — daily news digests, weekly benchmark sweeps, overnight cleanups — without human prompting.
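In crontab syntax, "every day at 9am" is written `0 9 * * *`. The sketch below shows the same schedule computed by hand; `next_daily_run` is a hypothetical helper, not a function from any agent platform.

```python
from datetime import datetime, time, timedelta


def next_daily_run(now: datetime, at: time) -> datetime:
    """Return the next moment a once-a-day job should fire."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        # Today's slot has already passed, so schedule for tomorrow.
        candidate += timedelta(days=1)
    return candidate


now = datetime(2026, 4, 18, 10, 30)
print(next_daily_run(now, time(9, 0)))  # 2026-04-19 09:00:00
```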

Embedding

A fixed-length vector representing the meaning of a piece of text. Similar meanings → close vectors. Generated by a separate embedding model (e.g. text-embedding-3-small) and stored in a vector store for fast semantic search.
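"Close vectors" is usually measured with cosine similarity. The sketch below uses hand-made 3-dimensional vectors as stand-ins; real embedding models return hundreds or thousands of dimensions, and these numbers are invented for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, not output from a real embedding model.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))  # True
```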

Gateway

A local proxy between your agent and one or more LLM providers. Used for request logging, provider failover, rate limiting, and prompt-cache sharing across multiple projects.
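Provider failover can be sketched as trying each backend in order until one answers. The function name and the fake providers below are hypothetical; a real gateway would wrap actual HTTP clients and add logging and rate limiting.

```python
from typing import Callable

Provider = tuple[str, Callable[[str], str]]


def call_with_failover(providers: list[Provider], prompt: str) -> tuple[str, str]:
    """Return (provider_name, reply) from the first provider that succeeds."""
    errors = []
    for name, send in providers:
        try:
            return name, send(prompt)
        except Exception as exc:  # a real gateway would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def flaky(prompt: str) -> str:
    raise TimeoutError("primary timed out")


def backup(prompt: str) -> str:
    return f"echo: {prompt}"


used, reply = call_with_failover([("primary", flaky), ("backup", backup)], "ping")
print(used, reply)  # backup echo: ping
```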

Hallucination

When an LLM generates confident but false content — made-up API signatures, fake citations, invented file paths. RAG, tool use (verify by running), and "read the source" patterns all reduce but don't eliminate hallucinations.

LLM (Large Language Model)

The underlying neural network an agent uses to reason and generate text — e.g. Claude, GPT-4o, Qwen. The agent framework (OpenClaw, Hermes, etc.) is the scaffolding around an LLM that gives it tools, memory, and goals.

Local model

An LLM running on your own GPU instead of a hosted API — via Ollama, vLLM, or llama.cpp. Eliminates per-token cost and keeps data private, at the price of GPU hardware and slower inference.

Memory

Persistent state an agent carries across sessions — user preferences, past decisions, project history. Stored in files (Hermes `memory/`, OpenClaw workspace) or a database. Distinct from context window, which resets each session.

Model pinning

Specifying an exact model version (e.g. `claude-haiku-4-5-20251001`) instead of a floating alias. Prevents silent behavior changes when the provider updates the alias. Pin anything running in production.
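One way to enforce pinning is a config check that flags floating aliases. This sketch assumes pinned IDs end in an 8-digit date stamp, as in the example above; other providers use different version formats, and the job names and the `claude-sonnet-latest` alias are made up for illustration.

```python
import re

# Assumption: a pinned model ID ends with -YYYYMMDD.
PINNED_SUFFIX = re.compile(r"-\d{8}$")


def is_pinned(model_id: str) -> bool:
    return bool(PINNED_SUFFIX.search(model_id))


def unpinned_jobs(models: dict[str, str]) -> list[str]:
    """Return the names of jobs whose model ID is a floating alias."""
    return [job for job, model_id in models.items() if not is_pinned(model_id)]


config = {
    "daily-digest": "claude-haiku-4-5-20251001",
    "code-review": "claude-sonnet-latest",  # hypothetical alias
}
print(unpinned_jobs(config))  # ['code-review']
```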

Prompt caching

Reusing a previously sent prompt prefix (system message, tool list, large context) at a fraction of the normal token cost. Anthropic's ephemeral cache has a 5-minute TTL by default. Caching the system prompt can cut the input cost of routine jobs by 60–90%.
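A rough worked example of where the savings come from. It assumes cache writes are billed at 1.25× and cache reads at 0.1× the base input price (check your provider's current pricing), and that every call lands inside the TTL. The token counts are invented, and costs are in relative units rather than dollars.

```python
def relative_input_cost(
    prefix_tokens: int,
    unique_tokens: int,
    calls: int,
    write_mult: float = 1.25,  # assumed cache-write multiplier
    read_mult: float = 0.10,   # assumed cache-read multiplier
) -> tuple[float, float]:
    """Return (uncached, cached) input cost in base-price token units."""
    uncached = calls * (prefix_tokens + unique_tokens)
    first_call = prefix_tokens * write_mult + unique_tokens
    later_calls = (calls - 1) * (prefix_tokens * read_mult + unique_tokens)
    return uncached, first_call + later_calls


# 8,000-token shared prefix, 500 fresh tokens per call, 20 calls.
uncached, cached = relative_input_cost(8000, 500, 20)
savings = 1 - cached / uncached
print(f"{savings:.0%}")  # 79%
```

The saving shrinks as the per-call unique portion grows relative to the shared prefix, which is why long system prompts and tool lists benefit most.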

Provider

The company hosting the LLM an agent talks to — Anthropic, OpenAI, Google, Ollama Cloud, or a self-hosted local server. Most agent platforms let you swap providers without rewriting skills.

RAG (Retrieval-Augmented Generation)

Technique where the agent fetches relevant documents from a vector store or search index before answering, then grounds its response in them. Reduces hallucination and lets agents work over large corpora the LLM never saw during training.
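The retrieve-then-ground flow can be sketched in a few lines. To stay self-contained, this example ranks documents by shared words, which is a stand-in for real vector search; the corpus, function names, and prompt wording are all illustrative.

```python
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (stand-in for vector search)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_grounded_prompt(query: str, docs: list[str]) -> str:
    """Put the retrieved sources ahead of the question so the answer can cite them."""
    sources = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(docs, start=1))
    return (
        "Answer using only the sources below. "
        "Say so if they do not contain the answer.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )


corpus = [
    "The heartbeat schedule is configured via HEARTBEAT.md",
    "Git worktrees allow multiple working directories",
    "API keys belong in a .env file",
]
query = "where is the heartbeat schedule configured"
top = retrieve(query, corpus, k=1)
print(build_grounded_prompt(query, top))
```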

Skill

A scoped capability an agent can invoke — typically a folder containing a SKILL.md describing when to use it plus optional scripts and reference docs. Skills are the preferred modular unit in OpenClaw and Claude Cowork.

Subagent

A specialized agent spawned by a parent agent to handle a focused subtask with its own context window. Common pattern for long-running work: the parent orchestrates, subagents do focused reads/writes without bloating the parent's context.

SWE-bench

Benchmark testing whether an agent can resolve real GitHub issues by reading the repo, writing a patch, and passing the project's tests. The closest thing to a "can it ship code?" score. Agents are ranked by pass rate on 2,294 issues.

System prompt

The hidden instruction block sent at the top of every LLM call that sets persona, rules, and available tools. Agent frameworks auto-assemble it from files like SOUL.md. Caching it is the single biggest cost saver.

Token

The unit LLMs read and bill in, roughly 3/4 of a word. A 1000-word page is ~1300 tokens. LLM providers bill per input and output token, so token discipline (batching, caching, targeted reads) is the main cost lever.
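The 3/4-word rule gives a quick estimate when no tokenizer is at hand. This is a heuristic sketch only: real counts depend on the model's tokenizer, and the per-million-token prices passed to `estimate_cost` are placeholders, not any provider's actual rates.

```python
def estimate_tokens(text: str, words_per_token: float = 0.75) -> int:
    """Approximate token count from word count using the 3/4-word rule."""
    return round(len(text.split()) / words_per_token)


def estimate_cost(tokens_in: int, tokens_out: int,
                  price_in_per_mtok: float, price_out_per_mtok: float) -> float:
    """Cost of one call given prices per million input and output tokens."""
    return (tokens_in * price_in_per_mtok + tokens_out * price_out_per_mtok) / 1_000_000


page = "word " * 1000
tokens = estimate_tokens(page)
print(tokens)  # 1333

# Placeholder prices: 1.00 per million input tokens, 5.00 per million output tokens.
cost = estimate_cost(tokens, 200, price_in_per_mtok=1.00, price_out_per_mtok=5.00)
```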

Tool use

The mechanism by which an LLM invokes external functions — shell commands, HTTP calls, file edits — described to it in a structured schema. Every modern agent platform is built on tool use.
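A sketch of both halves of tool use: the schema the model sees and the dispatcher that runs the call it emits. The schema is shaped like the JSON Schema tool definitions that provider APIs commonly accept, but field names vary by provider; the `word_count` tool and the call format here are assumptions for illustration.

```python
import json


def word_count(text: str) -> int:
    return len(text.split())


# name -> (callable, schema shown to the model)
TOOLS = {
    "word_count": (
        word_count,
        {
            "name": "word_count",
            "description": "Count the words in a piece of text.",
            "input_schema": {
                "type": "object",
                "properties": {"text": {"type": "string"}},
                "required": ["text"],
            },
        },
    ),
}


def dispatch(tool_call_json: str):
    """Run the function named in a model-emitted tool call and return its result."""
    call = json.loads(tool_call_json)
    if call["name"] not in TOOLS:
        raise ValueError(f"unknown tool: {call['name']}")
    function, _schema = TOOLS[call["name"]]
    return function(**call["input"])


result = dispatch('{"name": "word_count", "input": {"text": "agents call tools"}}')
print(result)  # 3
```

Rejecting unknown tool names in the dispatcher matters: the model's output is untrusted text until it has been checked against the registry.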

Vector store

Database optimized for similarity search over embeddings — the storage layer under most RAG setups. Examples: Chroma, LanceDB, Qdrant, Pinecone. Agents query it to find context relevant to the current task.

Protocols

MCP (Model Context Protocol)

Open standard that lets AI agents talk to external tools and data sources through a uniform server interface. Built by Anthropic, now supported by OpenClaw, Hermes, Claude Cowork, and IronClaw. Replaces per-tool integrations with one protocol.

Security

.env file

Plain-text config file that holds secrets (API keys, tokens, passwords) as KEY=value pairs. Loaded at agent startup; it should always be listed in .gitignore. If a .env is ever committed, rotate every key in it immediately.
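A minimal sketch of how KEY=value pairs might be parsed at startup. Agent platforms typically use an existing loader for this; the parser below handles only the simple cases, and the variable names and values are placeholders.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


sample = """
# placeholder values, not real credentials
EXAMPLE_API_KEY=sk-placeholder
EXAMPLE_BOT_TOKEN="123:placeholder"
"""
secrets = parse_env(sample)
print(sorted(secrets))  # ['EXAMPLE_API_KEY', 'EXAMPLE_BOT_TOKEN']
```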

API key

Secret credential that authenticates your agent to an LLM provider. Store in a .env file (never in code), rotate every 30 days, and never commit to a public repo. Keys are the #1 accidental leak vector in agent setups.

DM policy

Setting that controls who can send an agent direct messages (Telegram, Slack, etc.). Values are typically `open` (anyone), `allowlist` (only listed contacts), or `closed`. Always set to `allowlist` for agents with tool access.
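The three policy values map to a simple gate in front of the agent. This is a sketch of the logic only; `may_message` and the sender handles are hypothetical, and real platforms read the policy from their own config format.

```python
def may_message(policy: str, sender: str, allowlist: set[str]) -> bool:
    """Decide whether a direct message from `sender` reaches the agent."""
    if policy == "open":
        return True
    if policy == "allowlist":
        return sender in allowlist
    if policy == "closed":
        return False
    # Fail loudly on typos rather than silently defaulting to open.
    raise ValueError(f"unknown DM policy: {policy!r}")


trusted = {"@dana"}
print(may_message("allowlist", "@dana", trusted))     # True
print(may_message("allowlist", "@mallory", trusted))  # False
```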

Prompt injection

Attack where malicious instructions hidden in external content (a web page, email, file) get treated by the agent as user commands. The #1 security risk for any agent that reads untrusted input. Mitigations: allowlists, user confirmation for sensitive actions, sandboxed tool scopes.

Sandbox

An isolated environment where an agent can run code, install packages, or execute commands without affecting the host system. Docker containers, Firejail, or a separate VM. Essential for agents that auto-execute shell commands.

Skill allowlist

A configuration file that limits which skills an agent may load — only entries on the list are permitted to run. Critical for security: a compromised skill from a public repo can't execute if it isn't allowlisted.

OpenClaw-specific

Agent workspace

The directory an agent treats as its writable scratch space — typically `~/.openclaw/workspace/` or similar. Holds drafts, downloaded files, skill state. Clean it periodically; agents accumulate junk.

Heartbeat

A recurring self-trigger that keeps an agent alive between external events — the agent wakes on a schedule (e.g. every 15 minutes), checks its inbox/calendar/queues, and acts if anything changed. In OpenClaw this is configured via HEARTBEAT.md.
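The wake, check, act cycle can be sketched as a loop over check functions. This illustrates the general pattern, not how OpenClaw implements it; the check names, the returned items, and the `max_ticks` cap are assumptions made so the example terminates.

```python
import time
from typing import Callable

Check = Callable[[], list[str]]


def tick(checks: dict[str, Check]) -> list[str]:
    """Run every check once and collect the work items it reports."""
    pending = []
    for name, check in checks.items():
        for item in check():
            pending.append(f"{name}: {item}")
    return pending


def run_heartbeat(checks: dict[str, Check], interval_seconds: float, max_ticks: int) -> None:
    """Wake on a fixed interval, act on anything pending, then sleep again."""
    for _ in range(max_ticks):
        for item in tick(checks):
            print("act on", item)
        time.sleep(interval_seconds)


checks = {
    "inbox": lambda: ["reply to Dana"],  # stand-in for a real inbox query
    "calendar": lambda: [],              # nothing changed
}
run_heartbeat(checks, interval_seconds=0, max_ticks=1)  # prints: act on inbox: reply to Dana
```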

openclaw doctor

Built-in OpenClaw diagnostic command. Checks config, connected providers, skill directory health, and permission setup in one pass. Run it first when anything breaks.

SOUL.md

OpenClaw's top-level personality and policy file. Defines the agent's name, tone, defaults, and hard rules (e.g. "never send email after 10pm"). Loaded at every session start. The single most important file to back up before upgrades.

Tools & runtimes

Git worktree

A Git feature that lets one repo have multiple working directories on different branches simultaneously. Agents use worktrees for parallel experiments — try a refactor in an isolated tree without polluting main.
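An agent would typically create the isolated tree with `git worktree add`. The sketch below builds that command and leaves execution to a separate function; the path and branch names are placeholders, and the base branch is assumed to be `main`.

```python
import subprocess


def worktree_add_command(path: str, branch: str, base: str = "main") -> list[str]:
    """Build `git worktree add -b <branch> <path> <base>`."""
    return ["git", "worktree", "add", "-b", branch, path, base]


def create_worktree(repo_dir: str, path: str, branch: str, base: str = "main") -> None:
    """Create the worktree; raises CalledProcessError if git reports a failure."""
    subprocess.run(worktree_add_command(path, branch, base), cwd=repo_dir, check=True)


cmd = worktree_add_command("../repo-refactor", "agent/refactor-try-1")
print(" ".join(cmd))  # git worktree add -b agent/refactor-try-1 ../repo-refactor main
```

When the experiment is finished, `git worktree remove <path>` deletes the extra working directory.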

Ollama

Popular runtime for local LLMs. One-command install, pulls open-weight models (Llama, Qwen, Mistral), exposes a local HTTP API at localhost:11434 that agent platforms can target as if it were a cloud provider.
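A sketch of a non-streaming request to Ollama's `/api/generate` endpoint using only the standard library. It assumes Ollama is running on the default port and that the named model has already been pulled; `llama3.2` is just an example tag. Only the request is built here, and `generate` is defined but not called, so the snippet runs without a server.

```python
import json
import urllib.request


def build_generate_request(prompt: str, model: str = "llama3.2",
                           host: str = "http://localhost:11434") -> urllib.request.Request:
    """Build a POST request for a single, non-streamed completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, **kwargs) -> str:
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(prompt, **kwargs)) as resp:
        return json.loads(resp.read())["response"]


req = build_generate_request("Say hello in five words.")
print(req.full_url)  # http://localhost:11434/api/generate
```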

Ollama Cloud

Hosted version of Ollama — same open-weight models, rented GPU capacity, one API key. Cloud Pro ($20/mo) is a popular mid-tier option for agent workloads that want open models without buying hardware.
