Published: 2026-06-26

OpenJarvis + Ollama: A Local AI Agent That Tracks Watts Per Query

Description: Fahd Mirza walks through OpenJarvis, Stanford's local-first AI agent framework, running on Ollama — install, presets, doctor health check, and a built-in energy benchmark.


Fahd Mirza picks up OpenJarvis, a local-first personal AI framework from Stanford's Hazy Research and Scaling Intelligence Lab, built as part of their "Intelligence per Watt" research, and wires it up to Ollama on a single-GPU Ubuntu box. The walkthrough covers the one-line install, the seven built-in presets, an OpenClaw-style doctor health check, and a benchmark that reports something most local-AI tools ignore: average power draw in watts and energy in joules per token, alongside speed.

Source video

"OpenJarvis + Ollama: Local AI Agent That Tracks Every Watt" by Fahd Mirza (YouTube)

Key Takeaways

Local-first by design. OpenJarvis assumes your personal AI should run on your own device, not someone else's cloud. It plugs straight into Ollama (and also supports vLLM, SGLang, and llama.cpp), reusing any models you've already pulled.
Five primitives. The framework is built around Intelligence (model selection and catalog), the Engine (the inference layer), Agents (single-turn chat up to multi-step reasoning and scheduled tasks), Memory (persistent, searchable storage), and a Learning system that records every interaction as a trace and uses it to improve routing over time.
Presets do the configuring for you. Seven ready-made presets ship in the box — including morning digest, deep research, and code assistant. Running jarvis init code-assistant swapped in a tuned prompt and harness, and produced a noticeably better script than the default agent on the same request.
Energy is a first-class metric. Per-query telemetry tracks token counts and tokens/sec, and jarvis bench run reports latency, throughput, and energy. In the demo that came to ~25 tokens/sec, ~278 W average draw, and ~10.9 joules per token, with $0 in API cost because everything ran locally.
Early but evolving fast. The project is new, so expect rough edges (a self-update nag kept appearing even right after install). A doctor command (a pattern popularized by OpenClaw) checks download status, the engine, and each primitive so you can confirm the setup before relying on it.
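
The energy figures in the benchmark takeaway are tied together by simple arithmetic: a watt is one joule per second, so dividing average power by throughput gives joules per token. The sketch below reproduces that relationship from the rounded demo numbers. The function names are hypothetical helpers for illustration, not part of OpenJarvis, and the 500-token reply is an assumed example size.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of OpenJarvis) for sanity-checking
# the benchmark's energy numbers.

# joules per token = average watts / tokens per second
joules_per_token() {
  awk -v w="$1" -v tps="$2" 'BEGIN { printf "%.2f\n", w / tps }'
}

# watt-hours for one reply = tokens * joules per token / 3600
watt_hours() {
  awk -v n="$1" -v jpt="$2" 'BEGIN { printf "%.2f\n", n * jpt / 3600 }'
}

joules_per_token 278 25   # rounded demo inputs; prints 11.12
watt_hours 500 10.9       # assumed 500-token reply; prints 1.51
```

The rounded inputs give 11.12 J/token, slightly above the reported ~10.9. That gap is consistent with the throughput being a little over 25 tokens/sec before rounding (278 / 10.9 is about 25.5).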

Commands & Code Mentioned

# Install is a single one-liner (uses uv under the hood to set everything up)

# Start an interactive chat with your default local model
jarvis

# Exit the chat
/quit

# Run a one-shot prompt against a specific local model
jarvis -m <model> "your prompt here"

# Health check — download status, engine, and all five primitives
jarvis doctor

# Initialize a ready-made preset (e.g. the code assistant)
# add --force to override an existing config
jarvis init code-assistant --force

# Benchmark latency, throughput, and energy use on your model
jarvis bench run

# Update OpenJarvis in place
jarvis self-update
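
The doctor and bench commands above can be chained so the benchmark only runs on a setup that passes the health check. This is a minimal sketch that uses only the commands listed here. It assumes, without confirmation from the video, that jarvis doctor exits non-zero when a check fails; the function name check_then_bench is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: run the health check before the benchmark.
# Assumption (not confirmed by the video): `jarvis doctor` exits
# non-zero when a check fails. If it always exits 0, read its
# output instead of relying on the exit status.
check_then_bench() {
  if ! command -v jarvis >/dev/null 2>&1; then
    echo "jarvis not found on PATH" >&2
    return 127
  fi
  if ! jarvis doctor; then
    echo "doctor reported a problem; skipping benchmark" >&2
    return 1
  fi
  jarvis bench run
}

# Usage: source this file, then run
#   check_then_bench
```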
