Published: 2026-04-17

Qwen 3.6 + OpenClaw: Full Agentic Coding Locally for Free

Fahd Mirza connects OpenClaw to a locally-served Qwen 3.6-35B mixture-of-experts model (vLLM backend, NVIDIA H100, zero API cost) and gives it a single prompt: build a full industrial dashboard for a cement plant. The result — a working React + Vite + TypeScript app with autonomous bug detection, browser control, and Git management — demonstrates what the agentic loop looks like when there's no cloud metering constraining how far the agent runs.

Source video

"Qwen3.6-35B-A3B + OpenClaw - Agentic Coding Locally for Free" by Fahd Mirza · Watch on YouTube →

Security note

As Fahd demonstrates in the video, OpenClaw has full control of the system: it can clear browser caches, restart applications, and execute arbitrary shell commands. Run it only on machines and with accounts you are comfortable giving an agent unrestricted access to. See OpenClaw Security for setup recommendations.

Key Takeaways

  • OpenClaw supports custom model backends via the vLLM provider option at setup. With a locally-served model there is no API key, no per-token cost, and no rate limits — the agent runs as long as the GPU has memory and compute. This dramatically changes which autonomous agentic sessions are economically viable.
  • The setup process: install OpenClaw → select Quick Start → choose vLLM provider → enter local server address → specify model name → skip channels and non-essential skills → done. Fahd completes this in under 5 minutes on a fresh Ubuntu system.
  • The task prompt was designed to test the full agentic loop, not just code generation: "create directories, run shell commands, execute npm install, start a live server, manage Git." The question was whether Qwen 3.6, through OpenClaw's harness, could operate autonomously on a real Linux system and deliver a running application from a single instruction.
  • OpenClaw built the dashboard, hit a CSS rendering error, and without human instruction: detected the failure, cleared the browser cache, restarted the browser, fixed the stylesheet, and reloaded the page. This is the autonomous debugging loop — the agent treats failed renders as feedback and iterates until it passes visual inspection.
  • Final output: a dark-theme industrial dashboard with live sensor data simulation, alert management, production statistics, and threshold controls — plus a generated README and test files — all from one prompt, followed by two manual corrections for layout.
  • Hardware requirement: an NVIDIA H100 with 80 GB of VRAM (rented via Mass Compute in the video). Smaller models need less VRAM, but the quality of autonomous reasoning drops sharply at smaller model sizes; around 16 GB of VRAM is a practical floor for useful agentic work.
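A rough rule of thumb behind these VRAM figures — a sketch only, counting weights and ignoring the KV cache, activations, and framework overhead that real serving adds on top:

```python
def weight_vram_gib(params_billion: float, bytes_per_param: float = 2.0) -> float:
    """Approximate GPU memory needed just to hold the model weights.

    bytes_per_param: 2.0 for fp16/bf16, 1.0 for int8, ~0.5 for 4-bit quantization.
    Ignores KV cache, activations, and serving-framework overhead.
    """
    return params_billion * 1e9 * bytes_per_param / 1024**3

# A 35B-parameter model in bf16 needs roughly 65 GiB for weights alone,
# which is why an 80 GB H100 is a comfortable but not extravagant fit.
print(f"{weight_vram_gib(35):.1f} GiB")
```

By the same arithmetic, a 4-bit-quantized model in the 20–30B range lands near the 16 GB floor mentioned above, which is consistent with that being the practical minimum for agentic work.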

The Local Model Workflow

Fahd's setup: the Qwen 3.6-35B-A3B model is served via vLLM on the same Ubuntu machine where OpenClaw is installed. OpenClaw treats the local vLLM endpoint exactly as it would treat the Anthropic API — the agent loop, tool calls, file management, and terminal execution are all identical. Only the model and cost profile change.
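vLLM exposes an OpenAI-compatible HTTP API, which is presumably what lets OpenClaw address the local endpoint exactly like a hosted provider. A minimal sketch of the wire format involved — the model name and local address match the video's setup but are assumptions here, and the helper is illustrative, not OpenClaw internals:

```python
import json

def chat_request(model: str, prompt: str) -> dict:
    """Build the JSON body for a POST to {base_url}/v1/chat/completions,
    the OpenAI-compatible route that vLLM serves locally."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

body = chat_request("Qwen3.6-35B-A3B", "Build a cement-plant dashboard.")
# Any OpenAI-style client can send this to the local server, e.g.:
#   curl http://localhost:8000/v1/chat/completions \
#        -H 'Content-Type: application/json' -d "$(cat body.json)"
print(json.dumps(body, indent=2))
```

Because the request shape is identical to a hosted API's, the agent harness never needs to know whether the model behind the URL is local or remote.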

The implication: anything you build or learn with Claude-backed OpenClaw transfers directly to a local model setup. The agent harness is the constant; the model is a swappable component. If API costs are a bottleneck on your agentic workload, this is the architecture to evaluate.
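The "harness is the constant, model is swappable" point can be sketched as a loop that only ever sees the model as a callable, with failures fed back as new context — the same act-check-retry shape as the autonomous debugging episode in the video. All names here are illustrative, not OpenClaw's actual internals; a stub stands in for either a hosted or a local model:

```python
from typing import Callable

# To the harness, the model is just a function: prompt in, proposed action out.
Model = Callable[[str], str]

def agent_loop(model: Model, task: str, check: Callable[[str], bool],
               max_steps: int = 5) -> str:
    """Act, check the result, feed the failure back, repeat until the check
    passes. A failed attempt is treated as feedback, not a terminal error."""
    prompt = task
    for _ in range(max_steps):
        action = model(prompt)
        if check(action):
            return action
        prompt = f"{task}\nPrevious attempt failed: {action!r}. Fix and retry."
    raise RuntimeError("gave up after max_steps")

# Stub model: fails once (broken stylesheet), then produces a fix.
attempts = iter(["broken css", "fixed css"])
result = agent_loop(lambda p: next(attempts), "render dashboard",
                    check=lambda a: a == "fixed css")
print(result)  # fixed css
```

Swapping Claude for a local Qwen endpoint changes only the `model` callable; the loop, tool dispatch, and retry policy are untouched, which is why skills transfer between backends.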

Related on OpenClawDatabase

See also: OpenClaw guide
