Published: 2026-06-26
Ornith: Open Agentic Coding Models (9B–397B) for Fully Local Agents
Chapters / key moments (click to jump — plays here on the page)
Ornith, from a newer company called Deep Reinforce, is a whole family of open coding models — 9B, 35B, and a 397B flagship — post-trained on top of Gemma and Qwen with a focus on coding, tool use, and agentic workflows. AICodeKing's hands-on take: the smaller models are genuinely useful local agents (the 9B is the most interesting for home machines), and the headline claim is that Ornith learns to build its own scaffold — how it plans, retries, recovers from errors, and uses tools — which matters far more for a coding agent than raw answer quality.
Source video
"Ornith (35B, 9B) + Hermes, Zed: THE FULLY PRIVATE LOCAL AGENT is ACTUALLY HERE!" by AICodeKing — Watch on YouTube →
Key Takeaways
- A family, not one model. 9B and 35B run on local machines/servers; the 397B flagship is cloud-scale. All are post-trained on Gemma and Qwen, aimed at coding, tool use, and agentic loops.
- The pitch is learned scaffolding. Ornith claims to learn how it structures the process — planning, retrying, handling mistakes, checking files, running tests — not just how to write the answer. For agents, a good loop around a smaller model can beat a strong model with a bad harness.
- Reported benchmarks: 397B ≈ 77 on Terminal-Bench / ≈ 82 on SWE-bench (near Opus-level in places); 35B ≈ 64 / ≈ 75; 9B ≈ 43 / ≈ 69 — strong on paper for a 9B. The creator stresses benchmarks aren't real-world usage, and Deep Reinforce says it fights reward-hacking with a fixed environment, monitoring, and an LLM judge on top of the normal verifier.
- Runtime setup matters. Ornith is a reasoning model that emits reasoning tags and tool-call blocks; if your runtime mishandles them you'll see broken tool calls, raw reasoning text, or failed steps. Use a recent runtime, the proper chat template, and — on vLLM or SGLang — the correct reasoning and tool-call parser from the model card.
- Hands-on verdict: tested the 35B with OpenCode and Hermes Agent — does web-search and tool calls cleanly, glitches far less than Qwen 3.6, and "looks like a straight-up gem" with Hermes. General chit-chat isn't its strength; it's built to be a solid everyday tool-calling assistant, and AICodeKing calls it one of the best local models right now.
Setup Notes Mentioned
# Easiest local route for the 9B/35B (GGUF quant):
# - LM Studio or Ollama
# Serving via vLLM / SGLang:
# - use a recent runtime + the model's chat template
# - enable the recommended reasoning parser AND tool-call parser
# (see the Ornith model card) so reasoning tags don't break tool calls
# Agents tested against it: OpenCode, Hermes Agent





