Published: 2026-06-26

Ornith: Open Agentic Coding Models (9B–397B) for Fully Local Agents

Ornith, from a newer company called Deep Reinforce, is a family of open coding models (9B, 35B, and a 397B flagship) post-trained on top of Gemma and Qwen with a focus on coding, tool use, and agentic workflows. AICodeKing's hands-on take is that the smaller models are genuinely useful local agents, with the 9B the most interesting for home machines. The headline claim is that Ornith learns to build its own scaffold (how it plans, retries, recovers from errors, and uses tools), which matters far more for a coding agent than raw answer quality.

Source video

"Ornith (35B, 9B) + Hermes, Zed: THE FULLY PRIVATE LOCAL AGENT is ACTUALLY HERE!" by AICodeKing (YouTube)

Key Takeaways

  • A family, not one model. 9B and 35B run on local machines/servers; the 397B flagship is cloud-scale. All are post-trained on Gemma and Qwen, aimed at coding, tool use, and agentic loops.
  • The pitch is learned scaffolding. Ornith claims to learn how it structures the process — planning, retrying, handling mistakes, checking files, running tests — not just how to write the answer. For agents, a good loop around a smaller model can beat a strong model with a bad harness.
  • Reported benchmarks (Terminal-Bench / SWE-bench): 397B ≈ 77 / ≈ 82, near Opus-level in places; 35B ≈ 64 / ≈ 75; 9B ≈ 43 / ≈ 69, which is strong on paper for a 9B. The creator stresses that benchmarks aren't real-world usage, and Deep Reinforce says it counters reward hacking with a fixed environment, monitoring, and an LLM judge on top of the normal verifier.
  • Runtime setup matters. Ornith is a reasoning model that emits reasoning tags and tool-call blocks; if your runtime mishandles them you'll see broken tool calls, raw reasoning text, or failed steps. Use a recent runtime, the proper chat template, and — on vLLM or SGLang — the correct reasoning and tool-call parser from the model card.
  • Hands-on verdict: AICodeKing tested the 35B with OpenCode and Hermes Agent. It handles web search and tool calls cleanly, glitches far less than Qwen 3.6, and "looks like a straight-up gem" with Hermes. General chit-chat isn't its strength; it's built to be a solid everyday tool-calling assistant, and AICodeKing calls it one of the best local models right now.
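
To make the "learned scaffolding" point concrete, here is a minimal Python sketch of the kind of loop a harness normally supplies from the outside: propose, verify, and retry with the failure fed back. It is not Ornith's training method or any agent's real code. The names `run_agent_loop`, `stub_model`, and `check_add` are hypothetical, and the "model" is a stub that answers wrongly until it sees feedback.

```python
from typing import Callable, List, Tuple


def run_agent_loop(
    propose: Callable[[str, List[str]], str],
    verify: Callable[[str], Tuple[bool, str]],
    task: str,
    max_attempts: int = 3,
) -> Tuple[bool, str, List[str]]:
    """Ask for a candidate, verify it, and retry with the failure fed back."""
    feedback: List[str] = []
    candidate = ""
    for _ in range(max_attempts):
        candidate = propose(task, feedback)
        ok, message = verify(candidate)
        if ok:
            return True, candidate, feedback
        # Recovery step: the next proposal sees what went wrong.
        feedback.append(message)
    return False, candidate, feedback


def stub_model(task: str, feedback: List[str]) -> str:
    # Hypothetical stand-in for a model call: wrong first, fixed after feedback.
    if not feedback:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def check_add(source: str) -> Tuple[bool, str]:
    # Plays the role of "running tests": execute the candidate and check one case.
    namespace: dict = {}
    try:
        exec(source, namespace)
        result = namespace["add"](2, 3)
    except Exception as exc:
        return False, f"error: {exc!r}"
    if result != 5:
        return False, f"add(2, 3) returned {result}, expected 5"
    return True, "ok"


ok, final_source, history = run_agent_loop(stub_model, check_add, "write add(a, b)")
print(ok, history)
```

The stub fails once, receives the verifier's message, and succeeds on the second attempt. The same loop with no feedback channel would repeat the same mistake, which is the sense in which the loop around the model can matter as much as the model.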

Setup Notes Mentioned

# Easiest local route for the 9B/35B (GGUF quant):
#   - LM Studio or Ollama

# Serving via vLLM / SGLang:
#   - use a recent runtime + the model's chat template
#   - enable the recommended reasoning parser AND tool-call parser
#     (see the Ornith model card) so reasoning tags don't break tool calls

# Agents tested against it: OpenCode, Hermes Agent
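
As a rough illustration of what a reasoning parser and tool-call parser do, the Python sketch below separates reasoning text from tool-call blocks in raw model output. The tag names `<think>` and `<tool_call>` and the JSON payload shape are assumptions chosen for illustration; the actual format for Ornith is whatever its model card specifies. The function name `split_output` is hypothetical.

```python
import json
import re
from typing import Any, Dict, List, Tuple

# ASSUMPTION: illustrative tag names, not confirmed for Ornith.
THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)
TOOL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)


def split_output(raw: str) -> Tuple[str, List[Dict[str, Any]]]:
    """Return (user-visible text, parsed tool calls) from raw model output."""
    # Drop reasoning first so tool-call-like text inside it is never executed.
    without_reasoning = THINK_RE.sub("", raw)
    calls: List[Dict[str, Any]] = []
    for match in TOOL_RE.finditer(without_reasoning):
        try:
            calls.append(json.loads(match.group(1)))
        except json.JSONDecodeError:
            # A malformed block is skipped rather than passed to a tool.
            continue
    visible = TOOL_RE.sub("", without_reasoning).strip()
    return visible, calls


raw = (
    "<think>I should read the file first.</think>\n"
    "Let me check the file.\n"
    '<tool_call>{"name": "read_file", "arguments": {"path": "main.py"}}</tool_call>'
)
visible, calls = split_output(raw)
print(visible)  # Let me check the file.
print(calls)    # [{'name': 'read_file', 'arguments': {'path': 'main.py'}}]
```

When the runtime's parsers do not match the model's format, nothing performs this split, so the agent receives the whole raw string. That produces the symptoms described above: reasoning text shown to the user and tool calls that never execute.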
