Published: 2026-06-13

Kimi K2.7 Code Inside Hermes: One-Prompt, End-to-End Agentic Coding


Fahd Mirza puts Moonshot AI's new Kimi K2.7 Code model through two real tasks driven by the Hermes agent. The model is a native multimodal mixture-of-experts design (~1T total parameters, ~32B active) with a 256k-token context window, and it always operates in thinking mode. In both tests it ran the full engineering loop from a single prompt: read the repo, plan, build, debug its own failures, verify with live commands, and clean up after itself.

Source video

"Kimi K2.7 Code + Hermes Agent - Clinically Certified to Be Insane" by Fahd Mirza (YouTube)

Key Takeaways

  • What Kimi K2.7 Code is: a multimodal MoE model (text, image, video input) with ~1 trillion total parameters and ~32B active, a 256k context window, and an always-on thinking mode that preserves reasoning across multi-turn conversations. It targets long-horizon, agentic coding.
  • How it's run here: not locally (the full model needs a multi-GPU cluster) but through Kimi's paid API, authenticated with an API key and driven by the Hermes agent. You can use any agent, including Moonshot's own Kimi CLI.
  • Test 1 — refactor a real app: against a 4,000+ file SSL-certificate monitoring repo, one prompt added a priority system, wired it into the live UI, and made the CLI configurable. The model read files in parallel, built an 8-step to-do list, hit a failing test (exit 1), debugged an off-by-one error in the threshold comparison, rewrote the assertions, re-ran until green, worked around a terminal backgrounding issue, and verified the API with live curl calls — in 7m47s with zero human intervention.
  • Test 2 — image to multilingual simulations: given an aerodynamics diagram (lift/drag, pressure plots, Reynolds data), it extracted the values and built five working physics simulations in five languages (Spanish, German, Arabic, Hindi, Portuguese), then used browser-use to open and visually check its own generated HTML.
  • The bigger point: Mirza frames this as the shift from "code creators" to "code directors" — your job becomes giving the goal, then verifying and testing what the agent returns. He considers K2.6 effectively retired and teases an Opus 4.8 comparison.
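
The off-by-one threshold bug from Test 1 is easy to picture with a small sketch. The code below is not from the video or the monitored repo: the function names, the priority labels, and the 7-day and 30-day thresholds are all hypothetical, and the video does not show which direction the real bug went. It only illustrates how a boundary comparison (`<` versus `<=`) decides which bucket a certificate lands in on the threshold day itself.

```python
from datetime import date

# Hypothetical thresholds in days until expiry (assumptions, not from the source).
CRITICAL_DAYS = 7
WARNING_DAYS = 30


def days_until(expiry: date, today: date) -> int:
    """Whole days from `today` to `expiry`; negative once the cert has expired."""
    return (expiry - today).days


def priority(days_remaining: int) -> str:
    """Classify a certificate by how many days remain before it expires."""
    if days_remaining < 0:
        return "expired"
    # Inclusive comparison: exactly CRITICAL_DAYS left still counts as critical.
    # Using `<` here would silently push the boundary day into "warning",
    # which is the kind of off-by-one a boundary test is meant to catch.
    if days_remaining <= CRITICAL_DAYS:
        return "critical"
    if days_remaining <= WARNING_DAYS:
        return "warning"
    return "ok"


if __name__ == "__main__":
    today = date(2026, 6, 13)
    for expiry in (date(2026, 6, 20), date(2026, 6, 21), date(2026, 7, 14)):
        remaining = days_until(expiry, today)
        print(expiry.isoformat(), remaining, priority(remaining))
```

Tests that pin both sides of each boundary (7 and 8 days, 30 and 31 days) are what make this class of bug visible, which matches the fix-the-assertions-and-re-run loop described in Test 1.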

Commands & Code Mentioned

# Verify the live SSL-monitoring API endpoint after the agent's changes.
# To exercise it, add a bad-SSL test domain (e.g. badssl.com) and watch the dashboard flag it.
curl http://localhost:8765/...
