pi-mono Deep Dive: The Minimalist Coding Agent for Operators (2026)
I started using pi-mono in April after Armin Ronacher's piece on it as the "minimal agent within OpenClaw." Across roughly six weeks of daily infra work, mostly Terraform plan diffs, k8s manifest edits, and shell-driven log triage, it took over most of the sessions where I previously reached for Claude Code first. The four-tool default (read, write, edit, bash) and JSONL-on-disk sessions made tasks like patching a broken Helm values file faster to drive than the larger Claude Code surface. This page is the write-up I wanted before I committed: the package surface, the default tools, the extension model, the security tradeoffs, and the tested configs.
TL;DR
Pick pi-mono if you want a coding agent whose entire surface area you can reason about in an afternoon and extend in TypeScript without forking. It fits engineers who already script their environment and want the agent to be one more composable thing on disk. Pick Claude Code if you want Anthropic to ship the tool list, the update cadence, and the integration surface, and you would rather wait for features than write them. Pick Cursor if your work happens inside an IDE and inline diff UI is on the critical path. Avoid pi-mono if you need built-in audit logs, RBAC, or a permission UI a security team can sign off on without reading TypeScript.
The mental model
Pi-mono is a TypeScript monorepo authored by Mario Zechner (badlogic on GitHub), MIT-licensed, at v0.73.0 as of 2026-05-04 (verified [2026-05-05] against https://github.com/badlogic/pi-mono). The five packages relevant to the coding-agent path all ship under the @mariozechner/ scope on npm: pi-ai (unified multi-provider LLM API), pi-agent-core (tool-calling and state), pi-coding-agent (the CLI most people mean when they say "pi"), pi-tui (terminal UI library with differential rendering), and pi-web-ui (chat web components). The repo's canonical package inventory lives in its README.
The mental model that matters: pi-coding-agent is a thin shell around a model and a small fixed pool of tools. By default, four are enabled: read, write, edit, bash. The built-in pool is slightly larger: grep, find, and ls are also available as tool options the operator can switch on (verified [2026-05-05] against https://github.com/badlogic/pi-mono/blob/main/packages/coding-agent/README.md). That is the entire built-in surface. There is no MCP layer, no sub-agent spawner, no plan mode, no permission popup, no background bash, no built-in todo list. Extras come from one of three places:
- CLI binaries on your $PATH, with a README the agent reads when you tell it to.
- TypeScript extensions installed into ~/.pi/agent/extensions/ or per-project .pi/extensions/.
- Skills, which are markdown files describing how to perform a task that the agent can load on demand.
The philosophical claim is "ask the agent to extend itself." In practice, that means: when you want a feature, you tell pi to write the extension, hand it a working directory, and let it ship code into your own config. That works because the primitive built-in tools are enough to author the rest.
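To make the CLI-plus-README path concrete, here is a hedged sketch. The tool name, paths, and README text are all hypothetical, not anything pi ships:

mkdir -p ~/tools/k8s-triage
cat > ~/tools/k8s-triage/README.md <<'MD'
k8s-triage: list pods not in Running/Succeeded state for a namespace.
Usage: k8s-triage <namespace>
MD
cat > ~/tools/k8s-triage/k8s-triage <<'SH'
#!/usr/bin/env bash
# Column 3 of `kubectl get pods --no-headers` is STATUS.
kubectl get pods -n "$1" --no-headers | awk '$3 != "Running" && $3 != "Succeeded"'
SH
chmod +x ~/tools/k8s-triage/k8s-triage
export PATH="$HOME/tools/k8s-triage:$PATH"

In a session you then say something like "read ~/tools/k8s-triage/README.md, then triage namespace prod." The agent pays for the README's tokens only on the turns that need it, which is the context-economy argument that recurs in the standards section below.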
The landscape
The coding-agent market in May 2026 has settled into roughly four buckets:
- IDE-integrated (Cursor, Windsurf, GitHub Copilot Chat). Editor-resident, GUI-first, polish over hackability.
- Vendor-paved CLIs (Claude Code, OpenAI Codex CLI). Curated tool sets, opinionated UX, closed-source agents on top of open SDKs. Anthropic and OpenAI ship these to anchor their model subscriptions.
- Open-source CLIs with strong defaults (Aider, Continue's CLI). Established, batteries-included, less malleable than their docs suggest.
- Minimalist + extensible CLIs (pi-mono, AMP). Small cores, explicit extension surfaces, run-from-anywhere ergonomics.
Pi-mono lives squarely in the fourth bucket. Its closest cousin is Aider, in spirit but not in shape: Aider is a Python-first agent with a clear opinion about how diffs work; pi is a TypeScript-first agent with a clear opinion about how agents work.
What actually matters operationally
Vendor pages emphasize features. After running pi-mono daily, here is what actually drives the decision for production work:
Tool surface footprint. Every tool you load into the agent's context costs tokens on every turn. Pi defaults to four (read, write, edit, bash). Claude Code's default tool set is larger and includes file ops, shell, search, and a few orchestration tools, plus whatever MCP servers and hooks the operator wires in. I have not run a controlled tool-schema-token diff between pi v0.73.0 and the current Claude Code release, so the precise delta on a given turn is something to measure in your own setup rather than a number to quote here. The directional point still holds: a smaller default palette is easier to reason about turn over turn, which is the property pi is optimizing for.
Where state lives. Pi-mono stores sessions as JSONL files on disk, with tree structure for branching. The /tree, /fork, and /compact commands operate on local files you can cat, jq, and back up. Claude Code stores sessions inside its own directory structure too, but pi's bias is "the file is the source of truth" rather than "a UI mediates the file."
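Because the source of truth is a file, inspection is ordinary shell work. A minimal sketch; the sessions path is the one referenced later on this page, and the per-line field names are assumptions to check against your own files:

ls ~/.pi/agent/sessions/
# One JSON event per line; look at the real shape before scripting against it.
head -n 3 ~/.pi/agent/sessions/<session>.jsonl | jq .
# Once field names are confirmed (role/content here are guesses, not a documented schema):
jq -r 'select(.role? == "assistant") | .content?' ~/.pi/agent/sessions/<session>.jsonl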
Provider portability. The pi-ai package abstracts a wide set of providers via API keys, including Anthropic, OpenAI, Azure OpenAI, Google Gemini and Vertex, Amazon Bedrock, DeepSeek, Mistral, Groq, Cerebras, Cloudflare, xAI, OpenRouter, Vercel, Hugging Face, Fireworks, ZAI, OpenCode, Kimi, MiniMax, and Xiaomi variants. Treat that list as a snapshot of what shipped in v0.73.0; the canonical, current list lives in the coding-agent README at https://github.com/badlogic/pi-mono/blob/main/packages/coding-agent/README.md and the providers doc, and pi adds providers often. Subscription credentials work for Claude Pro/Max, ChatGPT Plus/Pro, and GitHub Copilot, which is the underrated feature. Most operator setups already have one of these. Pi reuses them instead of demanding a fresh API key.
Permission model. Pi has no permission popup. The intentional position is: if you want sandbox safety, run it in a container or write an extension that gates bash. This is honest. It is also a security review nightmare in any org with a procurement process.
Extension cost. The honest framing is about what each tool lets you change. Claude Code is extensible at its integration surfaces: MCP servers, hooks, slash commands, settings, and wrapper scripts around the CLI. What you cannot do is fork the agent core, because it is closed. Pi inverts that. The agent core itself is TypeScript you can read, modify, and ship into your own config under .pi/extensions/. So the real choice is: are you extending around an opaque agent (Claude Code) or modifying the agent's own loop and tool surface (pi)? Both are legitimate. Which one wins depends entirely on whether your team writes TypeScript and how deep the change you want to make actually goes.
Lock-in. Pi is MIT, on npm, JSONL on disk, JSON config at known paths. If pi vanished tomorrow, your sessions are still grep-able and your extensions are still TypeScript. Cursor and Claude Code carry meaningfully heavier lock-in (proprietary agent code, vendor account binding, custom session formats).
Detailed teardowns
pi-mono (pi-coding-agent)
Position: minimalist + extensible CLI, the subject of this page.
Architecture: a TypeScript core that takes a model identifier, an authentication credential, and four tools, then runs a turn loop. Configuration is layered: ~/.pi/agent/settings.json global, .pi/settings.json per-project (overrides global), /settings interactive. Sessions are JSONL files arranged as trees so you can branch from any point without losing the parent thread.
Cost / scale / ops tradeoffs: per-token cost is whatever your underlying model provider charges, and pi adds zero gateway margin. Throughput is single-user single-machine. Pi is not a server; it is a CLI you run inside a terminal. There is no rate-limit pooling, no shared session storage, no team dashboards.
When it is the right call: solo or small-team operators who already script their dev environments, who want the agent to feel like one more ~/.config/ thing, and who would rather author a small TypeScript extension than file a feature request. Especially good for infrastructure work where the agent spends most of its time in bash and read and you want to know exactly what tools it can reach for. The actual extension footprint depends entirely on what the extension does: a thin bash allowlist wrapper is short, a real MCP-loading shim with auth and retries is not. I keep concrete reference extensions in the companion repo at https://github.com/MPIsaac-Per/agentinfra-examples so the size claim is checkable rather than rhetorical.
When it is the wrong call: regulated environments that need a permission UI, audit log, RBAC, and a vendor accountable for the agent's behavior. Pi has none of those by default and explicitly punts each to "build it yourself with extensions."
Install:
npm install -g @mariozechner/pi-coding-agent@0.73.0
export ANTHROPIC_API_KEY=sk-ant-...
pi
Pin the exact version. Pi ships often (latest release 2026-05-04, verified by reading the GitHub Releases page at https://github.com/badlogic/pi-mono/releases on 2026-05-05); latest will move under you. To check release cadence yourself, the canonical query is gh release list --repo badlogic/pi-mono for tagged GitHub releases, or npm view @mariozechner/pi-coding-agent versions for the published npm versions of this specific package, which is the count that actually matters when pinning.
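Both checks as runnable commands (the jq trim is convenience only):

gh release list --repo badlogic/pi-mono --limit 10
npm view @mariozechner/pi-coding-agent versions --json | jq '.[-5:]'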
Claude Code
Position: vendor-paved CLI, Anthropic's flagship developer surface for Claude.
Architecture: a closed-source agent on top of the public Anthropic SDK with an opinionated set of tools, Anthropic-managed updates, and tight integration with Claude subscription credentials.
Tradeoffs: zero setup friction, broader default tool set, Anthropic supports it. Trades extensibility for paved-path UX. You wait for Anthropic to ship features.
When right: teams that want Claude as a daily driver and don't want to think about agent internals. People who value "this is supported by the model lab" over "I can read every line."
When wrong: anyone who wants to fork behavior. The agent is not open source. You are at Anthropic's product cadence.
Cursor
Position: IDE-integrated, GUI-first.
Architecture: a fork of VS Code with the agent stitched into the editor. Multi-model under the hood, but the surface is the editor itself.
Tradeoffs: best-in-class for editor-resident workflows; weaker for terminal-heavy infra work where the editor is incidental.
When right: front-end and full-stack work with lots of jumping between files visually, junior engineers who benefit from inline UI affordances.
When wrong: ops, infra, or pipeline work where the agent's main job is running shell commands and reading logs. Pi-mono is a better workflow fit there because it lives in the terminal natively, the four-tool default keeps bash and read on the critical path, and there is no editor context to keep in sync. I have not run a head-to-head latency-and-tool-call benchmark across the two on infra tasks, so treat this as a fit claim, not a performance ranking.
Aider
Position: open-source CLI with strong defaults, Python-first.
Architecture: tighter coupling between agent and diff format. Excellent at producing minimal-diff commits and at git-aware sessions.
Tradeoffs: opinions ship pre-baked. The diff strategy is a strength when you trust it and a friction point when you want different behavior.
When right: Python operators who want a focused diff-and-commit agent. Reproducible scripted edits.
When wrong: when you want to mold the agent itself. Aider is configurable; pi is malleable. Different verbs.
The standards layer
There is no formal standard for coding-agent surfaces in mid-2026. MCP (Model Context Protocol) is the closest thing to a convention for tool exposure, with adoption across Claude Code, Cursor, Continue, and others. Pi-mono explicitly does not implement MCP server consumption in its default core. The maintainer's reasoning, paraphrased from the README: MCP loads tools into model context, which costs tokens on every turn and introduces a discovery problem. Pi prefers CLI binaries on $PATH with READMEs the agent loads on demand, only paying for context when the tool is actually needed.
This is a real philosophical fork, and the popular framing that Claude Code and Cursor "get MCP for free" is wrong. They get the protocol plumbing for free. Everything around it is still work. The MCP adoption checklist that actually shows up in operator life:
- Server auth. Every MCP server needs credentials. OAuth flows for SaaS, service accounts for internal APIs, mTLS for anything serious. None of that is in the protocol; it is on you.
- Secret handling. MCP server configs typically reference env vars or files. Where those secrets live, who can read them, and how they rotate is your problem, not the agent's.
- Approval controls. MCP tools can mutate production. Claude Code's tool-use prompts and Cursor's allowlists are partial mitigations, but per-tool approval policy across a team is still custom config you maintain.
- Token overhead. Every connected MCP server's tool list and schemas sit in context on every turn. Connect five chatty servers and a long session pays the tax 200 turns in a row (rough arithmetic after this list). The maintainer's objection to MCP is exactly this line item.
- Per-server reliability. When a SaaS MCP server flakes, the agent's tool call hangs or 500s mid-turn. You own the runbook for each server you bolt on.
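Back-of-envelope, with assumed numbers rather than measured ones: five servers exposing ten tools each, at roughly 150 tokens per tool schema, is about 7,500 tokens of standing context per turn; over a 200-turn session that is on the order of 1.5 million input tokens spent on tool schemas alone. Swap in your own schema sizes; the shape of the tax is the point.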
Pi's extension approach does not eliminate that work. It shifts it. Auth, secrets, and approval logic move into TypeScript you write and check into .pi/extensions/, which means the same problems exist with a different surface: code review instead of vendor config. The honest comparison is "MCP gives you a protocol and a library ecosystem at the cost of standing token overhead and config sprawl; pi gives you a malleable extension layer at the cost of writing the integration yourself." Pick the one whose failure mode you would rather operate.
If your stack standardizes on MCP for everything (databases, internal APIs, ticketing), Claude Code and Cursor make that bet cheaper to express. Pi requires either an MCP-loading extension or rewrapping MCP servers as CLI tools. Both paths are tractable. Neither is automatic.
OpenTelemetry semantic conventions for GenAI exist (gen_ai.* attributes) and the major model SDKs emit them. Pi-mono does not currently emit OTel spans from the agent loop in the default install. If you want pi traces in Langfuse, Braintrust, or Tempo, wire it via the extension layer or route model calls through a gateway like LiteLLM or OpenRouter that emits its own telemetry.
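One way to express that routing without touching pi's core is the same models.json shape Pattern 2 below uses, pointed at a local LiteLLM proxy instead of OpenRouter. The port and path assume LiteLLM's defaults at the time of writing; verify both, and the model id, against your own proxy config:

{
  "providers": {
    "litellm": {
      "api": "openai",
      "baseUrl": "http://localhost:4000/v1",
      "apiKey": "LITELLM_MASTER_KEY"
    }
  },
  "models": [
    {
      "id": "claude-sonnet-4-6",
      "provider": "litellm",
      "name": "Claude Sonnet (via LiteLLM proxy)"
    }
  ]
}

The proxy, not pi, then owns telemetry export and spend accounting.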
Things nobody talks about
1. The "no permission popup" stance is a procurement landmine.
Pi runs bash without asking. The defense is "use a container." That is correct engineering and incorrect for any org whose security review checklist contains the phrase "user confirmation required for destructive operations." If you plan to deploy pi to a team in a regulated environment, the answer is not "pi will add popups" but "you will write an extension that gates bash, ship it as a required project-level config, and document it for the security review." Budget a real day for this, not an hour.
2. Subscription auth is a feature until it isn't.
Pi can use your Claude Pro/Max or ChatGPT Plus subscription instead of an API key. This is genuinely lovely for solo cost control. It also means: rate limits hit on the consumer plan, not the API plan; usage shows up in the consumer billing surface, not the API console; and if Anthropic or OpenAI tightens subscription terms around programmatic access, your agent stops. I have not yet seen this break, but the surface is doing something the consumer plans were not originally pitched for. Treat the subscription path as a development convenience and the API path as the production answer.
3. Session-tree labels exist, but the discipline to use them does not.
Pi does ship built-in bookmarking on the session tree: Shift+L labels the current node, and /tree accepts a labeled-only filter to collapse the view down to bookmarked branches. The mechanism works. The gotcha is that nothing prompts a label at /fork time, so under deadline pressure I forked seven times in an afternoon and labeled zero of them. Three days later, finding the branch with the working migration meant grep-ing JSONL files under ~/.pi/agent/sessions/ for a remembered error string. Two related sharp edges worth flagging: labels are local-only with no team-sharing primitive (handing a useful branching tree to a teammate still means scp-ing JSONL files manually), and there is no built-in full-text search across sessions, so retrieval at the multi-week scale is whatever your shell skills are. Build the labeling habit on day one or ship a small extension that refuses /fork without a label argument. The tools are there; the defaults do not enforce them.
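The retrieval I ended up doing, with a made-up error string standing in for the real one:

grep -rl 'relation "orders" does not exist' ~/.pi/agent/sessions/ \
  | xargs -r ls -t | head -n 3
# -r: skip ls entirely when nothing matches (GNU xargs)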
4. Cost observability is your problem.
Pi prints token usage at the bottom of each turn. There is no built-in monthly view, no per-project dashboard, no cost alerting. If you care about spend, route model calls through OpenRouter or LiteLLM and read spend from the gateway's own accounting: both expose per-key spend, and pi alone does not. For solo work this is fine. For team rollouts it is not.
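For the OpenRouter path specifically, per-key spend is one call away. The endpoint path matched OpenRouter's docs when I last checked; treat it as an assumption and verify before scripting against it:

curl -s https://openrouter.ai/api/v1/auth/key \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" | jq '.data | {label, usage, limit}'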
5. Extension provenance has no signing or registry.
Extensions are TypeScript files you put on disk. There is no review, no signing, no central registry. If you copy an extension from a blog post, you copy whatever that extension does into your bash-capable agent's tool surface. Treat extension code with the same paranoia you would treat a curl | bash from the same source. This is consistent with pi's "you are responsible" posture, but operators newer to the agent space underestimate it.
Implementation patterns
Pattern 1: pinned global install + per-project model override
npm install -g @mariozechner/pi-coding-agent@0.73.0
mkdir -p .pi
cat > .pi/settings.json <<'JSON'
{
  "defaultProvider": "anthropic",
  "defaultModel": "claude-sonnet-4-6",
  "enabledModels": ["claude-sonnet-4-6", "claude-opus-4-7"]
}
JSON
export ANTHROPIC_API_KEY=sk-ant-...
pi
Use this when a repo has a clear "right model" and a curated set of models contributors are allowed to switch to. Per-project .pi/settings.json overrides the global file at ~/.pi/agent/settings.json. The keys here (defaultProvider, defaultModel, enabledModels) are the documented settings schema as of v0.73.0; verify against https://github.com/badlogic/pi-mono/blob/main/packages/coding-agent/docs/settings.md before pinning, since the project moves fast.
Tool surface is not a settings key. Restrict the active tool set at launch with the CLI flags: pi --no-builtin-tools to drop the four defaults entirely, or pi --tools read,write,edit,bash to pin an explicit allowlist. For repo-wide enforcement, ship a shell alias or a bin/pi wrapper script that calls the binary with the chosen flags, and have contributors source it from the project's direnv or mise config. That is the supported shape; embedding tools inside settings.json does nothing.
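The wrapper shape, as a sketch; the flags are the ones named above, everything else is plumbing:

#!/usr/bin/env bash
# bin/pi: repo-local wrapper that pins the tool allowlist. Resolve the real
# binary while skipping this wrapper itself to avoid infinite recursion.
self="$(realpath "$0")"
for candidate in $(type -ap pi); do
  [ "$(realpath "$candidate")" = "$self" ] && continue
  exec "$candidate" --tools read,write,edit,bash "$@"
done
echo "pi: real binary not found on PATH" >&2
exit 127

Put the repo's bin/ first on PATH with direnv's PATH_add bin (or the mise equivalent) so contributors pick it up without thinking about it.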
Pattern 2: route model calls through OpenRouter for telemetry and cost capture
Pi reads custom provider definitions from ~/.pi/agent/models.json (or .pi/models.json per project). The keys are case-sensitive and the API key field names the env var rather than interpolating it:
{
  "providers": {
    "openrouter": {
      "api": "openai",
      "baseUrl": "https://openrouter.ai/api/v1",
      "apiKey": "OPENROUTER_API_KEY"
    }
  },
  "models": [
    {
      "id": "anthropic/claude-sonnet-4.6",
      "provider": "openrouter",
      "name": "Claude Sonnet 4.6 (via OpenRouter)"
    }
  ]
}
Then in .pi/settings.json:
{
  "defaultProvider": "openrouter",
  "defaultModel": "anthropic/claude-sonnet-4.6"
}
(Per the note in Pattern 1, tool restriction is a CLI-flag concern, not a settings key, so there is no "tools" entry here.)
Set OPENROUTER_API_KEY in the environment before launching pi. Use this configuration when a single billing surface, per-key spend caps, and downstream telemetry matter. The cost is one extra hop and the gateway's margin. The win is observability pi does not natively provide. Verify the exact OpenRouter slug for the target model at https://openrouter.ai/models before pasting; slugs change. Also confirm the field names against the current pi-mono README, since this configuration shape is what shipped with v0.73.0 and the project moves fast.
Pattern 3: gated bash for shared environments
A 30-to-50-line TypeScript extension that wraps the default bash tool with an allowlist of commands and an interactive confirmation for everything else. Ship the extension at .pi/extensions/safe-bash.ts, mark it required in .pi/settings.json, and remove the raw bash tool from the active set. This is the production-acceptable shape of pi for environments that cannot run it sandboxed.
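A sketch of that shape. Everything about the registration surface here is an assumption, not pi's documented extension API; the allowlist-and-confirm logic is the part that transfers, and the working version lives in the companion repo below:

// .pi/extensions/safe-bash.ts -- SKETCH ONLY. The export shape below is a
// guess at pi's extension interface; check the coding-agent docs before copying.
import { execSync } from "node:child_process";
import * as readline from "node:readline/promises";

const ALLOWLIST = [
  /^git (status|diff|log)\b/,
  /^kubectl get\b/,
  /^terraform plan\b/,
];

async function confirm(command: string): Promise<boolean> {
  const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
  const answer = await rl.question(`Outside allowlist:\n  ${command}\nRun it? [y/N] `);
  rl.close();
  return answer.trim().toLowerCase() === "y";
}

// Hypothetical tool registration: replaces the raw bash tool with a gated one.
export const safeBash = {
  name: "bash",
  description: "Run a shell command; non-allowlisted commands require confirmation.",
  async execute(args: { command: string }): Promise<string> {
    const allowed = ALLOWLIST.some((re) => re.test(args.command));
    if (!allowed && !(await confirm(args.command))) {
      return "Command rejected by safe-bash policy.";
    }
    return execSync(args.command, { encoding: "utf8", timeout: 120_000 });
  },
};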
I keep working examples for all three patterns in the companion repo at https://github.com/MPIsaac-Per/agentinfra-examples. Pin the version in your own checkout; pi moves fast and the patterns drift with it.
Conclusion / decision framework
The decision tree I actually use:
- Does the work happen mostly in a terminal, on infra, against shells and APIs? If yes, pi-mono is the strongest fit. If no (IDE-resident frontend work), Cursor.
- Does your org need a permission UI, audit log, or RBAC out of the box? If yes, Claude Code in a managed Anthropic plan, not pi. Pi can be hardened to fit, but you will pay engineering time for it.
- Does the team write TypeScript comfortably? If yes, pi's extension surface is a multiplier. If no, you will not capture the value pi is built around, and Claude Code's paved defaults are a better trade.
- Are you running on consumer subscriptions or API keys? Pi is uniquely good at consumer subs. Claude Code is uniquely good at the Anthropic API path. Cursor is its own billing surface.
- Is lock-in a real concern for your team? Pi is the lowest-lock-in option of the four. Aider is close behind. Claude Code and Cursor carry more.
Where this space is headed, in my read: the minimalist-extensible bucket grows as more operators figure out that they want to own their agent. The vendor-paved bucket stays dominant for non-engineers and for orgs buying compliance. MCP becomes ambient infrastructure that even minimalist agents like pi end up consuming via extensions, because the cost of writing your own integration to every tool eventually exceeds the cost of accepting a small context overhead.
What I would bet on: pi-mono, or whatever its 2027 successor looks like, becoming the default for senior infra operators who already think of their dev environment as code. What I would not bet on: pi displacing Claude Code or Cursor in the broader market. Different audiences, different products, both correct.
If you are evaluating today, install v0.73.0, give it a week against work you actually do, and notice whether you reach for an extension on day three. If you do, pi is your tool. If you find yourself wishing it would just decide things for you, it is not, and that is fine. The right answer is the one that survives contact with your real work.