Field notes

Governed AI systems, from agent research to production work.

A working corpus for founders, operators, and technical leaders building production AI agents. These notes cover governance, operator-grade AI coding, and the controls needed to put AI into real workflows.


Agent-infrastructure
Enterprise Software
The Agent Harness Is The Product
The model is only one part of an AI product. The harness around it determines whether the system is reliable, governable, auditable, and usable in enterprise settings.
Michael Isaac, May 12, 2026
Agent-infrastructure
AI Governance
Enterprise Software
ISO-Ready Is Not A Badge. It Is An Engineering Posture.
ISO-ready software is not the same as certified software. It is software built with owners, evidence, controls, gaps, and audit discipline from the beginning.
Michael Isaac, May 11, 2026
Agent-infrastructure
AI Governance
Enterprise Software
Sub-Processor Control Is The Hidden AI Agent Runtime Problem
Model routing is not only an engineering convenience. In enterprise AI systems, provider choice becomes a governance and procurement control.
Michael Isaac, May 10, 2026
Agent-infrastructure
AI Governance
Enterprise Software
AI Coding Procurement Evidence Is The Metric That Matters
AI coding should not only be measured by task completion. The stronger metric is how quickly a feature becomes secure, explainable, and procurement-ready.
Michael Isaac, May 9, 2026
Agent-infrastructure
AI Governance
Enterprise Software
Enterprise Procurement Is The Real AI Coding Benchmark
The hard test for AI-augmented coding is not whether an agent can generate code. It is whether the resulting system can answer security, privacy, AI governance, audit, and procurement questions without falling apart.
Michael Isaac, May 8, 2026
Agent-infrastructure
Agentic-coding-research
Agentic Search Is Major, Not Half
Entire found search at 48.8% of public coding-agent tool calls. My Claude Code corpus lands at 30.4%-37.0%, depending on the Bash-search definition, which points to the same protocol with a narrower headline.
Michael Isaac, May 7, 2026
Agent-infrastructure
Agentic-coding-research
Codex Telemetry Shows the Agent Is a Runtime, Not a Chat Window
After filtering 95% stream-loop noise from 1.18M raw Codex spans, gpt-5.5 sessions show model wait at ~10x tool execution per turn, while gpt-5.4 is closer to balanced. The operator lever is context discipline, not raw tool speed.
Michael Isaac, May 7, 2026
Agent-infrastructure
Agentic-coding-research
Stuckness Is Where Agentic Coding Gets Expensive
In my Claude Code corpus, sessions with 10 or more explicit tool errors represented 12.1% of token-bearing sessions but 62.6% of token volume. The lesson is not "avoid errors." It is "interrupt stuckness."
Michael Isaac, May 7, 2026
Agent-infrastructure
Agentic-coding-research
The Agent Loop Map: Claude Code Is Mostly Search, Edit, Shell, Repeat
In 247,592 Claude Code tool events, 81.7% of tool-family transitions stayed inside the search/edit/shell core. The leverage is not just prompt engineering. It is loop-shaping.
Michael Isaac, May 7, 2026
Agent-infrastructure
Agentic-coding-research
Autonomy Has a Half-Life: What 247,592 Tool Calls Say About Claude Code Checkpoints
Real agentic coding is not infinite autonomy. In my Claude Code corpus, nonzero agent runs between human messages had a median of 5 tools and a p90 of 25. That points to a concrete checkpoint protocol.
Michael Isaac, May 7, 2026
Agent-infrastructure
Agentic-coding-research
Claude Code Verification Debt: The Agent Said Done, But Where Are the Receipts?
In 450,878 Claude Code assistant turns, 90.2% of completion-claim turns landed in a same-turn unverified candidate bucket. That does not mean the work was wrong. It means operators need receipts.
Michael Isaac, May 7, 2026
Agent-infrastructure
Claude Code SDK: What Operators Actually Need to Know (2026)
Operator-grade deep dive on the Claude Code SDK: architecture, tradeoffs, hidden costs, and when to pick it over LangGraph or rolling your own.
Michael Isaac, May 7, 2026
Agent-infrastructure
Claude Code CLI: What Operators Actually Need to Know (2026)
An operator's deep dive on Claude Code CLI: architecture, costs, agent SDK tradeoffs, and the production gotchas vendor docs don't cover.
Michael Isaac, May 7, 2026
Agent-infrastructure
Cursor AI: What Operators Actually Need to Know (2026)
An operator's deep dive on Cursor AI for production engineering teams: architecture, agent mode tradeoffs, lock-in risk, and the gotchas vendor pages skip.
Michael Isaac, May 7, 2026
Agent-infrastructure
Langfuse for Production AI Agents: What Operators Actually Need to Know (2026)
Deep dive on Langfuse: architecture, hosted vs self-hosted tradeoffs, OpenTelemetry posture, cost at scale, and the operational gotchas vendor pages skip.
Michael Isaac, May 7, 2026
Agent-infrastructure
LiteLLM in Production: Architecture, Tradeoffs, and Operational Reality (2026)
An operator-grade teardown of LiteLLM as a production gateway for AI agents: architecture, real failure modes, costs, and when to pick something else.
Michael Isaac, May 7, 2026
Agent-infrastructure
OpenClaw Deep Dive: Self-Hosted Multi-Channel Agent Gateway (2026)
OpenClaw runs a local agent gateway that bridges WhatsApp, Slack, iMessage, and 20+ other channels. Honest take on architecture, tradeoffs, and operational gotchas.
Michael Isaac, May 6, 2026
Agent-infrastructure
How to Use Claude Code (2026 Tutorial for Engineers)
A working engineer's tutorial for Claude Code. Install, auth, MCP, hooks, subagents, and the operational gotchas the docs skip.
Michael Isaac, May 6, 2026
Agent-infrastructure
What Is Claude Code? An Operator's Deep Dive (2026)
Claude Code is Anthropic's terminal-native coding agent. Here's how it actually works in production, what breaks, and when to pick it over Cursor or Aider.
Michael Isaac, May 6, 2026
Agent-infrastructure
How to install Claude Code on macOS, Linux, and Windows (2026)
Step-by-step install for Claude Code in 2026. Covers npm, native installer, WSL, auth, the gotchas the docs skip, and what breaks at scale.
Michael Isaac, May 6, 2026
Agent-infrastructure
Claude Code Pricing: An Operator's Invoice-Level Breakdown
First-person breakdown of Claude Code billing across Pro, Max, API, Bedrock, and Vertex, based on my own invoices and session logs.
Michael Isaac, May 6, 2026
Agent-infrastructure
Agentic-coding-research
fpk: F-Bombs Per Thousand. The Dev-Experience Metric You Didn't Know You Needed
I scanned 5 months of my own Claude Code conversation logs for f-bombs and correlated the rate with model and CLI version. The result was a surprisingly clean DX gradient, and a metric I'm only half-joking about.
Michael Isaac, May 5, 2026
Agent-infrastructure
pi-mono Deep Dive: The Minimalist Coding Agent for Operators (2026)
Operator-grade analysis of pi-mono: a 4-tool, MCP-free, aggressively-extensible coding agent. Architecture, tradeoffs, and what nobody tells you.
Michael Isaac, May 5, 2026
Agent-infrastructure
Codex vs Claude Code: Which Wins for Production Agent Work (2026)
I shipped production code with both Codex and Claude Code for six months. Honest comparison of cost, autonomy, plugin ecosystems, and the failure modes nobody mentions.
Michael Isaac, May 5, 2026
Agent-infrastructure
Claude Code Alternatives: What Operators Actually Need to Know (2026)
An operator-grade teardown of Claude Code alternatives in 2026. Cursor, Aider, Codex CLI, Cline, Windsurf, OpenHands, pi-mono. Architecture, costs, and the gotchas vendor pages skip.
Michael Isaac, May 5, 2026
Agent-infrastructure
OpenRouter Alternatives: What Operators Actually Need to Know (2026)
An operator's deep dive into OpenRouter alternatives for production AI agents: LiteLLM, AI Gateway, Portkey, direct providers, with real cost and ops tradeoffs.
Michael Isaac, May 5, 2026
Agent-infrastructure
OpenRouter Pricing: What Operators Actually Pay in 2026
An operator-grade teardown of OpenRouter's pricing model, hidden fees, and the production tradeoffs versus self-hosted gateways like LiteLLM.
Michael Isaac, May 5, 2026
Agent-infrastructure
Agentic-coding-research
What I Learned From 245,306 Claude Code Tool Calls
I analyzed 245,306 Claude Code tool calls across 113 days. The data says it's a Unix operator with an LLM loop, not a chat product with tools attached.
Michael Isaac, May 5, 2026
Agent-infrastructure
Claude Code Subagents: What Operators Actually Need to Know (2026)
An operator-grade teardown of Claude Code subagents in 2026: when separate context windows pay off, where they fail at scale, and the patterns I actually ship.
Michael Isaac, May 5, 2026
Agent-infrastructure
Claude Code Skills: What Operators Actually Need to Know (2026)
An operator-grade teardown of Claude Code Skills: the YAML+markdown trigger system, how it differs from agents and MCP, and the gotchas nobody talks about.
Michael Isaac, May 5, 2026
Agent-infrastructure
Claude Code MCP: What Operators Actually Need to Know (2026)
An operator-grade deep dive on Model Context Protocol in Claude Code: architecture, real failure modes, server tradeoffs, and the gotchas vendor docs skip.
Michael Isaac, May 5, 2026
Agent-infrastructure
Claude Code Hooks: What Operators Actually Need to Know (2026)
A senior engineer's deep dive on Claude Code hooks: event model, blocking semantics, real production patterns, and the operational gotchas vendor docs skip.
Michael Isaac, May 5, 2026
Agent-infrastructure
Claude Code vs Cursor: Which Wins for AI Agent Dev (2026)
I shipped agent infrastructure with both Claude Code and Cursor for six months. Honest verdict on which one wins for production agent dev work in 2026.
Michael Isaac, May 5, 2026
Agent-infrastructure
How to Set Up LiteLLM Proxy for Production AI Agents (2026)
A tested LiteLLM proxy setup walkthrough with Postgres-backed virtual keys, budgets, retries, and the operational gotchas the docs gloss over.
Michael Isaac, April 26, 2026
Agent-infrastructure
Agent Eval Pipelines: What Operators Actually Need to Know (2026)
Failure modes I have hit in production eval pipelines: scorer drift, trace sampling, PII redaction, and CI gates. With architecture, tradeoffs, and dated source notes.
Michael Isaac, April 26, 2026
Agent-infrastructure
Production Agent Observability: What Operators Actually Need to Know (2026)
An operator's deep dive on production AI agent observability: Langfuse, LangSmith, Braintrust, OpenTelemetry GenAI conventions, cost, lock-in, and gotchas.
Michael Isaac, April 26, 2026
Agent-infrastructure
LiteLLM vs OpenRouter: Which Wins for Production AI Agents (2026)
I ran both LiteLLM and OpenRouter in production agent stacks. Here's the honest comparison: pricing, lock-in, failure modes, and when neither is the right call.
Michael Isaac, April 26, 2026
Agent-infrastructure
Langfuse vs Braintrust: Which Wins for Agent Observability (2026)
I shipped agent observability on both Langfuse and Braintrust. Here's the honest breakdown of pricing, self-host reality, evals, and where each one cracks.
Michael Isaac, April 26, 2026
