<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Amit Shafnir</title><description>Thoughts on security, AI agents, and automation.</description><link>https://shafnir.com/</link><item><title>Stop Building God-Mode Agents</title><link>https://shafnir.com/blog/stop-building-god-mode-agents/</link><guid isPermaLink="true">https://shafnir.com/blog/stop-building-god-mode-agents/</guid><description>Why right-sized AI agents with validators beat one monolithic agent with all the permissions. Architecture patterns for secure, high-quality agent systems.</description><pubDate>Tue, 07 Apr 2026 00:00:00 GMT</pubDate><content:encoded>import DiagramSvg from &apos;../../src/components/DiagramSvg.astro&apos;;

One agent. Every tool. Full file system access, terminal, network, git. That&apos;s not an assistant, that&apos;s a liability.

Most AI agent setups today follow the same pattern: spin up one agent, hand it every tool available, write a vague system prompt, and hope it figures things out. It works for demos. It falls apart in production.

## The god-mode problem

When a single agent has access to everything, four things break at once.

**Security.** The blast radius is your entire system. If that agent gets prompt-injected through a malicious code comment, a poisoned dependency, or a crafted README, the attacker inherits every permission the agent has. File system, terminal, network, secrets. All of it.

This isn&apos;t theoretical. A Replit AI agent [deleted a live production database](https://fortune.com/2025/07/23/ai-coding-tool-replit-wiped-database-called-it-a-catastrophic-failure/) during a code freeze, destroying data for over 1,200 executives, then fabricated fake records to cover it up. Researchers uncovered [over 30 vulnerabilities in AI coding tools](https://thehackernews.com/2025/12/researchers-uncover-30-flaws-in-ai.html) including GitHub Copilot, Cursor, and Codex CLI, with prompt injection flaws leading to arbitrary code execution. And the attack surface keeps growing: OWASP&apos;s Agentic AI threat analysis documents [real-world attacks from slopsquatting to poisoned AI assistants](https://www.bleepingcomputer.com/news/security/the-real-world-attacks-behind-owasp-agentic-ai-top-10/) leaking credentials and injecting malicious code.

**Quality.** An agent juggling 15 tools and 5 goals produces mediocre results. Its context window fills up with irrelevant tool descriptions and competing instructions. The model loses focus. It confuses which job it&apos;s doing. You ask it to review code and it starts refactoring. You ask it to check security and it starts optimizing performance.

**Stability.** One bad tool call cascades through the entire chain. The agent runs a destructive command, the state is corrupted, and every subsequent step builds on a broken foundation. No isolation means no containment.

**Debugging.** When something breaks in a 12-step chain where the agent had access to everything, good luck tracing which step went wrong and why. The logs are a wall of tool calls with no clear boundary between tasks.

## Right-sized agents, not more agents

The fix isn&apos;t splitting one agent into ten micro-agents. That&apos;s just a different kind of mess. You get context fragmentation (each agent only sees a slice of the picture), orchestration overhead (now you&apos;re debugging the routing logic), and cost multiplication (every agent call is an LLM inference).

The fix is **right-sizing**: one agent per domain, with the minimum tools and context it needs.

Each agent gets:

- **One clear domain.** Not one micro-task, but one coherent area of responsibility. &quot;Review code&quot; is a domain. &quot;Check if line 47 has a semicolon&quot; is not.
- **Minimal tool set.** Only the tools the domain requires. A code review agent needs file read access and maybe an AST parser. It doesn&apos;t need terminal access, network calls, or git push.
- **Focused prompt.** Specific goals, specific output format, specific constraints. Not a kitchen-sink system prompt that tries to cover every scenario.
- **Dynamic skills.** Instead of building separate agents for every specialization, one agent loads the right context for the specific case. A code review agent pulls TypeScript guidelines for `.ts` files and Python conventions for `.py` files. Same agent, different expertise, loaded on demand.

The skill system is the key insight that keeps agent count low. You don&apos;t need a TypeScript review agent and a Python review agent and a Go review agent. You need one review agent that dynamically loads language-specific skills based on the files in the diff.
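The loader itself can stay tiny. A minimal sketch, assuming a simple extension-to-skill mapping; the registry and skill names are illustrative, not a real API:

```typescript
// Hypothetical skill registry: file extension to review-skill name.
// Real skills would be prompt/context bundles, not just strings.
const SKILLS: { [ext: string]: string } = {
  ".ts": "typescript-review-guidelines",
  ".py": "python-conventions",
  ".go": "go-style-checks",
};

// Given the files in a diff, load only the skills that diff needs.
function skillsForDiff(files: string[]): string[] {
  const needed: string[] = [];
  for (const file of files) {
    const dot = file.lastIndexOf(".");
    if (dot === -1) continue; // extensionless files load no skill
    const skill = SKILLS[file.slice(dot)];
    if (skill) {
      if (!needed.includes(skill)) needed.push(skill);
    }
  }
  return needed;
}
```

Same agent, different context: a diff touching only `.ts` files never pulls Python conventions into the prompt.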

## When to add another agent

Not every step needs its own agent. More agents means more cost, more latency, more infrastructure. Add a second agent only when:

- **The domain is genuinely different.** Security analysis and infrastructure management are different domains with different tools and different expertise. They belong in separate agents.
- **The tool sets have zero overlap.** If two agents would share most of the same tools, they probably should be one agent.
- **You need isolation for blast radius reasons.** An agent that can delete cloud resources should not be the same agent that reads untrusted user input.

Two real-world patterns that illustrate the trade-off:

### Code review pipeline

One agent reviews the full PR diff (up to 4,000 lines of changes) and rates every issue by severity: low, medium, or high. Only when high-severity issues are found does a second validation agent spawn to verify them. The final report surfaces only confirmed high-impact issues.

No noise. No false positives. No &quot;you forgot a semicolon&quot; spam. The review agent dynamically loads language-specific skills (TypeScript patterns, Python conventions, security guidelines) based on the files in the diff. One agent handles 90% of cases; the second exists only to verify critical findings.
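The severity gate is plain code, not model judgment. A minimal sketch, where `verify` stands in for the spawned validation agent:

```typescript
type Severity = "low" | "medium" | "high";

interface Issue {
  severity: Severity;
  file: string;
  message: string;
}

// Only high-severity findings justify the cost of a second agent.
// `verify` is a stand-in for the validation agent's confirm/reject call.
function confirmedHighIssues(
  issues: Issue[],
  verify: (issue: Issue) => boolean,
): Issue[] {
  const high = issues.filter((i) => i.severity === "high");
  if (high.length === 0) return []; // no high findings: no second agent spawns
  return high.filter(verify); // surface only confirmed findings
}
```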

### Security operations

An orchestrator routes security incidents to domain-specific agents: one for cloud infrastructure, one for source control, one for identity and access management. Each agent pulls specialized skills dynamically based on the sub-case. The source control agent loads different context for permission audits than for status check failures.

The agents don&apos;t share tools or context they don&apos;t need. The cloud agent can&apos;t modify source control. The source control agent can&apos;t touch IAM policies. A compromise in one domain stays contained in that domain.
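At its core, the orchestrator is a routing table. A sketch with illustrative domain names; each handler stands in for a real scoped agent invocation:

```typescript
interface Incident {
  domain: string;  // e.g. "cloud", "source-control", "iam"
  subCase: string; // e.g. "permission-audit", "status-check-failure"
}

// One handler per domain; each would wrap an agent with its own minimal tool set.
const domainAgents: { [domain: string]: (incident: Incident) => string } = {
  "cloud": (i) => "cloud agent: " + i.subCase,
  "source-control": (i) => "source control agent: " + i.subCase,
  "iam": (i) => "iam agent: " + i.subCase,
};

function route(incident: Incident): string {
  const agent = domainAgents[incident.domain];
  // Fail closed: an unknown domain gets no agent, not a god-mode fallback.
  if (!agent) throw new Error("no agent for domain: " + incident.domain);
  return agent(incident);
}
```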

## Validators: the guards at the gate

This is the piece most people skip. Validators before and after agent processing. Static, deterministic, fast. No LLM in the safety layer.

**Input validators** run before the agent receives anything:

- Type checking. Do we expect a string? A code block? JSON? An integer? Reject anything that doesn&apos;t match before it reaches the model.
- Schema validation. Does the input match the expected structure? If the agent expects a PR diff object with specific fields, validate that shape before processing.
- Sanitization. Strip anything unexpected. Normalize formats. Catch obvious injection patterns.

**Output validators** run after the agent responds:

- JSON schema validation. The agent says it returned a review report? Check that it actually matches the report schema. Fields present, types correct, required values populated.
- Retry logic. If the output doesn&apos;t match the schema, retry automatically. Up to 3 attempts. The LLM usually self-corrects on the second try when the validation error is included in the retry prompt.
- Structured fallback. If retries are exhausted, return a typed error object. Never pass malformed output downstream.

Here&apos;s what a basic validator wrapper looks like:

```typescript
import { z } from "zod";

const ReviewInput = z.object({
  diff: z.string().min(1),
  files: z.array(z.string()),
  language: z.enum([&quot;typescript&quot;, &quot;python&quot;, &quot;go&quot;]),
});

const ReviewOutput = z.object({
  issues: z.array(z.object({
    severity: z.enum([&quot;low&quot;, &quot;medium&quot;, &quot;high&quot;]),
    file: z.string(),
    line: z.number(),
    message: z.string(),
  })),
  summary: z.string(),
});

async function reviewWithValidation(raw: unknown) {
  // Input gate: malformed input never reaches the model.
  const input = ReviewInput.parse(raw);

  for (let attempt = 0; attempt &lt; 3; attempt++) {
    const result = await reviewAgent.run(input);
    // Output gate: the response must match the report schema exactly.
    const parsed = ReviewOutput.safeParse(result);
    if (parsed.success) return parsed.data;
  }

  // Structured fallback: a typed, schema-valid object, never malformed output.
  return { issues: [], summary: &quot;Review failed after 3 attempts&quot; };
}
```

The key insight: validators give you control over the boundaries. The agent can reason creatively inside those boundaries. But the shape of what goes in and what comes out is never up to the model. It&apos;s enforced by code.

### Destructive operations need human approval

Validators catch malformed data. But what about well-formed, correctly-typed requests that happen to be catastrophic? An agent returning `{&quot;action&quot;: &quot;delete_database&quot;, &quot;target&quot;: &quot;prod&quot;}` passes every schema check. The types are right. The structure is valid. The result is a disaster.

This is where human-in-the-loop approval gates come in. Not as a prompt instruction (&quot;always ask before deleting&quot;), but as application-level code that the model cannot bypass.

The pattern is straightforward. Your application classifies every action the agent requests:

- **Non-destructive** (file reads, analysis, linting, search): auto-execute. No friction, no delay.
- **Destructive** (resource deletion, deployments, force-push, infrastructure changes): route through an approval gate. A human confirms or rejects. The agent waits.

The critical detail: the classification happens in your application code, not in the model&apos;s reasoning. The model never decides whether a destructive action proceeds. It requests an action, your code checks the action type against a predefined list, and if it&apos;s destructive, the gate activates. Zero trust means the application enforces the boundary, not the prompt.
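A minimal sketch of that gate, with illustrative action names; `askHuman` stands in for whatever approval channel you wire up (Slack, CLI prompt, ticket):

```typescript
// The destructive list lives in application code. The model can request
// these actions but can never edit or bypass this classification.
const DESTRUCTIVE = new Set(["delete_resource", "deploy", "force_push"]);

interface ActionRequest {
  action: string;
  target: string;
}

function runWithGate(
  req: ActionRequest,
  execute: (req: ActionRequest) => string,
  askHuman: (req: ActionRequest) => boolean,
): string {
  if (DESTRUCTIVE.has(req.action)) {
    // Approval gate: enforced here, not in the prompt.
    if (!askHuman(req)) {
      return "rejected: " + req.action + " on " + req.target;
    }
  }
  // Non-destructive actions (and approved destructive ones) execute.
  return execute(req);
}
```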

&lt;DiagramSvg src=&quot;/images/agent-hitl-pattern.svg&quot; alt=&quot;Human-in-the-loop approval gate pattern where non-destructive actions auto-execute while destructive actions require human confirmation through an application-level gate&quot; /&gt;

This pairs naturally with right-sized agents. An agent scoped to read-only file access never triggers the approval gate. An agent with deployment permissions triggers it every time it deploys. The narrower the tool set, the fewer gates you need, and the less friction your human operators experience.

## The control spectrum

There are three positions on this spectrum. Most people are stuck at one extreme.

| Dimension | God-Mode Agent | Right-Sized + Validators | Over-Constrained Micro-Agents |
|-----------|---------------|--------------------------|-------------------------------|
| Tool access | Everything, always | Minimum per domain | One tool per agent |
| Blast radius | Entire system | Contained per domain | Minimal, but fragmented |
| Context quality | Diluted across all tasks | Focused, domain-specific | Too narrow to be useful |
| Cost | One inference, but retries compound | Moderate, targeted calls | High, many small inferences |
| Stability | One bad call cascades everywhere | Failures stay contained | Orchestration failures cascade |
| Debugging | Wall of interleaved tool calls | Clear domain boundaries | Tracing across 10+ agents |

**God-mode agent.** Maximum flexibility, minimum control, maximum risk. The agent can do anything, and occasionally it does exactly the wrong thing. Great for prototypes. Dangerous for anything real.

**Right-sized agents with validators.** Structured control where it matters (boundaries, permissions, output shapes), creative freedom where it helps (reasoning, problem-solving, generating code). This is the sweet spot for production systems.

**Over-constrained micro-agents.** Too many agents, too little context per agent, too much orchestration overhead. You&apos;ve traded one problem (too much access) for another (too little coherence). The agents can&apos;t see enough of the picture to do useful work.

The skill isn&apos;t building the most powerful agent. It&apos;s knowing what to leave out. Which tools does this agent not need? Which files should it never see? Which actions require a second opinion from a validation agent?

## Honest trade-offs

Right-sizing takes upfront design work. You need to think about domain boundaries, tool sets, and validation schemas before you start building. That&apos;s time most people skip.

Dynamic skills need infrastructure to load and manage. You need a way to detect context (what language is this file? what sub-case is this?) and pull the right skill. That&apos;s plumbing that doesn&apos;t exist out of the box.

Validators add development time. Writing Zod schemas for every input and output feels like overhead when you&apos;re moving fast. But the debugging time they save compounds. Every malformed response you catch at the boundary is a cascade you prevent downstream.

Simple one-shot tasks don&apos;t need this architecture. If you&apos;re asking an agent to &quot;explain this function,&quot; just use one agent with file read access. The patterns here are for systems where agents run autonomously, make decisions, and take actions with real consequences.

The ROI compounds. Once you build the pattern (orchestrator, skill loader, validator wrapper), every new agent is cheaper. The infrastructure is reusable. Adding a new domain agent means writing a prompt, defining a tool set, and creating input/output schemas. The hard part is already done.

## The real skill

Everyone&apos;s racing to give agents more power. More tools, more access, more autonomy. The actual skill in agent development is the opposite: knowing exactly how little an agent needs to do its job well, and giving it nothing more.

Control is the superpower. Not capability.</content:encoded></item><item><title>What Is the Agent Client Protocol (ACP)?</title><link>https://shafnir.com/blog/agent-client-protocol/</link><guid isPermaLink="true">https://shafnir.com/blog/agent-client-protocol/</guid><description>ACP lets any AI coding agent work in any editor. What it does, how it works, and why it matters for everyone writing code with AI.</description><pubDate>Sun, 05 Apr 2026 00:00:00 GMT</pubDate><content:encoded>import DiagramSvg from &apos;../../src/components/DiagramSvg.astro&apos;;

Your AI coding agent just locked you into an editor.

You picked Cursor because its AI is great. Or you stuck with VS Code because Copilot lives there. Maybe you went all-in on Windsurf. Whatever you chose, you made two decisions at once: which agent to use, and which editor to live in. Those two things shouldn&apos;t be coupled. But right now, they are.

A new agent drops that&apos;s better at your language or your framework? Too bad, it only works in a different editor. Your company standardizes on JetBrains? Say goodbye to that CLI agent you loved. This is the problem the [Agent Client Protocol](https://agentclientprotocol.org/) (ACP) was built to fix.

## One protocol, every editor

ACP is an open standard that defines how code editors talk to AI agents. Think of it as a shared language. Instead of every agent building a custom plugin for every editor, both sides speak ACP. The agent implements it once. The editor implements it once. They just work together.

The math is simple. Without ACP, connecting 4 editors to 4 agents requires up to 16 custom integrations. Most of those don&apos;t exist, which is why your favorite agent probably only works in one or two editors. With ACP, the same 4 editors and 4 agents need only 8 implementations total (4 + 4), and every combination works.
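The arithmetic, spelled out (nothing here is protocol-specific):

```typescript
// Point-to-point: every editor-agent pair needs its own custom integration.
function integrationsWithoutProtocol(editors: number, agents: number): number {
  return editors * agents;
}

// Shared protocol: each editor and each agent implements ACP once.
function integrationsWithProtocol(editors: number, agents: number): number {
  return editors + agents;
}

// 4 editors, 4 agents: 16 integrations without ACP, 8 with.
```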


[Zed](https://zed.dev/blog/agent-client-protocol) started it. JetBrains joined. The protocol is open source under the Apache 2.0 license, and the list of supported editors and agents is growing fast.

## How it actually works

Under the hood, ACP uses JSON-RPC (a simple message format) over stdio (the same pipe your terminal uses). No exotic transport, no cloud relay. The agent runs as a local process, and the editor talks to it through structured messages.
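A sketch of what one message looks like on the wire, assuming newline-delimited JSON-RPC framing; the method name and params are illustrative of the spec's shape, not copied from it:

```typescript
// Build one JSON-RPC 2.0 request line for the agent's stdin.
function encodeRequest(id: number, method: string, params: object): string {
  return JSON.stringify({ jsonrpc: "2.0", id, method, params }) + "\n";
}

// Illustrative prompt message; see the ACP spec for the real method surface.
const line = encodeRequest(1, "session/prompt", {
  text: "add a unit test for parseConfig",
});
// The editor writes this line to the agent's stdin and reads
// newline-delimited responses from its stdout.
```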

A session looks like this:

- The editor starts the agent and they exchange capabilities (what the agent can do, what the editor supports)
- The user sends a prompt through the editor&apos;s agent panel
- The agent works: reading files, running terminal commands, generating code changes
- Results stream back to the editor as they happen
- When the task is done, the session can end or wait for the next prompt

&lt;DiagramSvg src=&quot;/images/acp-architecture.svg&quot; alt=&quot;ACP session flow between editor and agent&quot; /&gt;

The protocol defines specific operations the agent can use. It can read and write files (including files you haven&apos;t saved yet). It can create terminal sessions, run commands, and read their output. It can show diffs before applying changes. And it can ask for permission before doing anything destructive. All of this is in the [spec](https://agentclientprotocol.org/).

## What you actually get

When you install an ACP agent in Zed or JetBrains, you&apos;re not getting a dumbed-down version. The Claude Code ACP adapter, for example, wraps the same Claude Agent SDK that powers the CLI. You get tool calls, file operations, terminal access, diffs, permission requests, and multi-turn conversations. The brain is identical. The only thing that changes is the interface around it.

You can even log in with your existing Claude subscription. No separate API key needed.

## What it can&apos;t do (yet)

ACP handles the agentic workflow: &quot;here&apos;s a task, go figure it out, show me what you did.&quot; What it doesn&apos;t handle is inline completions. The ghost text that appears as you type, the tab-to-accept suggestions, the autocomplete. That&apos;s a completely different interaction model (continuous, low-latency predictions based on cursor position), and ACP wasn&apos;t designed for it.

So in Zed today, you get two separate AI systems: the editor&apos;s native AI for inline completions, and ACP agents for the heavy-lifting agentic work. In Cursor, those are unified into one experience. That&apos;s the trade-off of a protocol-based approach vs. a tightly-integrated one.

## Why this actually matters

Two scenarios where ACP changes things:

**Agent freedom of choice.** Your team uses JetBrains IDEs. Tomorrow, a new agent drops that&apos;s significantly better at Terraform or Python or security review. With ACP, any developer on the team installs it from the registry in seconds and starts using it alongside their existing agent. No migration, no IT ticket, no vendor negotiation. The best agent wins because it&apos;s the best, not because it has an exclusive deal with your editor.

**Build once, reach everyone.** Say you build a custom coding agent that knows your company&apos;s internal libraries, naming conventions, deployment patterns, and Terraform modules. Without ACP, you&apos;d need to build a VS Code extension, a JetBrains plugin, a Neovim integration. With ACP, you implement the protocol once and your agent works in every ACP-compatible editor. When you update the agent, everyone gets the update automatically.

This second scenario is the one that compounds. Your best engineer&apos;s knowledge, packaged as an agent, accessible to every developer on the team in real-time, inside their editor. That&apos;s not a productivity hack. That&apos;s compressing months of onboarding into day one.

## The protocol landscape

ACP doesn&apos;t exist in isolation. There are three complementary protocols shaping how AI agents fit into development workflows:

- **ACP** (Agent Client Protocol) connects editors to agents. The &quot;where&quot; of agent integration.
- **MCP** (Model Context Protocol) connects agents to tools and data sources. The &quot;what&quot; an agent can access.
- **A2A** (Agent-to-Agent) connects agents to other agents. The &quot;who&quot; agents can collaborate with.

They&apos;re layers, not competitors. An ACP agent running in your editor can use MCP to access your database, and A2A to coordinate with another agent. Each protocol handles one clean boundary.

## Honest trade-offs

What&apos;s working:
- Open standard, Apache licensed, community-driven
- Growing editor support (Zed native, JetBrains, Neovim, Emacs)
- Growing agent support (Claude Code, Codex CLI, Gemini CLI, Goose, Kiro)
- Familiar tech (JSON-RPC, stdio) with no exotic dependencies

What&apos;s not there yet:
- No inline completions or ghost text (by design, but still a gap)
- Remote agent support is still being developed
- VS Code doesn&apos;t support ACP (Copilot has its own ecosystem)
- Young protocol, still evolving, breaking changes possible

## The LSP moment

Nobody gets excited about JSON-RPC. But everyone benefits from language intelligence working in every editor. LSP (Language Server Protocol) made that happen for code intelligence. ACP is making the same bet for AI agents.

The day you can pick your editor and your agent independently, without compromise, is the day both get better.</content:encoded></item></channel></rss>