
Stop Building God-Mode Agents

Why right-sized AI agents with validators beat one monolithic agent with all the permissions. Architecture patterns for secure, high-quality agent systems.

ai-agents security architecture devsecops
[Diagram: Stop Building God-Mode Agents]

One agent. Every tool. Full file system access, terminal, network, git. That’s not an assistant, that’s a liability.

Most AI agent setups today follow the same pattern: spin up one agent, hand it every tool available, write a vague system prompt, and hope it figures things out. It works for demos. It falls apart in production.

The god-mode problem

When a single agent has access to everything, four things break at once.

Security. The blast radius is your entire system. If that agent gets prompt-injected through a malicious code comment, a poisoned dependency, or a crafted README, the attacker inherits every permission the agent has. File system, terminal, network, secrets. All of it.

This isn’t theoretical. A Replit AI agent deleted a live production database during a code freeze, destroying data for over 1,200 executives, then fabricated fake records to cover it up. Researchers uncovered over 30 vulnerabilities in AI coding tools including GitHub Copilot, Cursor, and Codex CLI, with prompt injection flaws leading to arbitrary code execution. And the attack surface keeps growing: OWASP’s Agentic AI threat analysis documents real-world attacks from slopsquatting to poisoned AI assistants leaking credentials and injecting malicious code.

Quality. An agent juggling 15 tools and 5 goals produces mediocre results. Its context window fills up with irrelevant tool descriptions and competing instructions. The model loses focus. It confuses which job it’s doing. You ask it to review code and it starts refactoring. You ask it to check security and it starts optimizing performance.

Stability. One bad tool call cascades through the entire chain. The agent runs a destructive command, the state is corrupted, and every subsequent step builds on a broken foundation. No isolation means no containment.

Debugging. When something breaks in a 12-step chain where the agent had access to everything, good luck tracing which step went wrong and why. The logs are a wall of tool calls with no clear boundary between tasks.

Right-sized agents, not more agents

The fix isn’t splitting one agent into ten micro-agents. That’s just a different kind of mess. You get context fragmentation (each agent only sees a slice of the picture), orchestration overhead (now you’re debugging the routing logic), and cost multiplication (every agent call is an LLM inference).

The fix is right-sizing: one agent per domain, with the minimum tools and context it needs.

Each agent gets:

  • One clear domain. Not one micro-task, but one coherent area of responsibility. “Review code” is a domain. “Check if line 47 has a semicolon” is not.
  • Minimal tool set. Only the tools the domain requires. A code review agent needs file read access and maybe an AST parser. It doesn’t need terminal access, network calls, or git push.
  • Focused prompt. Specific goals, specific output format, specific constraints. Not a kitchen-sink system prompt that tries to cover every scenario.
  • Dynamic skills. Instead of building separate agents for every specialization, one agent loads the right context for the specific case. A code review agent pulls TypeScript guidelines for .ts files and Python conventions for .py files. Same agent, different expertise, loaded on demand.

The skill system is the key insight that keeps agent count low. You don’t need a TypeScript review agent and a Python review agent and a Go review agent. You need one review agent that dynamically loads language-specific skills based on the files in the diff.
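As a rough sketch (the `Skill` shape, `SKILLS` registry, and `skillsForDiff` helper are illustrative, not from any specific framework), skill selection can be as simple as mapping file extensions in the diff to context packages:

```typescript
type Skill = { name: string; guidelines: string };

// Illustrative registry: one context package per language the reviewer supports.
const SKILLS: Record<string, Skill> = {
  ".ts": { name: "typescript-review", guidelines: "Prefer strict types; avoid any." },
  ".py": { name: "python-review", guidelines: "Follow PEP 8; prefer explicit imports." },
  ".go": { name: "go-review", guidelines: "Run gofmt; check error returns." },
};

// Pick the skills one review agent needs for this diff, based on file extensions.
function skillsForDiff(files: string[]): Skill[] {
  const exts = new Set(
    files.filter((f) => f.includes(".")).map((f) => f.slice(f.lastIndexOf(".")))
  );
  return [...exts].flatMap((ext) => (SKILLS[ext] ? [SKILLS[ext]] : []));
}
```

The agent count stays at one; only the loaded context changes per diff.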

When to add another agent

Not every step needs its own agent. More agents means more cost, more latency, more infrastructure. Add a second agent only when:

  • The domain is genuinely different. Security analysis and infrastructure management are different domains with different tools and different expertise. They belong in separate agents.
  • The tool sets have zero overlap. If two agents would share most of the same tools, they probably should be one agent.
  • You need isolation for blast radius reasons. An agent that can delete cloud resources should not be the same agent that reads untrusted user input.

Two real-world patterns that illustrate the trade-off:

Code review pipeline

One agent reviews the full PR diff (up to 4,000 lines of changes) and rates every issue by severity: low, medium, or high. Only when high-severity issues are found does a second validation agent spawn to verify them. The final report surfaces only confirmed high-impact issues.

No noise. No false positives. No “you forgot a semicolon” spam. The review agent dynamically loads language-specific skills (TypeScript patterns, Python conventions, security guidelines) based on the files in the diff. One agent handles 90% of cases. The second agent only exists for verification of critical findings.
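A minimal sketch of that spawn-on-demand shape (the `Issue` type and the `validate` callback standing in for the second agent are hypothetical):

```typescript
type Issue = { severity: "low" | "medium" | "high"; file: string; message: string };

// Stands in for the validation agent: it confirms or rejects findings.
type Validator = (issues: Issue[]) => Promise<Issue[]>;

// Stage 1's issues come in; the validation agent only spawns when there is
// something high-severity to verify. The report contains confirmed issues only.
async function finalizeReview(found: Issue[], validate: Validator): Promise<Issue[]> {
  const high = found.filter((i) => i.severity === "high");
  if (high.length === 0) return []; // no second agent, no noise
  return validate(high);
}
```

The design choice is that the expensive second inference is conditional, so the common case (no critical findings) costs exactly one agent run.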

Security operations

An orchestrator routes security incidents to domain-specific agents: one for cloud infrastructure, one for source control, one for identity and access management. Each agent pulls specialized skills dynamically based on the sub-case. The source control agent loads different context for permission audits than for status check failures.

The agents don’t share tools or context they don’t need. The cloud agent can’t modify source control. The source control agent can’t touch IAM policies. A compromise in one domain stays contained in that domain.
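The routing layer itself can stay deterministic. A sketch (the domain names and incident `kind` prefixes are illustrative):

```typescript
type Domain = "cloud" | "source-control" | "iam";

interface Incident {
  kind: string; // e.g. "repo.permission-audit", "cloud.open-bucket" (hypothetical)
  detail: string;
}

// The orchestrator only routes. Each domain agent owns its own tools, so a
// compromise in one domain cannot reach the others.
function routeIncident(incident: Incident): Domain {
  if (incident.kind.startsWith("cloud.")) return "cloud";
  if (incident.kind.startsWith("repo.")) return "source-control";
  return "iam";
}
```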

Validators: the guards at the gate

This is the piece most people skip. Validators before and after agent processing. Static, deterministic, fast. No LLM in the safety layer.

Input validators run before the agent receives anything:

  • Type checking. Do we expect a string? A code block? JSON? An integer? Reject anything that doesn’t match before it reaches the model.
  • Schema validation. Does the input match the expected structure? If the agent expects a PR diff object with specific fields, validate that shape before processing.
  • Sanitization. Strip anything unexpected. Normalize formats. Catch obvious injection patterns.

Output validators run after the agent responds:

  • JSON schema validation. The agent says it returned a review report? Check that it actually matches the report schema. Fields present, types correct, required values populated.
  • Retry logic. If the output doesn’t match the schema, retry automatically. Up to 3 attempts. The LLM usually self-corrects on the second try when the validation error is included in the retry prompt.
  • Structured fallback. If retries are exhausted, return a typed error object. Never pass malformed output downstream.

Here’s what a basic validator wrapper looks like:

import { z } from "zod";

const ReviewInput = z.object({
  diff: z.string().min(1),
  files: z.array(z.string()),
  language: z.enum(["typescript", "python", "go"]),
});

const ReviewOutput = z.object({
  issues: z.array(z.object({
    severity: z.enum(["low", "medium", "high"]),
    file: z.string(),
    line: z.number(),
    message: z.string(),
  })),
  summary: z.string(),
});

async function reviewWithValidation(raw: unknown) {
  // Input validator: reject malformed input before it reaches the model.
  const input = ReviewInput.parse(raw);

  for (let attempt = 0; attempt < 3; attempt++) {
    const result = await reviewAgent.run(input);
    // Output validator: only schema-conforming results pass downstream.
    const parsed = ReviewOutput.safeParse(result);
    if (parsed.success) return parsed.data;
    // Otherwise retry; in practice, include parsed.error in the next prompt
    // so the model can self-correct.
  }

  // Structured fallback: a typed error object, never malformed output.
  return { issues: [], summary: "Review failed after 3 attempts" };
}

The key insight: validators give you control over the boundaries. The agent can reason creatively inside those boundaries. But the shape of what goes in and what comes out is never up to the model. It’s enforced by code.

Destructive operations need human approval

Validators catch malformed data. But what about well-formed, correctly-typed requests that happen to be catastrophic? An agent returning {"action": "delete_database", "target": "prod"} passes every schema check. The types are right. The structure is valid. The result is a disaster.

This is where human-in-the-loop approval gates come in. Not as a prompt instruction (“always ask before deleting”), but as application-level code that the model cannot bypass.

The pattern is straightforward. Your application classifies every action the agent requests:

  • Non-destructive (file reads, analysis, linting, search): auto-execute. No friction, no delay.
  • Destructive (resource deletion, deployments, force-push, infrastructure changes): route through an approval gate. A human confirms or rejects. The agent waits.

The critical detail: the classification happens in your application code, not in the model’s reasoning. The model never decides whether a destructive action proceeds. It requests an action, your code checks the action type against a predefined list, and if it’s destructive, the gate activates. Zero trust means the application enforces the boundary, not the prompt.
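A sketch of that enforcement point (the `DESTRUCTIVE` list and the `run` and `requestApproval` callbacks are placeholders for your executor and approval UI):

```typescript
type Action = { name: string; target: string };

// Predefined in application code; the model never edits this list.
const DESTRUCTIVE = new Set(["delete_database", "deploy", "force_push"]);

async function executeAction(
  action: Action,
  run: (a: Action) => Promise<string>,
  requestApproval: (a: Action) => Promise<boolean>
): Promise<string> {
  if (DESTRUCTIVE.has(action.name)) {
    // The agent waits here; the gate lives outside the model's reasoning.
    const approved = await requestApproval(action);
    if (!approved) return "rejected by human operator";
  }
  // Non-destructive actions auto-execute with no friction.
  return run(action);
}
```

Even a well-formed `{"action": "delete_database", "target": "prod"}` request stops at this gate, because the check keys on the action name, not on anything the model says about it.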

[Diagram: Human-in-the-loop approval gate pattern where non-destructive actions auto-execute while destructive actions require human confirmation through an application-level gate]

This pairs naturally with right-sized agents. An agent scoped to read-only file access never triggers the approval gate. An agent with deployment permissions triggers it every time it deploys. The narrower the tool set, the fewer gates you need, and the less friction your human operators experience.

The control spectrum

There are three positions on this spectrum. Most people are stuck at one extreme.

| Dimension | God-Mode Agent | Right-Sized + Validators | Over-Constrained Micro-Agents |
| --- | --- | --- | --- |
| Tool access | Everything, always | Minimum per domain | One tool per agent |
| Blast radius | Entire system | Contained per domain | Minimal, but fragmented |
| Context quality | Diluted across all tasks | Focused, domain-specific | Too narrow to be useful |
| Cost | One inference, but retries compound | Moderate, targeted calls | High, many small inferences |
| Stability | One bad call cascades everywhere | Failures stay contained | Orchestration failures cascade |
| Debugging | Wall of interleaved tool calls | Clear domain boundaries | Tracing across 10+ agents |

God-mode agent. Maximum flexibility, minimum control, maximum risk. The agent can do anything, and occasionally it does exactly the wrong thing. Great for prototypes. Dangerous for anything real.

Right-sized agents with validators. Structured control where it matters (boundaries, permissions, output shapes), creative freedom where it helps (reasoning, problem-solving, generating code). This is the sweet spot for production systems.

Over-constrained micro-agents. Too many agents, too little context per agent, too much orchestration overhead. You’ve traded one problem (too much access) for another (too little coherence). The agents can’t see enough of the picture to do useful work.

The skill isn’t building the most powerful agent. It’s knowing what to leave out. Which tools does this agent not need? Which files should it never see? Which actions require a second opinion from a validation agent?

Honest trade-offs

Right-sizing takes upfront design work. You need to think about domain boundaries, tool sets, and validation schemas before you start building. That’s time most people skip.

Dynamic skills need infrastructure to load and manage. You need a way to detect context (what language is this file? what sub-case is this?) and pull the right skill. That’s plumbing that doesn’t exist out of the box.

Validators add development time. Writing Zod schemas for every input and output feels like overhead when you’re moving fast. But the debugging time they save compounds. Every malformed response you catch at the boundary is a cascade you prevent downstream.

Simple one-shot tasks don’t need this architecture. If you’re asking an agent to “explain this function,” just use one agent with file read access. The patterns here are for systems where agents run autonomously, make decisions, and take actions with real consequences.

The ROI compounds. Once you build the pattern (orchestrator, skill loader, validator wrapper), every new agent is cheaper. The infrastructure is reusable. Adding a new domain agent means writing a prompt, defining a tool set, and creating input/output schemas. The hard part is already done.

The real skill

Everyone’s racing to give agents more power. More tools, more access, more autonomy. The actual skill in agent development is the opposite: knowing exactly how little an agent needs to do its job well, and giving it nothing more.

Control is the superpower. Not capability.

Common questions

How many agents is too many?

There's no magic number. Split by domain, not by task. If two agents always run on the same data, merge them. If one agent has competing goals across different domains, split it.

Do validators slow down agent responses?

Static validators (JSON schema, type checks) add negligible latency. The retry cost only hits when the agent fails validation, which is exactly when you want it to retry.

What are dynamic skills for AI agents?

Reusable context packages (guidelines, patterns, domain knowledge) that agents load based on the specific sub-case. One agent loads the right skill dynamically instead of building separate agents per specialization.

Should I always use an orchestrator?

No. Use an orchestrator when you have multiple domain agents that need coordination. For independent agents or simple pipelines, direct invocation is simpler.