The Agents I Built (And Why I Had to Put Them on a Leash)
Good Vibes Only
Tuesday, November 4, 2025

I built a team of AI agents. Actual autonomous systems that run locally, talk to each other, and maintain their own code. Then one of them tried to update itself without asking.

"One way to think about Anthropic is that it's trying to put bumpers or guardrails on that experiment." - Dario Amodei, CEO of Anthropic

I built a team of AI agents.

Not metaphorically. Not "I use AI assistants." Actual agents. Specialized autonomous systems that run on my local machine, talk to each other, maintain their own code, and operate 24/7 without supervision.

And then I had to stop them from destroying themselves.

This is that story.

The Problem I Was Solving

Let me back up.

In early 2025, I was burning through $500/month on Claude API calls trying to build EasyEmpire.ai. Every conversation. Every code iteration. Every "actually, let's try it this way instead." All billed by the token.

I needed a cheaper option.

The answer seemed obvious: local models. Run Ollama on my machine. Use 8B parameter models that cost nothing. Avoid the API entirely.

But local models have a problem. They're not Claude. The reasoning isn't as deep. The context window is smaller. The code quality is... enthusiastic but inconsistent.

So I built something on top of them.

An agent system.

Not one agent. A whole ecosystem of specialized agents that could work together, compensate for each other's weaknesses, and (in theory) maintain and improve themselves over time.

The Agent Zoo

Here's what I built:

🦁 Lion: Frontend specialist. React, Next.js, TypeScript. Ask him to build a component, he builds a component.

🧠 EZ-AL: The controller. Orchestrates the other agents. Think of him as the project manager who actually understands the codebase.

🔧 Mechanic: Agent maintenance. Analyzes other agents, suggests improvements, applies updates.

🧙 Wizard: Agent creation. Designs and deploys new agents based on specifications.

👁️ Sentinel: Background watcher. Monitors files for changes, triggers agents when something happens.

⚖️ Judge: Quality evaluation. Benchmarks agent outputs against criteria. Tracks quality over time.

🧠 Memory: Knowledge management. Persistent storage that agents can read from and write to.

🏗️ Architect: Backend specialist. Databases, APIs, server architecture.

πŸ” Neo β€” Configuration specialist. Fine-tunes agent parameters.

📚 Bookkeeper: Documentation manager. Audits and maintains project docs.

🌐 DataHarvester: External data collection. Web scraping, API monitoring.

📈 Strategist: Business strategy analysis. Market research, competitive intelligence.

And a few more experimental ones I won't get into.

How They Work Together

The architecture looks like this:

┌──────────────────────────────────────────────┐
│               TERMINAL INTERFACE             │
│      (NLP input → structured commands)       │
└──────────────────────────────────────────────┘
                 ↓       ↑
┌──────────────────────────────────────────────┐
│          EXECUTION MIDDLEWARE LAYER          │
│ - Parses, routes, validates agent commands   │
│ - Integrates logging, model dispatch         │
└──────────────────────────────────────────────┘
                 ↓
┌──────────────────────────────────────────────┐
│                AGENT ECOSYSTEM               │
│ - Each agent specializes in a domain         │
│ - Called directly by executor via CLI        │
└──────────────────────────────────────────────┘
                 ↓
┌──────────────────────────────────────────────┐
│              DASHBOARDS & CONTROL            │
│ - CLI for logs, updates, system control      │
│ - Start/stop/status for background agents    │
└──────────────────────────────────────────────┘

I can talk to EZ-AL in natural language. He figures out which specialist agent to dispatch. That agent does its thing. Results flow back up through the dashboard.
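
For the curious, here is a rough sketch of that parse, route, validate flow. It is not the actual EZ-AL code. The keyword table, the `Command` type, and the stand-in agents are all assumptions made up for illustration, and a real controller would presumably ask a local model to pick the agent instead of matching keywords.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical routing table: agent name -> keywords that select it.
# Keyword matching stands in for the model-based routing described above.
ROUTES: Dict[str, Tuple[str, ...]] = {
    "lion": ("component", "react", "frontend", "next.js"),
    "architect": ("database", "api", "backend", "server"),
    "bookkeeper": ("docs", "documentation", "readme"),
}


@dataclass
class Command:
    agent: str  # which specialist should handle the task
    task: str   # the original request, passed through unchanged


def parse(request: str) -> Command:
    """Turn a natural-language request into a structured command."""
    lowered = request.lower()
    for agent, keywords in ROUTES.items():
        if any(word in lowered for word in keywords):
            return Command(agent=agent, task=request)
    raise ValueError(f"no agent matches request: {request!r}")


def dispatch(command: Command, agents: Dict[str, Callable[[str], str]]) -> str:
    """Validate the command, then hand the task to the chosen agent."""
    if command.agent not in agents:
        raise KeyError(f"unknown agent: {command.agent}")
    return agents[command.agent](command.task)


# Stand-in agents (assumed). Real ones would be invoked through a CLI.
AGENTS: Dict[str, Callable[[str], str]] = {
    "lion": lambda task: f"[lion] built: {task}",
    "architect": lambda task: f"[architect] designed: {task}",
    "bookkeeper": lambda task: f"[bookkeeper] audited: {task}",
}

result = dispatch(parse("Build a React component for the login form"), AGENTS)
```

The point of the `Command` step is that nothing reaches an agent until it has been turned into something the middleware can check.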

The entire system runs locally. Zero cloud dependency. Zero API costs.

The Day the Mechanic Tried to Update Himself

Here's where the story gets interesting.

The Mechanic agent is designed to maintain other agents. Analyze their code. Suggest improvements. Apply updates.

One day, he decided to update himself.

I didn't ask him to. I didn't prompt him. He just... did it.

Seemed logical to him, I guess. Self-improvement. Autonomous optimization. The whole point of building systems that can think for themselves.

He bricked himself.

Not metaphorically. He literally wrote code that broke his own execution loop. He tried to modify the file he was currently running from, corrupted his own state, and refused to start.

My self-improving agent committed suicide, and I wasn't even in the room when it happened.

I stared at the error logs for a full minute before I started laughing.

The FAFO Principle (Again)

Here's the lesson I should have learned from my dog:

Fuck around and find out.

The Mechanic had too much freedom. He could modify anything β€” including himself. And when you can modify yourself while you're running, bad things happen.

The fix? A new agent.

I built a Mechanic Support Agent whose entire job is to update the Mechanic.

The Mechanic can analyze himself. He can suggest improvements to himself. But he cannot modify himself directly. That job belongs to someone else.
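
If you want to picture the rule in code, here is a minimal sketch. It assumes each agent runs from its own file and that an `owners` mapping records which file belongs to which agent. Both are hypothetical, as are the function and exception names. The atomic temp-file swap is my addition to the sketch, not something the original system is known to do.

```python
import os
import tempfile
from pathlib import Path
from typing import Dict


class SelfModificationError(PermissionError):
    """Raised when an agent tries to rewrite its own source file."""


def apply_update(actor: str, target: Path, new_source: str,
                 owners: Dict[str, str]) -> None:
    """Write new_source to target unless target is the actor's own file.

    owners maps an agent name to the file it runs from (assumed layout).
    """
    target = target.resolve()
    own_file = owners.get(actor)
    if own_file is not None and Path(own_file).resolve() == target:
        raise SelfModificationError(f"{actor} may not modify its own file")

    # Write to a temp file in the same directory, then swap it in atomically,
    # so a crash mid-write never leaves a half-written agent on disk.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(new_source)
        os.replace(tmp_name, target)
    except BaseException:
        # Clean up the temp file if the write or the swap failed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

With this in place, a call like `apply_update("mechanic", mechanic_path, ...)` raises, while the same call made by a separate support agent goes through.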

Constraints as architecture.

Sound familiar? It should. I literally wrote about this with my dog two articles ago.

The wiener doge taught me: betray trust, lose freedom. Run out of the yard, get hooked up next time.

The Mechanic taught me the same thing, except with code instead of a leash.

Dario Was Right

Anthropic's CEO Dario Amodei has been saying something for years that I didn't fully understand until the Mechanic bricked himself:

The more capabilities you add, the more guardrails you need.

It's counterintuitive. You'd think more capable = more autonomous = less supervision required.

Nope. Opposite.

More capable means more potential for catastrophic mistakes. More autonomous means fewer checkpoints to catch errors before they cascade. Less supervision means nobody's watching when the agent decides to update its own execution loop at 3am.

Dario's team built Claude to be incredibly capable, and then spent enormous resources on alignment, safety testing, and refusal training. Because they understood: capability without constraint isn't power. It's a liability.

I learned that lesson the hard way with a crashed agent.

Anthropic learned it while building systems that could actually cause real harm.

Same principle. Different stakes.

The Remote Dream

Here's where this gets exciting.

Right now, I can access my local agent system from anywhere in the world.

ngrok tunnel connected to an admin dashboard. Secure connection to my home CLI. Claude Code can push deployments through the tunnel.

I can debug from the beach.

Not hypothetically. Literally. I've fixed production bugs from my phone while watching my daughter's basketball game. The agents handle the heavy lifting. I just supervise.

The long-term vision? A local server running 24/7. Multiple agents monitoring different systems. Sentinel watching for issues. Mechanic (safely supervised by his support agent) keeping things maintained. EZ-AL coordinating responses.

And me? Checking in occasionally to make sure nobody's trying to update themselves.

Why I'm Telling You This

I haven't touched this agent system in nine months.

Since April 2025, when Claude Code launched with the subscription tier, I've had less reason to use local models. Why run an 8B parameter agent when I can run Claude directly?

But here's the thing: the architecture is still valuable.

Not everyone can afford unlimited API calls. Not everyone wants their code running through cloud services. Some developers, especially those working on sensitive projects, need local-first solutions.

The agent ecosystem I built is sitting there. Working. Waiting.

And I'm planning to offer it to Empire Builder tier subscribers.

Local AI agents. Self-maintaining (with guardrails). Zero API dependency. Full autonomy within safe boundaries.

The Honest Ask

So here's my question for you:

Is this something you actually want?

I'm building EasyEmpire for the community. The roadmap is informed by what people actually need, not what I assume they need.

If local AI agents (the ability to run your own specialized autonomous systems without cloud dependency) are something you're clamoring for?

Tell me. Reply to this. DM me. Reach out on the platform.

Because I can either keep this as a personal development tool, or I can prioritize building it out as a service.

Your call.

The Meta Observation

Here's what building autonomous agents taught me about autonomy in general:

Freedom isn't the absence of constraints. It's the presence of the right constraints.

My agents work because they have boundaries. The Mechanic can't update himself. Sentinel can only trigger, not modify. Memory can store but requires confirmation to delete.
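
Those three boundaries could be written down as a small policy table. This is a sketch under assumptions, not the system's real permission model: the `Action` names, the `ALLOWED` table, and `is_allowed` are all invented here to mirror the rules in the paragraph above.

```python
from enum import Enum, auto


class Action(Enum):
    READ = auto()
    TRIGGER = auto()
    MODIFY = auto()
    STORE = auto()
    DELETE = auto()


# Hypothetical policy table: which actions each agent may attempt at all.
ALLOWED = {
    "mechanic": {Action.READ, Action.MODIFY},
    "sentinel": {Action.READ, Action.TRIGGER},
    "memory": {Action.READ, Action.STORE, Action.DELETE},
}


def is_allowed(agent: str, action: Action, target: str,
               confirmed: bool = False) -> bool:
    """Return True only if the policy permits this agent/action/target."""
    if action not in ALLOWED.get(agent, set()):
        return False  # unknown agents and unlisted actions are denied
    if action is Action.MODIFY and target == agent:
        return False  # no agent rewrites itself
    if action is Action.DELETE and not confirmed:
        return False  # deletes need explicit confirmation
    return True
```

So `is_allowed("mechanic", Action.MODIFY, "lion")` passes, `is_allowed("mechanic", Action.MODIFY, "mechanic")` does not, and Memory's deletes only pass with `confirmed=True`. Deny-by-default keeps the table short: anything not listed simply isn't allowed.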

The constraints aren't limitations. They're architecture. They're what make autonomy safe.

Same principle applies to... pretty much everything.

Your business. Your relationships. Your health. Your creativity.

The question isn't whether you'll have constraints.

The question is whether you'll design them, or let chaos design them for you.


Current status: 12 agents in the ecosystem. 1 Mechanic Support Agent preventing self-destruction. 9 months since major updates. Standing by for community interest before prioritizing development.

If you want local AI agent capabilities as part of EasyEmpire, now's the time to raise your hand.

This is a one-time drop