Build the Tools

Most AI-assisted operations today work like this: an agent with access to a terminal runs commands, reads output, reasons about what it sees, and takes action. It queries databases directly. It reads raw logs. It correlates signals by pulling data from multiple sources and interpreting it in context. When it works, it looks impressive.

We wouldn't let a new hire do this. We'd give them purpose-built tools with guardrails, scope their access, and build confirmation steps around the things that can cause irreversible damage. We do this because operating directly on production systems with raw commands is dangerous, and the danger increases when the operator lacks context about the system they're touching.

An AI agent has less context than a new hire. It has whatever fits in its window. It doesn't carry institutional memory. It doesn't remember the last time this metric behaved this way, or what actually caused it, or why this particular service behaves differently on Tuesdays. Yet we give it broader access than we'd give a person, because it's fast and because it doesn't complain.

What the AI Can't See

The knowledge that makes experienced operators valuable is largely tacit -- personal, context-specific, and difficult to formalize (Nonaka & Takeuchi, 1995). It transfers between people through shared experience: a junior operator shadows a senior one during an incident and absorbs not just the commands but the pacing, the priorities, the instinct for what to check first. Hoffman (2008) found that this knowledge can be transferred: it is "inert," accessible only in specific contexts but articulable with the right scaffolding. Even so, the transfer requires presence and participation.

AI skips this entirely. It has access only to what was written down -- the documentation, the runbooks, the code. Everything that remained tacit because the organization never invested in surfacing it is invisible to the agent. The AI operates on the explicit layer of a system whose reliability depends on the tacit layer.

This is why AI-as-operator is weaker than it appears. It can process everything that was externalized. The things that weren't externalized -- the awareness that something looks wrong before you can articulate why, the memory of what was tried before and failed, the understanding of which correlations are meaningful and which are coincidental -- those are exactly what make experienced operators load-bearing. Giving AI broad terminal access and asking it to investigate production issues puts it in the wrong role. Running commands against production without guardrails is dangerous regardless of who or what is doing it. We solved this for humans by building tools. The answer for AI is the same.

Working Through AI

The right role for AI in operations isn't to be a faster operator. It's to be the layer that captures knowledge, enforces guardrails, and builds the interfaces that make dangerous operations safe.

When a human investigates with raw commands, the knowledge of what they checked and why lives in their head. When they leave the team, it leaves with them. When an AI investigates in its default mode -- session starts, commands run, answer produced, session ends -- the knowledge disappears with the context window. Neither produces anything durable by default.

But AI doesn't have to work this way. When you route all operations through AI rather than running commands yourself, the AI becomes the note-taker for everything that happens. It captures what was checked, in what order, what the output meant, and why a particular action was taken. From there it can store that context for the next session, build a tool that encodes the process, produce documentation, prepare a handoff -- whatever makes the knowledge persist. This is the real shift: not "let AI do the work" but "never let work happen without AI capturing it." Every operation becomes an opportunity to externalize knowledge that would otherwise stay ephemeral.
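To make "capturing the work" concrete, here is a minimal sketch of a journal that records each step of an investigation along with the reason for it. Everything in it is hypothetical: the Step and Journal names, the field names, and the choice of JSON Lines as the storage format are assumptions made for illustration, not a description of any particular tool.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone


@dataclass
class Step:
    """One captured step of an investigation (field names are illustrative)."""
    action: str          # what was checked or run
    reason: str          # why the operator chose to check it
    output: str          # what came back
    interpretation: str  # what the output meant in context
    at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())


class Journal:
    """Ordered log of steps that can outlive a single session."""

    def __init__(self, steps=None):
        self.steps = list(steps or [])

    def record(self, action, reason, output, interpretation):
        step = Step(action, reason, output, interpretation)
        self.steps.append(step)
        return step

    def to_jsonl(self):
        # One JSON object per line, so the log is easy to append to and search.
        return "\n".join(json.dumps(asdict(s)) for s in self.steps)

    @classmethod
    def from_jsonl(cls, text):
        # Rebuild the journal at the start of the next session.
        return cls(Step(**json.loads(line)) for line in text.splitlines() if line.strip())
```

The order of the steps and the reason attached to each one are the parts a raw shell history loses. Restoring the journal with from_jsonl is one way the next session, human or AI, could start from what the last one learned.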

The experienced operator provides the knowledge of what matters and why. AI provides the implementation and the memory. In Nonaka and Takeuchi's terms, this is externalization -- converting tacit operational knowledge into an explicit, reusable form. The operator who knows which fifteen sources matter and why certain correlations are meaningful can't easily write that down in a runbook. But working through AI, that knowledge gets captured as it's applied -- in tools, in stored context, in documentation that writes itself as a byproduct of the work.

The Discoverability Problem

There's a failure mode here worth naming: you end up with so many tools that nobody knows what's available. We had a tool to check the metering status of a resource. Nobody knew it existed. When the question came up, the assumption was that we simply couldn't do that -- the capability was functionally absent because the knowledge of its existence didn't survive.

Not every operation needs a tool, either. Some problems are genuinely one-off. Building a tool for every edge case creates noise that makes the important tools harder to find. The solution isn't accumulation -- it's curation. Each tool needs to document why it exists, what it does, and when to use it. The toolset itself needs to be navigable. AI can help with discovery -- searching for relevant tools, suggesting them in context -- but the problem is real and not fully solved. A tool nobody can find is the same as no tool at all.
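One way to approach curation is a catalog that refuses to register a tool unless it says what it does, why it exists, and when to use it. The sketch below assumes a hypothetical ToolEntry and ToolCatalog and uses plain substring matching for search; a real catalog would likely want something richer, and none of these names come from an existing system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolEntry:
    """Catalog entry; every field is required to be non-empty on registration."""
    name: str
    what: str  # what the tool does
    why: str   # why it exists
    when: str  # when to reach for it


class ToolCatalog:
    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Refuse undocumented tools: an entry nobody can interpret is noise.
        missing = [f for f in ("what", "why", "when") if not getattr(entry, f).strip()]
        if missing:
            raise ValueError(f"tool {entry.name!r} is missing: {', '.join(missing)}")
        if entry.name in self._entries:
            raise ValueError(f"tool {entry.name!r} is already registered")
        self._entries[entry.name] = entry

    def search(self, query):
        # Naive search: every query term must appear somewhere in the entry.
        terms = query.lower().split()
        hits = []
        for entry in self._entries.values():
            haystack = " ".join((entry.name, entry.what, entry.why, entry.when)).lower()
            if all(term in haystack for term in terms):
                hits.append(entry)
        return sorted(hits, key=lambda e: e.name)
```

With a catalog like this, a question such as "can we check metering status?" becomes a search rather than a guess. It does not solve discoverability, but it makes the documentation a condition of the tool existing at all.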

Compartmentalization, Not Omniscience

None of this requires giving AI unrestricted access to everything. In fact, it requires the opposite.

You wouldn't give the person who refills the coffee machine full access to your corporate email, production servers, and building security systems. That's not because they're untrustworthy -- it's because the access isn't relevant to their role, and unnecessary access creates unnecessary risk. This is basic compartmentalization. Every organization understands it for humans.

Yet with AI, people routinely hand over unrestricted shell access and say "go do your thing." Then they write think pieces about doomsday scenarios. The safety problem isn't AI. It's the access model.

The same compartmentalization principles apply. An AI agent investigating a database issue needs read access to that database's metrics and logs. It doesn't need write access to production. It doesn't need access to unrelated services. It doesn't need root. Scope the access to the task. Build confirmation steps around irreversible operations. Restrict dangerous commands behind purpose-built interfaces that validate inputs before executing.
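As a rough illustration of scoping access to the task, the sketch below models a scope as an explicit allow-list of (resource, action) pairs, with everything else denied by default. The Scope class, the require helper, and the resource names such as "orders-db/metrics" are all hypothetical; in practice the enforcement would live in the access layer of whatever platform you run, not in the agent's own code.

```python
from dataclasses import dataclass


class ScopeError(PermissionError):
    """Raised when a request falls outside the scope granted for a task."""


@dataclass(frozen=True)
class Scope:
    task: str
    allowed: frozenset  # set of (resource, action) pairs; anything else is denied

    def permits(self, resource, action):
        return (resource, action) in self.allowed


def require(scope, resource, action):
    # Deny by default: only pairs listed in the scope get through.
    if not scope.permits(resource, action):
        raise ScopeError(f"{action} on {resource} is outside the scope of {scope.task!r}")


# Hypothetical scope for investigating one database issue: read-only,
# limited to that database's metrics and logs.
db_investigation = Scope(
    task="investigate orders-db latency",
    allowed=frozenset({
        ("orders-db/metrics", "read"),
        ("orders-db/logs", "read"),
    }),
)
```

Calling require(db_investigation, "orders-db/metrics", "read") passes silently, while a write to the same database or a read from an unrelated service raises ScopeError. The scope is attached to the task rather than to the actor, so the same check applies whether a person or an agent is making the request.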

This is what "build the tools" actually means. It doesn't mean that every operation needs a bespoke CLI tool. It means that dangerous operations should be wrapped in interfaces that enforce guardrails -- and that those interfaces should be what AI calls, not raw commands. The same way you wouldn't let a human rm -rf a database directory without a confirmation step, you don't let an AI do it either. You build the interface once, and everything -- humans, automation, AI -- goes through it.
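A guarded interface for a destructive operation might look something like the following sketch. It wraps directory deletion in input validation and a confirmation step. The guarded_delete function, the Refused exception, and the convention of retyping the directory name are assumptions chosen for illustration; the only real library calls are pathlib's Path.resolve and shutil.rmtree from the Python standard library.

```python
import shutil
import tempfile
from pathlib import Path


class Refused(Exception):
    """Raised when the guardrails decline to run the operation."""


def guarded_delete(target, allowed_root, confirm):
    """Delete a directory only if it passes validation and is confirmed.

    `confirm` is a callable that receives a prompt and returns the caller's
    answer. A human-facing CLI could pass `input`; an AI tool call could pass
    a function that asks the supervising operator.
    """
    root = Path(allowed_root).resolve()
    path = Path(target).resolve()

    # Validate inputs before doing anything: the target must sit strictly
    # inside the allowed root and must be an existing directory.
    if root not in path.parents:
        raise Refused(f"{path} is outside {root}")
    if not path.is_dir():
        raise Refused(f"{path} is not a directory")

    # Confirmation step: the caller must repeat the directory name exactly.
    answer = confirm(f"Type '{path.name}' to permanently delete {path}: ")
    if answer != path.name:
        raise Refused("confirmation did not match")

    shutil.rmtree(path)
    return path


if __name__ == "__main__":
    # Small demonstration against a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        victim = Path(tmp) / "scratch" / "old-snapshots"
        victim.mkdir(parents=True)
        guarded_delete(victim, Path(tmp) / "scratch", lambda prompt: "old-snapshots")
        print("deleted:", not victim.exists())
```

The caller never touches rm -rf or rmtree directly. A human, a cron job, and an AI agent all reach the deletion through the same function, so the validation and the confirmation cannot be skipped by changing who is asking.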

Scope the access. Capture the work. Build the interfaces. The actor changes. The principle doesn't.

References

Hoffman, R. R. (2008). Human factors contributions to knowledge elicitation. Human Factors, 50(3), 481--488. https://doi.org/10.1518/001872008X312152

Nonaka, I., & Takeuchi, H. (1995). The knowledge-creating company: How Japanese companies create the dynamics of innovation. Oxford University Press.