---
title: "Agent Experience: Better Tools Beat More Connections"
description: "Connecting every system to an agent is not the same as making the agent effective. The next leverage point is agent experience: tools designed around intent, context, recovery, and trust."
pubDate: "2026-06-02"
heroImage: "/blog/agent-experience-hero.jpg"
tags: ["AI Agents", "Agent Experience", "Tooling"]
---

The easiest way to make an agent look powerful is to connect it to everything.

Give it a protocol. Give it a registry. Give it access to docs, tickets, calendars, databases, repositories, dashboards, and workflow systems. Let it discover tools at runtime. Let it defer tool loading until the task seems relevant.

That can be useful plumbing.

It is not agent experience.

Agent experience is the quality of the working environment we create for an AI agent. It is the shape of the tools, context, feedback, constraints, and recovery paths that determine whether the agent can actually do useful work.

Most teams are still treating agents like users with infinite patience and perfect judgment. They are not. Agents need good product design too.

## Connection Is Not Capability

MCP (the Model Context Protocol) and similar integration layers solve a real problem: they make tools easier to expose, discover, and compose.

But access is not the same as affordance.

When we connect an agent to a broad API, we have not automatically given it a good tool. We may have given it a maze. The agent now has to infer which operation matters, what the business meaning is, which parameters are safe, what failure means, and when to stop.

Forcing the agent to infer all of that is not a test of intelligence. It is poor interface design.

Humans also struggle with tools that expose implementation detail instead of intent. We do not hand a support operator raw database tables and call it a workflow. We build screens, actions, guardrails, previews, confirmations, defaults, and undo paths.

Agents deserve the same care.

## Better Tools Are Narrower Than APIs

An agent-facing tool should usually be smaller than the system it wraps.

Instead of exposing "Jira," expose "find open incidents for this customer." Instead of exposing "database query," expose "check whether this order is eligible for refund." Instead of exposing "send email," expose "draft customer update for review."

The best tools carry product intent in their contract.

A good agent tool says:

- What job it performs
- What inputs it accepts
- What assumptions it makes
- What permissions it enforces
- What output shape it guarantees
- What errors mean
- What the agent should do next
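
One way to make that list concrete is to write the contract down as data before writing the tool. The sketch below is only an illustration: `ToolContract`, its field names, and the example values are hypothetical, not part of MCP or any other protocol.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToolContract:
    """Declarative description of one agent-facing tool (hypothetical shape)."""
    name: str
    job: str                      # what job it performs
    inputs: dict[str, str]        # input name -> plain-language meaning
    assumptions: list[str]        # what the tool takes for granted
    permissions: list[str]        # what it enforces before running
    output_shape: dict[str, str]  # field name -> meaning, guaranteed on success
    errors: dict[str, str]        # error code -> what the error means
    next_steps: dict[str, str]    # outcome -> what the agent should do next

    def missing_parts(self) -> list[str]:
        """Return the names of contract sections that were left empty."""
        return [name for name, value in vars(self).items() if not value]


# Example values are assumptions made up for this sketch.
check_refund_eligibility = ToolContract(
    name="check_refund_eligibility",
    job="Decide whether one order is eligible for a refund. Read-only.",
    inputs={"order_id": "The order to check", "customer_id": "The order's owner"},
    assumptions=["Policy data has been refreshed recently."],
    permissions=["Runs with the requesting employee's access."],
    output_shape={"eligible": "true or false", "reason": "Policy rule applied"},
    errors={"ORDER_NOT_FOUND": "No order matches the given ID."},
    next_steps={
        "eligible": "Draft a customer update for review.",
        "ORDER_NOT_FOUND": "Ask for the order ID. Do not guess.",
    },
)

# A draft with a blank section is easy to catch in review or CI.
draft = replace(check_refund_eligibility, errors={})
print(draft.missing_parts())  # ['errors']
```

The point of `missing_parts` is not validation rigor. It is that an empty section becomes visible instead of being left for the model to guess.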

<section class="tool-blueprint tool-blueprint--bad" aria-labelledby="raw-refund-title">
	<div class="tool-blueprint__intro">
		<p>Bad Surface</p>
		<h3 id="raw-refund-title">Raw Refund API Exposed to the Agent</h3>
		<span>This looks powerful because the agent can access everything. In practice, it forces the agent to reason over implementation details.</span>
	</div>
	<div class="tool-blueprint__flow" aria-label="Raw refund API tool surface">
		<div class="tool-blueprint__step">
			<b>Endpoint</b>
			<span>/orders, /payments, /refunds, /ledger</span>
		</div>
		<div class="tool-blueprint__arrow" aria-hidden="true">→</div>
		<div class="tool-blueprint__step">
			<b>Inputs</b>
			<span>Any ID, flags, metadata, optional overrides.</span>
		</div>
		<div class="tool-blueprint__arrow" aria-hidden="true">→</div>
		<div class="tool-blueprint__step">
			<b>Permissions</b>
			<span>Buried in backend behavior.</span>
		</div>
		<div class="tool-blueprint__arrow" aria-hidden="true">→</div>
		<div class="tool-blueprint__step tool-blueprint__step--danger">
			<b>Side Effects</b>
			<span>Read, write, reverse payment, notify customer.</span>
		</div>
		<div class="tool-blueprint__arrow" aria-hidden="true">→</div>
		<div class="tool-blueprint__step">
			<b>Errors</b>
			<span>400, 403, 409, timeout.</span>
		</div>
	</div>
	<div class="tool-blueprint__contract">
		<div>
			<b>Agent Confusion</b>
			<span>Which endpoint should it call first? What does each flag mean?</span>
		</div>
		<div>
			<b>Product Risk</b>
			<span>The agent can accidentally move money or contact a customer.</span>
		</div>
		<div>
			<b>No Recovery</b>
			<span>A generic error tells the agent nothing about what to do next.</span>
		</div>
	</div>
</section>

This is the surface I want teams to be suspicious of. It makes the integration feel complete, but it pushes product judgment, permissions, side effects, and recovery into the model's guesswork.

<section class="tool-blueprint" aria-labelledby="refund-tool-title">
	<div class="tool-blueprint__intro">
		<p>Concrete Example</p>
		<h3 id="refund-tool-title">Tool: Check Refund Eligibility</h3>
		<span>A narrow tool an agent can use before it drafts a customer refund response.</span>
	</div>
	<div class="tool-blueprint__flow" aria-label="Refund eligibility tool flow">
		<div class="tool-blueprint__step">
			<b>Intent</b>
			<span>Is this order eligible for a refund?</span>
		</div>
		<div class="tool-blueprint__arrow" aria-hidden="true">→</div>
		<div class="tool-blueprint__step">
			<b>Input</b>
			<span>Only order ID and customer ID.</span>
		</div>
		<div class="tool-blueprint__arrow" aria-hidden="true">→</div>
		<div class="tool-blueprint__step">
			<b>Guardrail</b>
			<span>Run with the current employee's permissions.</span>
		</div>
		<div class="tool-blueprint__arrow" aria-hidden="true">→</div>
		<div class="tool-blueprint__step tool-blueprint__step--tool">
			<b>Focused Tool</b>
			<span>Check policy. Do not send email or update records.</span>
		</div>
		<div class="tool-blueprint__arrow" aria-hidden="true">→</div>
		<div class="tool-blueprint__step">
			<b>Output</b>
			<span>Yes or no, reason, next step.</span>
		</div>
	</div>
	<div class="tool-blueprint__contract">
		<div>
			<b>Good Error</b>
			<span>Missing order? Ask for the order ID. Do not guess.</span>
		</div>
		<div>
			<b>Recovery</b>
			<span>Stale policy data? Stop and ask a human to verify.</span>
		</div>
		<div>
			<b>Audit Trail</b>
			<span>Record who asked, what policy was used, and why it answered yes or no.</span>
		</div>
	</div>
</section>

This is the difference between a backend API and an agent tool. The agent does not get "access to refunds." It gets one small capability with a clear job, safe boundaries, useful errors, and an output it can reason about.

That last part matters. Agents work better when tools explain the next move. A vague failure like "400 Bad Request" creates guesswork. A useful failure says, "This customer has no active subscription. Ask for an account email or stop the refund workflow."
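
Here is a minimal sketch of what the eligibility tool could look like in code. Everything in it is an assumption for illustration: the in-memory `ORDERS` table, the 30-day `REFUND_WINDOW`, and the `EligibilityResult` shape stand in for whatever your order system and policy actually are.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical in-memory data standing in for the order system.
ORDERS = {
    "ord_1001": {"customer_id": "cus_42", "delivered_on": date(2026, 5, 20)},
}
REFUND_WINDOW = timedelta(days=30)  # assumed policy, for illustration only


@dataclass(frozen=True)
class EligibilityResult:
    ok: bool                  # False means the tool could not answer
    eligible: Optional[bool]  # None when ok is False
    reason: str
    next_step: str            # what the agent should do with this result


def check_refund_eligibility(
    order_id: str, customer_id: str, today: date
) -> EligibilityResult:
    """Read-only check. Never sends email or updates records."""
    order = ORDERS.get(order_id)
    if order is None:
        return EligibilityResult(
            False, None, f"No order with ID {order_id!r}.",
            "Ask the customer for the order ID. Do not guess.",
        )
    if order["customer_id"] != customer_id:
        return EligibilityResult(
            False, None, "Order belongs to a different customer.",
            "Stop and hand off to a human.",
        )
    if today - order["delivered_on"] > REFUND_WINDOW:
        return EligibilityResult(
            True, False, "Delivered more than 30 days ago.",
            "Explain the refund window. Do not promise a refund.",
        )
    return EligibilityResult(
        True, True, "Within the 30-day refund window.",
        "Draft a customer update for review.",
    )


result = check_refund_eligibility("ord_1001", "cus_42", today=date(2026, 6, 2))
print(result.eligible, "|", result.next_step)
```

Every branch returns the same shape, and every branch ends with a next step. The failure paths are as designed as the success path.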

Tool design is instruction design.

## Tool Abundance Creates Decision Debt

Giving an agent many tools feels like giving it more power. Often it gives the system more ways to be confused.

Every tool adds decision debt:

- When should the agent use it?
- When should it avoid it?
- How does it compare with similar tools?
- What data does it reveal?
- What side effects can it create?
- How do we evaluate correct use?

If the answer is "the model will figure it out," the system boundary is too loose.

Senior AI engineering is not just connecting more tools. It is deciding which tools should exist, what they mean, and how the agent should reason about them.

The useful move is usually curation:

- Fewer tools with clearer names
- Task-specific tools over generic APIs
- Read-only tools before write tools
- Structured outputs over prose blobs
- Explicit risk levels for side effects
- Confirmation paths for irreversible actions
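
The last two items can be enforced in code rather than in the prompt. A minimal sketch, assuming a hypothetical `Risk` enum and a `call_tool` wrapper that sits between the agent and every tool; the tool names and lambdas are placeholders:

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable, Optional


class Risk(IntEnum):
    READ_ONLY = 0
    REVERSIBLE_WRITE = 1
    IRREVERSIBLE = 2


@dataclass(frozen=True)
class Tool:
    name: str
    risk: Risk
    run: Callable[..., dict]


def call_tool(tool: Tool, confirmed_by: Optional[str] = None, **kwargs) -> dict:
    """Run a tool, but hold irreversible actions until a human confirms."""
    if tool.risk == Risk.IRREVERSIBLE and confirmed_by is None:
        return {
            "status": "needs_confirmation",
            "tool": tool.name,
            "next_step": "Show the planned action to a human and wait for approval.",
        }
    return {"status": "ok", "result": tool.run(**kwargs)}


# Placeholder tools for the sketch; real ones would call backend systems.
find_open_incidents = Tool(
    "find_open_incidents", Risk.READ_ONLY, lambda customer_id: {"incidents": []}
)
issue_refund = Tool(
    "issue_refund", Risk.IRREVERSIBLE, lambda order_id: {"refunded": order_id}
)

print(call_tool(find_open_incidents, customer_id="cus_42")["status"])  # ok
print(call_tool(issue_refund, order_id="ord_1001")["status"])  # needs_confirmation
```

The agent never decides whether confirmation is required. The risk level is a property of the tool, and the wrapper applies it the same way every time.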

An agent with ten excellent tools will often outperform an agent with a hundred ambiguous ones.

![A hand-drawn agent moving from a crowded wall of ambiguous integrations to a small shelf of curated tools.](/blog/agent-experience-tool-abundance.jpg)

_The agent does not need every connector in the building. It needs the few tools that make the next decision obvious._

## Deferred Loading Does Not Fix Bad Tool Design

Deferred tool loading can reduce context pressure. It can keep the agent from seeing every possible tool at once. It can make large tool ecosystems more manageable.

But deferral changes when a tool appears, not whether the tool is good.

If the underlying tool is ambiguous, overbroad, poorly named, under-documented, or dangerous, loading it later only delays the problem. The agent still has to interpret a weak contract at the moment of action.

The deeper question is not, "Can the agent discover this tool?"

The deeper question is, "Can the agent understand the job this tool performs, use it safely, recover from failure, and explain what happened?"

That is agent experience.

## Design for Intent, State, and Recovery

Agent tools should be designed around the workflow, not the backend system.

For each tool, I want to know three things.

First: what intent does this tool serve?

The tool name and description should reflect the agent's goal, not the internal service boundary. "createRefundAssessment" is better than "callPaymentsV2." It tells the agent what kind of reasoning belongs before and after the call.

Second: what state does this tool require and produce?

Agents need stable handles. If a tool returns ten similar records with weak identifiers, the next step becomes fragile. Outputs should include the evidence needed to continue: IDs, timestamps, confidence, policy flags, links, and human-readable summaries.
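
As a sketch of what "evidence needed to continue" can look like, here is one possible output record. The `IncidentRecord` fields, the sample values, and the `example.com` link are hypothetical; the idea is only that each result carries a stable ID and enough context for the next call.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class IncidentRecord:
    incident_id: str               # stable handle the agent passes to the next tool
    opened_at: str                 # ISO 8601 timestamp in UTC
    match_confidence: float        # 0.0 to 1.0, how well this matches the query
    policy_flags: tuple[str, ...]  # e.g. rules that restrict what may happen next
    link: str                      # where a human can verify the record
    summary: str                   # human-readable, one sentence


def to_agent_payload(records: list[IncidentRecord]) -> dict:
    """Return records best match first, with an explicit count."""
    ordered = sorted(records, key=lambda r: r.match_confidence, reverse=True)
    return {"count": len(ordered), "records": [asdict(r) for r in ordered]}


records = [
    IncidentRecord("inc_7780", "2026-05-28T09:14:00Z", 0.41, (),
                   "https://example.com/incidents/inc_7780",
                   "Login errors reported by a different team."),
    IncidentRecord("inc_7781", "2026-05-30T16:02:00Z", 0.93, ("customer_visible",),
                   "https://example.com/incidents/inc_7781",
                   "Checkout failures for this customer's account."),
]

payload = to_agent_payload(records)
print(payload["records"][0]["incident_id"])  # inc_7781
```

Two similar incidents are no longer a coin flip. The agent can pick by confidence, cite the ID in its next call, and hand a human the link.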

Third: how does the agent recover?

Every production tool should describe recoverable failures. Missing permission, stale data, duplicate records, validation errors, rate limits, and partial success should not collapse into generic exceptions.
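
One way to keep those cases from collapsing is to translate backend exceptions into a small, named set of failures at the tool boundary. This is a sketch under assumptions: `FailureKind`, `ToolFailure`, the two custom exception classes, and the recovery wording are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class FailureKind(Enum):
    MISSING_PERMISSION = "missing_permission"
    STALE_DATA = "stale_data"
    DUPLICATE_RECORDS = "duplicate_records"
    VALIDATION_ERROR = "validation_error"
    RATE_LIMITED = "rate_limited"
    PARTIAL_SUCCESS = "partial_success"


# kind -> (retryable, next step for the agent)
RECOVERY = {
    FailureKind.MISSING_PERMISSION: (False, "Stop. Ask a human with access to run this step."),
    FailureKind.STALE_DATA: (False, "Stop and ask a human to verify the data."),
    FailureKind.DUPLICATE_RECORDS: (False, "Ask the user which record they mean."),
    FailureKind.VALIDATION_ERROR: (False, "Fix the input named in the detail, then call again."),
    FailureKind.RATE_LIMITED: (True, "Wait, then retry the same call once."),
    FailureKind.PARTIAL_SUCCESS: (False, "Report what completed. Do not repeat the whole call."),
}


class StalePolicyData(Exception):
    """Hypothetical backend error: policy data is older than allowed."""


class RateLimited(Exception):
    """Hypothetical backend error: too many requests."""


# Duplicates and partial success usually come from tool logic, not exceptions.
_BY_EXCEPTION = {
    PermissionError: FailureKind.MISSING_PERMISSION,
    ValueError: FailureKind.VALIDATION_ERROR,
    StalePolicyData: FailureKind.STALE_DATA,
    RateLimited: FailureKind.RATE_LIMITED,
}


@dataclass(frozen=True)
class ToolFailure:
    kind: FailureKind
    retryable: bool
    next_step: str
    detail: str


def to_tool_failure(exc: Exception) -> ToolFailure:
    """Translate a known backend exception into a failure the agent can act on."""
    for exc_type, kind in _BY_EXCEPTION.items():
        if isinstance(exc, exc_type):
            retryable, next_step = RECOVERY[kind]
            return ToolFailure(kind, retryable, next_step, str(exc))
    raise exc  # unknown failures stay loud instead of being misclassified


failure = to_tool_failure(RateLimited("429 from payments service"))
print(failure.kind.value, failure.retryable)  # rate_limited True
```

Unknown exceptions are re-raised on purpose. A failure the team has not classified is a gap in the contract, and it should show up as one.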

Recovery is part of the interface.

![A hand-drawn agent and engineer holding a toolbox with input, output, permission, recovery, and logging cues.](/blog/agent-experience-tool-contract.jpg)

_A good agent-facing tool is not just callable. It carries intent, safety, state, and recovery in the contract._

## Make Tool Quality Observable

If tools are the agent's environment, tool quality needs observability.

I want dashboards and evals that show:

- Which tools are called most often
- Which tools are avoided even when they should be used
- Which tools produce retries
- Which errors cause agent loops
- Which outputs lead to bad final answers
- Which tools increase latency or cost
- Which tool descriptions correlate with misuse
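
Several of these fall out of a plain call log. The sketch below assumes a hypothetical log format (one dict per call with `run`, `tool`, `ok`, and `ms` keys) and counts a retry whenever a tool is called immediately after its own failure in the same run.

```python
from collections import defaultdict

# Hypothetical call log, in call order. A real one would come from tracing.
CALLS = [
    {"run": "r1", "tool": "check_refund_eligibility", "ok": True, "ms": 120},
    {"run": "r1", "tool": "search_orders", "ok": False, "ms": 300},
    {"run": "r1", "tool": "search_orders", "ok": False, "ms": 310},
    {"run": "r1", "tool": "search_orders", "ok": True, "ms": 290},
    {"run": "r2", "tool": "check_refund_eligibility", "ok": True, "ms": 110},
]


def tool_report(calls: list[dict]) -> dict[str, dict[str, int]]:
    """Aggregate per-tool call counts, failures, retries, and total latency."""
    stats = defaultdict(
        lambda: {"calls": 0, "failures": 0, "retries": 0, "total_ms": 0}
    )
    previous = {}  # run id -> the previous call in that run
    for call in calls:
        s = stats[call["tool"]]
        s["calls"] += 1
        s["total_ms"] += call["ms"]
        if not call["ok"]:
            s["failures"] += 1
        prev = previous.get(call["run"])
        # A retry: same tool, same run, right after that tool failed.
        if prev and prev["tool"] == call["tool"] and not prev["ok"]:
            s["retries"] += 1
        previous[call["run"]] = call
    return dict(stats)


report = tool_report(CALLS)
for name, s in sorted(report.items(), key=lambda kv: -kv[1]["retries"]):
    print(name, s)
```

In this sample, `search_orders` fails twice and is retried twice before it succeeds. That pattern points at the tool's input contract, not at the model.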

This is where agent experience becomes an engineering discipline.

You cannot improve what you cannot see. If the agent keeps making bad decisions, the cause may not be the model. It may be that the environment gives the model weak affordances.

## Treat Agents as Operators

The best mental model I have found is to treat the agent like a new kind of operator.

Not a human operator. Not a service account. Something in between.

An operator needs:

- A clear job
- A small set of reliable tools
- Context at the moment of work
- Permission boundaries
- Feedback when something fails
- A way to ask for help
- A log of what happened

That framing moves the team away from "how many integrations can we attach?" and toward "what operating environment would make this agent competent?"

That is a healthier question.

## The Better Architecture

I am not arguing against MCP or tool protocols. They are useful standards for making tools available.

I am arguing against confusing availability with usefulness.

The better architecture has two layers:

1. A connection layer that makes tools discoverable and callable.
2. An experience layer that makes tools understandable, safe, and effective.

Most of the value is in the second layer.

That layer includes naming, schemas, permission checks, scoped capabilities, examples, evals, observability, confirmations, recovery semantics, and product judgment about what should not be a tool at all.

The future of agents will not be won by connecting everything. It will be won by designing the working environment so well that agents can do the right thing with less guesswork.

That is agent experience.
