
TL;DR
AI agents have rocketed from weekend hacks to production workloads in under a year. The same glue that makes them wildly flexible—universal connectors like MCP, plain‑language manifests, and autonomous decision loops—also unlocks brand‑new attack paths. Before you go all in, bake in least privilege, deep observability, and a risk framework that evolves at the same pace as the tech.
Why Are We Talking About AI Agents Now?
Last spring, “AI agent” was a buzzword mostly confined to GitHub repos and Friday hack‑a‑thons. Twelve months later it’s on enterprise roadmaps and centre‑stage in every SaaS keynote.
What exactly is an AI agent?
- LLM at the core—memory + goals wrapped around it. The language model handles reasoning. A planning loop (decide → act → observe → decide…) plus short‑term memory turns that cognition into purposeful action.
- Tool‑augmented, not tool‑agnostic. Agents execute actions through tools—APIs, RPA bots, SQL queries, shell commands. Picture an intern who already knows every keyboard shortcut and never gets tired.
- Autonomous‑ish. Modern agents respect guard‑rails—task scopes, timeouts, approval gates—but they’ve evolved from one‑shot prompts to workflows that can run for minutes or hours.
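The planning loop above can be sketched in a few lines. This is a toy illustration, not a real framework: `llm_decide` stands in for an actual model call, and the tool name is hypothetical.

```python
# Minimal sketch of the decide -> act -> observe loop with a bounded-step guard-rail.
# "llm_decide" is a stand-in for a real model call; tool names are hypothetical.

def llm_decide(goal, memory):
    # Toy policy: finish once the last observation says the job is done.
    if memory and memory[-1] == "tyre replaced":
        return ("finish", None)
    return ("call_tool", "replace_tyre")

def run_agent(goal, tools, max_steps=5):
    memory = []  # short-term memory of observations
    for _ in range(max_steps):  # guard-rail: the loop cannot run forever
        action, tool = llm_decide(goal, memory)   # decide
        if action == "finish":
            return memory
        observation = tools[tool]()               # act
        memory.append(observation)                # observe
    return memory

tools = {"replace_tyre": lambda: "tyre replaced"}
result = run_agent("make my car road-ready", tools)
```

The `max_steps` cap is the simplest form of the timeout/approval guard-rails mentioned above.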
An example MCP flow
```mermaid
sequenceDiagram
    participant ToolService as "Tool Service"
    participant AgentHost as "Agent Host"
    participant LLM
    participant ExecBroker as "Execution Broker"
    ToolService ->> AgentHost: 1. Publish capability manifest
    AgentHost ->> LLM: 2. Expose functions
    LLM ->> AgentHost: 3. Return call spec
    AgentHost ->> ExecBroker: 4. Sign & forward request
    ExecBroker ->> ToolService: 5. API call (JWT/OAuth)
    ToolService -->> ExecBroker: 6. Response payload
    ExecBroker -->> AgentHost: 7. Enforce policy & return result
    AgentHost -->> LLM: 8. Feed result back to LLM
```
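Step 2 of this flow—the Agent Host exposing functions to the LLM—amounts to converting each manifest entry into a function‑calling schema. The sketch below mirrors the shape of common "tools" APIs (OpenAI/Anthropic style) but is illustrative, not an exact MCP schema; the action name and fields are assumptions.

```python
# Hypothetical conversion: one manifest entry -> one function-calling schema
# visible to the LLM. Field names approximate common tool-use APIs.

def manifest_to_tool_schema(name, entry):
    return {
        "name": name,
        "description": entry["description"],
        "parameters": {
            "type": "object",
            "properties": {p: {"type": "string"} for p in entry["params"]},
            "required": entry["params"],
        },
    }

schema = manifest_to_tool_schema(
    "lookup_contact", {"description": "Find a CRM contact", "params": ["email"]}
)
```

Everything the LLM can later invoke flows through this conversion, which is why manifest review (see below) matters so much.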
How They’ve Evolved in 12 Months
Then (mid‑2024) | Now (mid‑2025) |
---|---|
Proof‑of‑concepts in notebooks | Production frameworks (LangGraph, CrewAI, OpenAI Assistants) with state machines |
Stateless prompts | Native Retrieval‑Augmented Generation (RAG) pipelines & vector stores |
Manual JSON glue for APIs | Structured function calling built into OpenAI, Anthropic, Mistral |
Ad‑hoc hosting | Deployed inside SaaS platforms (Salesforce, Microsoft 365, Atlassian) with SSO |
The bottom line: agents have graduated from lab toys to business operators—so the security stakes are now board‑level.
Meet MCP — The Universal Connector for Agents
If REST is a set of private driveways, MCP (Model Context Protocol) is the public highway that lets any agent reach any tool without bespoke code.
```mermaid
graph TD
    %% Core reasoning loop
    LLM["LLM (reason & plan)"]
    Hub["Agent Host<br/>(MCP hub)"]
    LLM <--> Hub
    %% Tool "peripherals" plugged in via MCP
    CRM["CRM API"]
    ERP["ERP System"]
    Email["Email Service"]
    Git["GitHub Repo"]
    Payments["Payments Gateway"]
    %% USB-cable-style links
    Hub -- "MCP cable" --> CRM
    Hub -- "MCP cable" --> ERP
    Hub -- "MCP cable" --> Email
    Hub -- "MCP cable" --> Git
    Hub -- "MCP cable" --> Payments
```
Anatomy of an MCP Transaction
# | Actor | What Happens | Key Security Questions |
---|---|---|---|
1 | Tool Service | Publishes a Capability Manifest (YAML/JSON) describing each action, parameters, return types, auth scheme. | Who reviewed the manifest? Does it over‑expose actions? |
2 | Agent Host | Ingests the manifest and converts it to structured function calls visible to the LLM. | Are signatures & versions validated? |
3 | LLM | Chooses an action, fills parameters, returns a call spec. | Is the prompt chain injecting extra params? |
4 | Execution Broker | Signs request (JWT/OAuth), enforces policy (rate limits, scopes), calls the API. | Are both what and why logged? |
5 | Tool Service | Executes, returns result; loop continues until the agent decides it’s done. | Is output filtered before it re‑enters the LLM? |
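The policy check in step 4 is where a mis‑scoped token gets caught—or doesn't. Here is a minimal sketch of a broker‑side scope check against a manifest; the manifest structure, action names, and scope strings are all hypothetical, not a real MCP schema.

```python
# Hypothetical capability manifest (step 1) and a broker-side scope check (step 4).
# An agent token that only holds "crm.read" must not be able to delete records.

manifest = {
    "tool": "crm",
    "actions": {
        "lookup_contact": {"params": ["email"], "scope": "crm.read"},
        "delete_record":  {"params": ["id"],    "scope": "crm.admin"},
    },
}

def broker_authorize(call, granted_scopes):
    """Enforce policy before the API call: the token must hold the action's scope."""
    action = manifest["actions"].get(call["action"])
    if action is None:
        return False, "unknown action"
    if action["scope"] not in granted_scopes:
        return False, f"missing scope {action['scope']}"
    return True, "ok"

ok, reason = broker_authorize({"action": "delete_record"}, granted_scopes={"crm.read"})
```

If the manifest itself over‑declares scopes (the "who reviewed the manifest?" question in row 1), this check passes and the damage is done—policy enforcement is only as good as the manifest review feeding it.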
Why MCP Took Off
- Integration fatigue is real — one manifest beats a thousand SDKs.
- Language‑native — LLMs can reason over plain‑text manifests.
- Vendor‑neutral — MCP was published as an open standard (originating with Anthropic) and has seen rapid multi‑vendor adoption.
Flip‑side: a single mis‑scoped token or rogue manifest can expose an entire downstream estate.
Capability Expansion — From Changing a Tyre to Selling the Car
Imagine asking a friend who has never fixed a car to change one tyre. That’s today’s simplest agent: single task, narrow scope.
Step | New Capability Added | How It Maps to Agents |
---|---|---|
0 | Change a tyre | Base agent with one tool (replace_tyre) |
1 | Check oil | New manifest entry gives the agent a measure_oil action |
2 | Run electrical checks | Add battery_test & alternator_test functions |
3 | Full diagnostics via another mechanic | Agent chains to a second agent specialised in OBD‑II scans (diagnose_engine) |
4 | Check historic damage | Integrates with an insurance‑database agent (lookup_claims_history) |
5 | Car valuation | Connects to a pricing‑API agent (estimate_value) |
6 | Replace windscreen | Books a glass‑repair agent (schedule_windscreen) |
7 | Deep clean | Calls detailing agent (order_detailing) |
8 | Sell the car | Hands off to marketplace agent (list_vehicle_for_sale) |
Take‑away: Layering capabilities—and linking to other agents—means the same initial prompt (“make my car road‑ready”) can now spawn a full service, valuation, cosmetic upgrade, and marketplace listing. Power scales non‑linearly with each new manifest.
Identity 101 — Autonomous vs Delegated Agents
```mermaid
graph LR
    A["Change Tyre<br/>(base agent)"] --> B["Check Oil<br/>(measure_oil)"]
    B --> C["Electrical Checks<br/>(battery & alternator)"]
    C --> D["Engine Diagnostics<br/>(OBD-II agent)"]
    D --> E["Historic Damage Lookup<br/>(insurance DB)"]
    E --> F["Car Valuation<br/>(pricing API)"]
    F --> G["Windscreen Replacement<br/>(glass-repair agent)"]
    G --> H["Deep Clean<br/>(detailing agent)"]
    H --> I["Sell Car<br/>(marketplace agent)"]
```
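The "power scales non‑linearly" point is easy to quantify: to audit an agent, you must walk every agent‑to‑agent link and count the actions it can ultimately reach. A minimal sketch, with a hypothetical registry mirroring the car example:

```python
# Sketch of capability expansion: each manifest entry or agent link widens what a
# single prompt can trigger. Registry structure and names are hypothetical.

def reachable_actions(registry, start):
    """Walk agent-to-agent links and collect every action reachable from `start`."""
    seen, stack, actions = set(), [start], set()
    while stack:
        agent = stack.pop()
        if agent in seen:
            continue
        seen.add(agent)
        actions.update(registry[agent]["actions"])
        stack.extend(registry[agent].get("links", []))
    return actions

registry = {
    "mechanic":    {"actions": {"replace_tyre", "measure_oil"}, "links": ["diagnostics"]},
    "diagnostics": {"actions": {"diagnose_engine"}, "links": ["marketplace"]},
    "marketplace": {"actions": {"list_vehicle_for_sale"}},
}
total = reachable_actions(registry, "mechanic")
```

Note that the mechanic agent's own manifest lists only two actions, yet four are reachable—the transitive closure, not the local manifest, is the true permission surface.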
Attribute | Autonomous Identity | Delegated Identity |
---|---|---|
Control Flow | Agent acts on its own timetable | Agent acts on behalf of a named human |
Permissions Source | Role granted directly to the agent | Inherits the delegating user’s rights |
Accountability | Agent owner signs the risk | Human user answers for actions |
Typical Use Case | Auto‑ordering supplies | Executive meeting scheduler |
Top Risk | Over‑permissioned robots | Mis‑attribution of actions |
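The two identity models show up concretely in token claims. The sketch below is illustrative: the field names loosely follow the OAuth 2.0 token‑exchange convention, where an `act` claim records an actor working on behalf of the subject, but the specific subjects and scopes are assumptions.

```python
# Illustrative token claims for the two identity models (names are hypothetical).

autonomous_claims = {
    "sub": "agent:supply-orderer",      # the agent IS the principal
    "scope": "inventory.order",         # role granted directly to the agent
}

delegated_claims = {
    "sub": "user:alice",                # actions are attributed to the human
    "act": {"sub": "agent:scheduler"},  # the agent is recorded as the actor
    "scope": "calendar.write",          # inherited from Alice's rights
}

def accountable_party(claims):
    # Delegated: the human answers for actions; autonomous: the agent's owner does.
    return claims["sub"]

who = accountable_party(delegated_claims)
```

Keeping the actor (`act`) distinct from the subject (`sub`) is what prevents the "mis‑attribution of actions" risk in the table above.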
Treat autonomous agents as new non‑human identities; grant least privilege from Day 1.
From Consumer Hype to Enterprise Reality
Consumer‑grade demos show the art of the possible. In business—especially regulated sectors—possible ≠ permissible.
Why it matters
- Regulations still apply — GDPR, HIPAA, PCI‑DSS and friends aren’t impressed by AI hype.
- Data mobility is risk mobility — MCP moves data between tools you don’t fully control, expanding the exfiltration blast radius.
- “Move fast” needs a seatbelt — Document workflows, validate output, and build fallbacks.
Practical guard‑rails
- Risk‑first design — Run every new MCP integration through your existing risk framework.
- Controlled experimentation — Sandboxes with synthetic/masked data before production.
- Granular observability — Log who, what, and why for each agent action; keep logs immutable and searchable.
- Clear accountability — Assign an owner for every agent and its delegated scope.
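The "granular observability" guard‑rail can be made concrete with a structured, tamper‑evident log. A minimal sketch—field names and the hash‑chain approach are assumptions, standing in for whatever append‑only store you actually use:

```python
# Sketch: every agent action logged with identity context (who), the call (what),
# and intent (why), hash-chained so tampering with history is detectable.

import json, hashlib

def log_agent_action(log, who, what, why):
    entry = {"who": who, "what": what, "why": why,
             "prev": log[-1]["hash"] if log else None}
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()).hexdigest()
    log.append(entry)
    return entry

log = []
log_agent_action(log, who="agent:scheduler (for user:alice)",
                 what="calendar.create_event",
                 why="prompt: book quarterly review")
```

Each entry carries the hash of its predecessor, so rewriting an old record breaks the chain—an approximation of "immutable" that real deployments would delegate to WORM storage or a SIEM.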
Blind Spots That Keep CISOs Awake
- Identity inversion — by the time a DB query hits Snowflake, the original user context is gone.
- Manifest drift — a delete_record action sneaks into main, and the agent picks it up instantly.
- Prompt injection via tool outputs — a poisoned payload steers the agent’s next step.
- Token sprawl — revoking secrets turns into whack‑a‑mole.
- Opaque cost & performance hotspots — autonomous loops fan out into hundreds of sub‑calls before any dashboard yells.
If you can’t see it, you can’t secure it. Instrument every hop.
Secure MCP Adoption Playbook (8‑Week Sprint Plan)
Week | Track | Actions | Success Metric |
---|---|---|---|
1 | Discovery | Inventory all agents, tools, manifests; tag by data sensitivity. | 100 % assets logged in CMDB |
2 | Threat Modelling | Run STRIDE (or similar) on the top‑3 critical workflows. | Risks registered & prioritised |
3 | Auth Hardening | Replace shared keys with OAuth2 client creds + per‑agent JWT claims. | Zero hard‑coded secrets in repos |
4 | Sandbox Validation | Replay high‑risk tasks with synthetic data in a controlled environment. | 0 PII leaks detected |
5 | Policy Encoding | Deploy policy‑as‑code (OPA, Cedar) to enforce scopes, timeouts, and concurrency limits. | All MCP calls pass policy gate |
6 | Observability | Ship OpenTelemetry traces to the SIEM; enable latency & anomaly alerts. | 24×7 dashboard live |
7 | Kill‑Switch | Implement one‑click disable per agent identity and test the escalation runbook. | MTTR < 10 min in red‑team drill |
8 | Training & Governance | Workshops for dev, ops, and compliance; update SDLC gates to include manifest review. | 80 %+ attendance; SDLC checklist updated |
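The week‑7 kill‑switch is, at its core, a per‑agent enable flag checked on every dispatch. A minimal sketch—in practice the flag would live in a shared store the Execution Broker consults, and the names here are hypothetical:

```python
# Sketch of a one-click, per-agent kill-switch enforced at the broker.

disabled_agents = set()

def kill_switch(agent_id):
    """One click: all further calls from this agent identity are refused."""
    disabled_agents.add(agent_id)

def broker_dispatch(agent_id, action):
    if agent_id in disabled_agents:
        raise PermissionError(f"agent {agent_id} is disabled")
    return f"executed {action}"

ok = broker_dispatch("agent-42", "lookup_contact")
kill_switch("agent-42")
try:
    broker_dispatch("agent-42", "lookup_contact")
    blocked = False
except PermissionError:
    blocked = True
```

Putting the check in the broker rather than the agent matters: a compromised or misbehaving agent cannot opt out of a gate it never passes through.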
Pro tip: Re‑run the playbook quarterly; the agent ecosystem mutates faster than classic software.
Going All In—Safely
AI agents and MCP are here to stay. They’ll automate drudge work, surface insights, and maybe even surprise us with creativity. But autonomy without oversight is just another word for risk.
- Design for least privilege—exactly as you do for service accounts.
- Instrument every action—capture identity context and intent.
- Iterate controls—as fast as you iterate models.
Do that, and you can ride the AI wave without wiping out.