April 25, 2026·7 min read·technical / standards / product-update

6 new agent-readiness checks — and why most scanners still don't run them

We just added checks for MCP server cards, Agent Skills indexes, Web Bot Auth, Cloudflare Content Signals, agentic commerce protocols, and Markdown content negotiation. Here's what each one is, what the ideal response looks like, and why ignoring them in 2026 is a mistake.

by Amit Gupta

Most website-audit tools you've used were designed to answer one question: does a human skimming a search result get what they want? That was a fine bar in 2015. In 2026, the fastest-growing traffic on the web is agentic — ChatGPT operator mode, Claude Code, Cursor, Perplexity's assistants, dozens of vertical agents — and they don't navigate like humans do. They read robots.txt. They probe /.well-known/*. They want JSON, not nav menus. They follow content-negotiation headers you probably haven't thought about since a 2003 RFC.

So we spent the last sprint adding six new checks — all covering emerging agent-oriented standards that almost no other scanner runs today. Three are well-known discovery artifacts that let agents introspect your capabilities. Two are policy + identity primitives that smooth WAF-level friction. One is a representation negotiation that matters more than its obscurity suggests.

Here's what changed, in plain English first, then in the detail your engineers will want.

For product people: the short version

Check	What it answers for agents	Who should care
MCP server card	"Do you expose tools I can call?"	SaaS products, API platforms, developer docs
Agent Skills index	"What discrete skills can I invoke on your site?"	SaaS, ecommerce, docs — anyone with an API
Web Bot Auth directory	"Are you a modern origin that supports signed bot traffic?"	Any site behind a WAF that cares about verified bot access
Cloudflare Content Signals	"May I use your content for AI training / inference?"	Every public-facing site
Agentic commerce protocols (ACP / UCP / MPP / x402)	"Can you be paid by an agent, not a human?"	Commerce, SaaS on metered billing, API providers
Markdown content negotiation	"Do you serve a clean Markdown body when I ask for one?"	Content publishers, docs sites, anyone who wants their content cited

Below, the detail.

1. MCP server cards — `/.well-known/mcp.json`

What it is. Anthropic's Model Context Protocol (MCP) gives agents a uniform way to discover and call "tools" (functions) on your service. The server card at /.well-known/mcp.json is the manifest: name, description, tools array, optional resources, auth.

The ideal response. Valid JSON, Content-Type: application/json, with at least name and a non-empty tools array (nested under capabilities in the example below). Each tool should carry name, description, inputSchema (JSON Schema), endpoint, method.

{
  "protocolVersion": "2024-11-05",
  "name": "acme-invoicing",
  "description": "Invoice and subscription operations",
  "capabilities": {
    "tools": [
      {
        "name": "list_invoices",
        "description": "List invoices for the authenticated customer.",
        "inputSchema": { "type": "object", "properties": {} },
        "endpoint": "https://acme.com/api/invoices",
        "method": "GET"
      }
    ]
  }
}

Why it matters. Once your card is live, your product is natively installable in Cursor, Claude Desktop, and (soon) ChatGPT. That's a new distribution surface that didn't exist 18 months ago.

How we check it. HEAD the path and reject catch-all SPA responses (a single-page app that answers every URL with its index page), then GET the body and parse it as JSON. We validate name (or serverName / title) plus a non-empty tools array. An SPA that returns 200 text/html at /.well-known/mcp.json fails the check, which is how most sites fail it today without realizing.
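
For readers who want to reproduce the logic, here is a minimal Python sketch of the validation step, not the scanner's actual code. It assumes the response has already been fetched, and the function name check_mcp_server_card is hypothetical. Accepting tools either at the top level or under capabilities is also an assumption, made so that the sample card above passes.

```python
import json


def check_mcp_server_card(status: int, content_type: str, body: str) -> tuple[bool, str]:
    """Return (passed, reason) for a fetched /.well-known/mcp.json response."""
    if status != 200:
        return False, f"unexpected status {status}"
    # A catch-all SPA typically answers every path with 200 text/html.
    if "json" not in content_type.lower():
        return False, f"non-JSON content type: {content_type}"
    try:
        card = json.loads(body)
    except json.JSONDecodeError:
        return False, "body is not valid JSON"
    if not isinstance(card, dict):
        return False, "top-level JSON value is not an object"
    name = card.get("name") or card.get("serverName") or card.get("title")
    if not name:
        return False, "missing name / serverName / title"
    # Assumption: tools may sit at the top level or under capabilities.tools.
    tools = card.get("tools")
    capabilities = card.get("capabilities")
    if tools is None and isinstance(capabilities, dict):
        tools = capabilities.get("tools")
    if not isinstance(tools, list) or not tools:
        return False, "tools array is missing or empty"
    return True, f"{name}: {len(tools)} tool(s)"
```

Passing status, content type, and body as plain arguments keeps the rule testable without a network call; any HTTP client can supply them.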

2. Agent Skills index — `/.well-known/agent-skills/index.json`

What it is. An enumerable list of discrete skills a site exposes — lighter than MCP's full protocol, heavier than a single OpenAPI blob. Each skill carries its endpoint, method, auth, and I/O schema.

The ideal response. JSON with a skills[] array; each entry has name, description, endpoint, method, input_schema.

Why it matters. Agent-skills indexes are the "jobs-to-be-done" layer for AI agents. An orchestrator can read your index and know: here are the four jobs this site does well, here's how I call them. Compared to scraping your nav menu and guessing, it's orders of magnitude faster — and a lot less error-prone.

How we check it. Same shape as MCP — HEAD + GET + JSON validate. We count skills and warn if the array is empty.
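
A rough Python sketch of the counting-and-warning step, under stated assumptions: the function name and the result dictionary are hypothetical, the per-skill field warnings are an illustrative extra based on the field list above, and an empty array is treated as a pass with a warning.

```python
import json

# Fields the "ideal response" above lists for each skill entry.
REQUIRED_FIELDS = ("name", "description", "endpoint", "method", "input_schema")


def check_agent_skills_index(body: str) -> dict:
    """Summarize a fetched agent-skills index: skill count plus warnings."""
    result = {"passed": False, "skill_count": 0, "warnings": []}
    try:
        index = json.loads(body)
    except json.JSONDecodeError:
        result["warnings"].append("body is not valid JSON")
        return result
    skills = index.get("skills") if isinstance(index, dict) else None
    if not isinstance(skills, list):
        result["warnings"].append("missing skills[] array")
        return result
    result["passed"] = True
    result["skill_count"] = len(skills)
    if not skills:
        result["warnings"].append("skills[] is empty")
    for i, skill in enumerate(skills):
        # Assumption: incomplete entries only warn, they do not fail the check.
        missing = [f for f in REQUIRED_FIELDS
                   if not isinstance(skill, dict) or f not in skill]
        if missing:
            result["warnings"].append(f"skill {i} is missing: {', '.join(missing)}")
    return result
```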

3. Web Bot Auth signing directory — `/.well-known/http-message-signatures-directory`

What it is. An IETF draft (draft-meunier-http-message-signatures-directory) that publishes a JWKS-like directory of public keys your site will accept on signed bot requests. Think of it as CORS for bot identity.

The ideal response. JSON with a keys[] array (or an empty array plus a note explaining you don't verify signed bots yet).

Why it matters. This is the mechanism that lets WAFs safely relax rules for verified bots — GPTBot, ClaudeBot, etc. — without opening the door to spoofing. Publishing the endpoint (even empty) signals you're a Web-Bot-Auth-aware origin. It's a small act that will matter more as more CDNs enforce signed identity.

How we check it. HEAD the path, GET the body, validate JSON. Empty keys still passes — we flag a warning only if the body is actively malformed.
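
The pass/warn rule is small enough to sketch in a few lines of Python. This is an illustration rather than the scanner's code: the function name is hypothetical, and treating a JSON body without a keys array as "malformed" is an assumption.

```python
import json


def check_signing_directory(body: str) -> str:
    """Return 'pass' or 'warn' for a fetched signatures-directory body."""
    try:
        directory = json.loads(body)
    except json.JSONDecodeError:
        return "warn"  # body is not JSON at all
    # Assumption: valid JSON without a keys[] array also counts as malformed.
    if not isinstance(directory, dict) or not isinstance(directory.get("keys"), list):
        return "warn"
    return "pass"  # an empty keys[] still passes
```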

4. Cloudflare Content Signals in robots.txt

What it is. Cloudflare's Content Signals spec extends robots.txt with a machine-readable declaration of how your content may be used: search indexing (search), AI input such as AI-generated search answers (ai-input), and AI training (ai-train). Syntax:

Content-Signal: search=yes, ai-input=yes, ai-train=no

Why it matters. The "should AI be allowed to train on this?" question is no longer hypothetical. Enterprise legal teams are starting to ask vendors for machine-readable answers. A one-line declaration in robots.txt is the cheapest, most defensible answer you can give — and it helps AI systems route your content correctly.

How we check it. Parse robots.txt for Content-Signal: (or Content-Usage:) directives, parse the declared values, and surface them in evidence so you can verify the policy on a re-scan.
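
The parsing step can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes both directives carry comma-separated name=value pairs; it is not a full robots.txt parser and ignores which User-agent group a directive belongs to.

```python
def parse_content_signals(robots_txt: str) -> list[dict[str, str]]:
    """Collect Content-Signal / Content-Usage directives from robots.txt text."""
    found = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")
        if not sep or key.strip().lower() not in ("content-signal", "content-usage"):
            continue
        signals = {}
        for pair in value.split(","):
            name, eq, setting = pair.partition("=")
            if eq and name.strip():
                signals[name.strip().lower()] = setting.strip().lower()
        found.append(signals)  # one dict per directive line, in file order
    return found
```

Returning the parsed values rather than a bare pass/fail is what makes it possible to show the declared policy as evidence.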

5. Agentic commerce protocols — ACP, UCP, MPP, x402

What it is. Four overlapping standards for letting agents pay. ACP = Agentic Commerce Protocol. UCP = Universal Commerce Protocol. MPP = Machine Payments Protocol. x402 = a protocol that repurposes the HTTP 402 "Payment Required" status code as a handshake for micropayments.

The ideal response. A JSON capability document at /.well-known/{acp,ucp,mpp,x402}.json describing your protocol version and supported actions.

Why it matters. Agents that can pay unlock a different economic model — per-query API access, per-purchase completion fees, per-booking transactions. Early adopters are going to get disproportionate agent traffic simply because they're the only ones who can be transacted with programmatically. This is a land-grab window.

How we check it. HEAD each well-known path with catch-all-SPA rejection, then GET the body and parse as JSON. We also check the homepage HTML for protocol mentions as a secondary signal.
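
As a sketch of the two inputs to this check, the Python below builds the four well-known URLs and scans homepage HTML for protocol names. Both function names are hypothetical, and matching whole words case-insensitively is an assumption about what counts as a "mention".

```python
import re
from urllib.parse import urljoin

PROTOCOLS = ("acp", "ucp", "mpp", "x402")


def commerce_probe_urls(origin: str) -> dict[str, str]:
    """Map each protocol to the well-known URL the check would request."""
    return {p: urljoin(origin, f"/.well-known/{p}.json") for p in PROTOCOLS}


def homepage_mentions(html: str) -> list[str]:
    """Secondary signal: protocols named as whole words in homepage HTML."""
    return [p for p in PROTOCOLS
            if re.search(rf"\b{re.escape(p)}\b", html, re.IGNORECASE)]
```

A mention in HTML is weak evidence on its own, which is presumably why it only serves as a secondary signal next to the capability documents.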

6. Markdown content negotiation — `Accept: text/markdown`

What it is. An agent sends Accept: text/markdown on a GET request. An agent-aware origin returns the same page's content as Markdown — stripped of layout, CSS, tracking pixels, and JavaScript.

The ideal response. Content-Type: text/markdown with a clean, citable Markdown body. Or, at minimum, a body that looks like Markdown (starts with #, has lists, isn't HTML).

Why it matters. The same content usually costs far fewer tokens as Markdown than as HTML, and the structural cues (headings, lists, code fences) survive into the agent's context window exactly as you'd want them cited. Every extra kilobyte of <div> scaffolding is context you're throwing away.

How we check it. GET the homepage with Accept: text/markdown, text/plain;q=0.8, text/html;q=0.1, then inspect the response's content-type and body shape. If we see HTML back, the check fails, and the fix is usually a five-line edge-middleware rule.
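
One way to express the "content-type and body shape" test is the Python sketch below. The function names and the exact heuristics (a leading # or at least one list item, and no HTML preamble) are assumptions for illustration, not the scanner's real rules.

```python
import re

# The Accept header the check sends, as described above.
ACCEPT_HEADER = "text/markdown, text/plain;q=0.8, text/html;q=0.1"


def looks_like_markdown(body: str) -> bool:
    """Heuristic: starts with a heading or contains list items, and is not HTML."""
    text = body.lstrip()
    if re.match(r"(?i)<!doctype html|<html", text):
        return False
    has_heading = text.startswith("#")
    has_list = re.search(r"(?m)^\s*(?:[-*+]|\d+\.)\s+\S", text) is not None
    return has_heading or has_list


def check_markdown_negotiation(content_type: str, body: str) -> bool:
    """Pass on an explicit text/markdown type, or on a Markdown-shaped body."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/markdown":
        return True
    if media_type == "text/html":
        return False
    return looks_like_markdown(body)
```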

How to fix all six in a weekend

Two clean paths:

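If your site is static-ish (marketing site, docs, small SaaS): add three files under your public/.well-known/ directory (mcp.json, agent-skills/index.json, http-message-signatures-directory) and one Content-Signal line to robots.txt. Deploy. Done. That ticks four of six; the commerce check additionally needs a capability document at one of the /.well-known/{acp,ucp,mpp,x402}.json paths.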
If you want the Markdown-negotiation win too: add a Cloudflare Worker (or equivalent edge rule) that intercepts GET / with Accept: text/markdown and returns a curated Markdown representation. The fastest prototype is to serve your llms.txt at / on that negotiation; we suspect most agents won't notice the difference.
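
The edge rule itself would normally live in a Worker, but the decision it makes is language-neutral. The Python sketch below shows that decision only; wants_markdown and negotiate are hypothetical names, and the tie-breaking rule (first listed type wins on equal q-values) is a simplifying assumption rather than a complete Accept-header implementation.

```python
def wants_markdown(accept_header: str) -> bool:
    """True if text/markdown has the highest q-value among the listed types."""
    best_type, best_q = None, -1.0
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > best_q:  # strict comparison: the first listed type wins ties
            best_type, best_q = media_type, q
    return best_type == "text/markdown" and best_q > 0


def negotiate(path: str, accept_header: str,
              markdown_body: str, html_body: str) -> tuple[str, str]:
    """Return (content_type, body) for a GET request."""
    if path == "/" and wants_markdown(accept_header):
        return "text/markdown; charset=utf-8", markdown_body
    return "text/html; charset=utf-8", html_body
```

A real deployment would also want to send Vary: Accept so that caches keep the Markdown and HTML representations apart.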

What we did ourselves

Eating our own dogfood: after shipping these checks, we published our own /.well-known/mcp.json (5 tools), /.well-known/agent-skills/index.json (5 skills), a Web Bot Auth directory, and Cloudflare Content Signals in robots.txt. Our Agent Readiness Technical pillar jumped from ~75 to 97/100. We think we're the first AI-readiness scanner whose findings agents can both audit and call directly.

Run the checks on your site

All six are live now on every scan. No plan changes, no upgrades, nothing to turn on: just run a free scan against your site and look for these keys in the Technical pillar:

mcp_server_card
agent_skills_index
web_bot_auth
content_signals
commerce_protocols
markdown_negotiation

Each ships with evidence (the exact URL, status, content-type, and body snippet we saw) plus a Copy LLM fix prompt button — paste the prompt into Cursor or Claude and you'll usually have a fix in minutes.

See you on the leaderboard.

Now it's your turn

How does your site score?

Run the free Agent Readiness Score™ on your homepage. Weighted for your business type. No signup required for your first scan.
