Your Next Customer Might Be an AI Agent. Will You Let Them In?
The question isn't whether to accommodate agent-mediated commerce. It's whether your infrastructure can support it.
Amazon's lawsuit against Perplexity AI in November 2025 crystallizes the problem. Amazon accuses Perplexity of using AI agents to access customer accounts without disclosure, masking automated activity as human browsing, in violation of computer fraud laws and its terms of service. Perplexity argues Amazon is bullying innovation in user-chosen AI assistants. The legal fight reveals what digital storefronts now face: deciding which automated visitors get in and how to verify them.
This isn't theoretical. AWS Bedrock AgentCore ships with Web Bot Auth support to reduce CAPTCHA friction. Cloudflare blocks AI crawlers by default for new customers as of July 1, 2025. For legal teams, that means documentation requirements around agent access policies and delegated authority scoping. For product teams, it means reconfiguring CDN rules, exposing structured data that machines can read, and building transaction endpoints that agents invoke programmatically instead of navigating checkout flows designed for humans.
Dazza Greenwood's November 2025 analysis "Existing on the New Web" maps three protocol layers that teams need to configure: accessibility controls, legibility standards, and actionability endpoints. Here's what that means in practice.
Accessibility layer: distinguishing legitimate agents from threat traffic
Most web infrastructure operates on a binary access model inherited from two decades of anti-scraping defense: humans get through, bots get blocked. The framework assumes all automated traffic represents hostile activity—credential stuffing, content theft, competitive intelligence gathering. That design worked when "bot" meant malicious actor, but the category has split into distinct functions with different access requirements and business implications.
Training crawlers like Common Crawl's CCBot, OpenAI's GPTBot, and Anthropic's ClaudeBot extract content periodically to build model datasets. These operate without real-time user intent and harvest information for corpus assembly rather than immediate retrieval. Retrieval bots like Perplexity and ChatGPT with browsing fetch live information to augment AI-generated answers, surfacing content in synthesized responses when users ask questions. Transaction agents act as direct proxies for users executing specific commercial goals: booking reservations, comparing insurance quotes, scheduling appointments, placing orders. This third category carries purchasing authority and represents actual customer demand.
The access control problem arises because these agent types look identical at the network perimeter. User-agent strings are trivially spoofable. Traffic patterns can be mimicked. Traditional bot detection mechanisms can't distinguish a legitimate agent placing an order from a malicious scraper harvesting pricing data. Legal teams need agent access policies that specify which automated traffic to permit under what verification conditions. Product teams need to reconfigure CDN defaults that may be blocking valuable transaction pathways without anyone noticing. One anonymized healthcare provider saw reduced visibility in AI-assisted searches for medical facilities because default CDN settings blocked retrieval bots: a concrete illustration of how misconfigured crawler access controls quietly cut off demand.
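One low-effort starting point is a robots.txt policy that treats the categories above differently. The sketch below is illustrative only: the user-agent tokens match the crawlers named earlier, but robots.txt binds only crawlers that choose to honor it, so it complements rather than replaces CDN and WAF rules, and token names should be confirmed against each vendor's current documentation.

```text
# Disallow crawlers that harvest content for training corpora
User-agent: CCBot
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

# Allow retrieval bots that fetch pages in response to live user questions
User-agent: ChatGPT-User
Allow: /

User-agent: PerplexityBot
Allow: /

# Everyone else: public catalog is open, account and checkout paths are not
User-agent: *
Disallow: /account/
Disallow: /checkout/
```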
The verification infrastructure solving this problem operates through cryptographic identity protocols rather than behavior heuristics. IETF's Web Bot Auth draft specifies how agents prove identity within HTTP requests using signed authentication tokens that can't be forged. The mechanism creates a machine-readable passport that distinguishes verified agents from spoofed traffic. AWS Bedrock AgentCore implements this protocol today to reduce CAPTCHA challenges when its agents browse protected sites. Emerging OAuth 2.1 profiles are exploring delegated authority flows where access tokens contain identifiers for both the human user who granted permission and the agent performing the action. This prevents impersonation while maintaining audit trails showing both who authorized access and what system executed the transaction.
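As a rough illustration of what that machine-readable passport looks like on the wire: the Web Bot Auth draft layers on HTTP Message Signatures (RFC 9421), so a verified agent's request carries signature headers plus a pointer to the key directory the receiving site can check. The header values below are simplified placeholders based on the draft, not a normative example, and details may shift as the specification matures.

```http
GET /products/coffee-beans HTTP/1.1
Host: shop.example
Signature-Agent: "https://agent-vendor.example"
Signature-Input: sig1=("@authority" "signature-agent");created=1730000000;keyid="ed25519-key-1";tag="web-bot-auth"
Signature: sig1=:BASE64_SIGNATURE_PLACEHOLDER=:
```

The receiving site resolves the key directory named in Signature-Agent, verifies the signature over the covered fields, and only then applies its agent access policy. A spoofed user-agent string with no valid signature fails that check outright.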
Legibility layer: structured data that agents can parse and act on
Accessibility determines whether an agent can reach a site; legibility determines whether the agent can extract actionable information once it arrives. Content designed for human visual parsing—nested menus, JavaScript-dependent interfaces, styling that conveys meaning through layout—becomes opaque when an agent attempts programmatic interpretation. The agent sees HTML tags but can't determine which text represents pricing versus promotional copy, product specifications versus customer reviews, booking endpoints versus general contact forms.
The llms.txt convention addresses this gap through a simple mechanism: a Markdown file at the root domain path /llms.txt that maps the site's most important content for machine interpretation. The file format includes section headers identifying key pages, with links to clean Markdown versions of documentation, pricing, policies, and support resources. An example implementation specifies product overview, API documentation, FAQ, terms of service, and contact information with direct links to each resource. The file acts as a structured index that agents consult first rather than attempting to parse arbitrary page layouts.
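A minimal file following the convention might look like the sketch below. The section names mirror the example just described; the URLs are hypothetical stand-ins for a site's own Markdown pages.

```markdown
# Example Store

> Specialty coffee retailer. The key pages below are available as clean Markdown for machine readers.

## Product overview
- [Catalog](https://example.com/docs/catalog.md): current products, prices, and availability

## API documentation
- [Commerce API](https://example.com/docs/api.md): endpoints for inventory, cart, and checkout

## FAQ
- [Frequently asked questions](https://example.com/docs/faq.md)

## Policies
- [Terms of service](https://example.com/docs/terms.md)
- [Contact](https://example.com/docs/contact.md)
```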
Adoption of llms.txt has grown, tracked by directories such as llmstxt.site and directory.llmstxt.cloud. GitBook published tutorials for automatic generation. CMS platforms are building native llms.txt support. The standard emerged from developer practice rather than formal specification, which accelerated uptake but created documentation gaps around optimal file structure and update cadence. Legal teams need to determine what content belongs in llms.txt files and establish review processes for the Markdown-formatted pages they reference. Product teams need to implement file generation, usually as part of content management workflows, and ensure the index stays current as site structure evolves.
Schema.org markup provides the second legibility mechanism. JSON-LD structured data embedded in page HTML identifies entities explicitly: products with prices and availability, organizations with locations and contact details, FAQs with question-answer pairs, events with dates and registration information. Agents parse this markup directly to extract facts without interpreting layout or inferring meaning from visual context. A product page with schema markup tells an agent, in explicit fields, that the item costs $49.99, ships in 3-5 days, and carries a given customer rating, without requiring the agent to distinguish sales copy from specifications.
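In practice that is a JSON-LD block embedded in the product page. The sketch below uses standard schema.org Product, Offer, and AggregateRating types with placeholder values matching the figures above.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Single-Origin Coffee Beans, 1 lb",
  "sku": "COFFEE-1LB",
  "offers": {
    "@type": "Offer",
    "price": "49.99",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.6",
    "reviewCount": "212"
  }
}
</script>
```

Shipping windows can be expressed the same way through schema.org's OfferShippingDetails; the point is that price, availability, and ratings become explicit fields rather than styled text an agent has to guess at.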
Actionability layer: transaction endpoints that agents invoke rather than screen-scrape
Legibility makes information readable; actionability makes services usable. An agent needs to complete transactions, not just consume content. If the only path to purchasing a product involves clicking through a multi-step JavaScript checkout flow designed for human navigation, the agent can't place the order. If scheduling an appointment requires phone calls or email exchanges, the agent can't book the consultation. The site becomes read-only from the agent's perspective, which blocks the transaction even when the agent has legitimate purchasing authority and the user wants to complete the action.
APIs with documented endpoints solve this problem by exposing services as callable tools. An e-commerce site that provides an API for checking inventory, adding items to cart, and initiating checkout gives agents a transaction pathway that doesn't depend on navigating a visual interface. The agent invokes endpoints programmatically with proper authentication, completing purchases on behalf of users who've granted permission. OpenAPI specifications (previously called Swagger) document these endpoints in machine-readable format that agents consume directly, treating APIs as tool libraries they can call with specific parameters.
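A trimmed OpenAPI 3.x sketch covering the three operations named above might look like this. The paths and schemas are hypothetical; a real specification would also declare authentication schemes and error responses.

```yaml
openapi: 3.0.3
info:
  title: Example Store Commerce API
  version: "1.0"
paths:
  /inventory/{sku}:
    get:
      summary: Check stock and price for a product
      parameters:
        - name: sku
          in: path
          required: true
          schema: { type: string }
      responses:
        "200":
          description: Current stock level and price
  /cart/items:
    post:
      summary: Add an item to the cart
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                sku: { type: string }
                quantity: { type: integer }
      responses:
        "201":
          description: Item added to cart
  /checkout:
    post:
      summary: Initiate checkout for the current cart
      responses:
        "202":
          description: Checkout started; payment completes through the authorized flow
```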
Model Context Protocol (MCP) formalizes this pattern. Sites expose MCP-compatible endpoints that agents discover and invoke according to protocol specifications. A booking system becomes an MCP tool offering "check availability" and "create reservation" functions. An inventory database becomes a tool supporting "search products" and "verify stock" queries. Agents interact with these services through standardized interfaces rather than requiring custom integration for each vendor. Microsoft and Cloudflare are developing endpoint patterns where sites expose conversational interfaces like /ask alongside agent tool endpoints like /mcp, both backed by the same retrieval infrastructure.
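To make the booking example concrete, here is a minimal sketch of an MCP server exposing those two functions, written against the official MCP Python SDK's FastMCP helper. The tool names, signatures, and stubbed data are illustrative assumptions, not a prescribed design.

```python
# Minimal MCP server sketch: exposes booking operations as agent-callable tools.
# Assumes the official `mcp` Python SDK (FastMCP helper); data access is stubbed.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("booking-demo")

@mcp.tool()
def check_availability(date: str, party_size: int) -> list[str]:
    """Return open reservation slots (ISO times) for a date and party size."""
    # Replace with a real availability query against your booking system.
    return ["2025-12-01T18:00", "2025-12-01T20:30"] if party_size <= 6 else []

@mcp.tool()
def create_reservation(slot: str, name: str, party_size: int) -> str:
    """Book the given slot and return a confirmation code."""
    # Replace with a real write to your reservation system, plus authorization checks.
    return f"CONF-{abs(hash((slot, name, party_size))) % 10_000:04d}"

if __name__ == "__main__":
    mcp.run()  # Serves the tools over MCP's standard transport (stdio by default)
```

Agents that speak MCP discover these tools through the protocol's listing call and invoke them with structured arguments, with no screen-scraping involved.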
The actionability gap carries commercial consequences. When a user instructs an agent to "order more of that coffee I liked last month," the agent must access order history (with permission), identify the product, check availability, and complete a purchase. If your site lacks the API endpoints to support this flow, the agent finds a competitor selling similar coffee whose infrastructure permits programmatic ordering. The customer wanted your product but completed the transaction elsewhere because your systems couldn't accommodate agent-mediated commerce. In professional services, an agent tasked with scheduling a commercial real estate consultation needs to query attorney availability and book appointments. A website offering only a contact form with no structured availability data or booking API doesn't enter the agent's consideration set, regardless of attorney expertise.
Reference
Greenwood, Dazza. "Existing on the New Web: Your Next Customer Might Be an AI Agent. Will You Let Them In?" Published on dazzagreenwood.com, November 2025.

Building AI systems that work—legally and practically—starts with seeing what others miss. More at kenpriore.com.
