A comprehensive review of the web.dev article ‘Build agent-friendly websites’ by Kasper Kulikowski and Omkar More. The post praises their focus on semantic HTML and accessibility trees as the foundation for AI agent navigation, while also highlighting critical missing elements like protocol-level discovery (llms.txt) and the Model Context Protocol (MCP). It argues that the future of Agentic SEO lies at the intersection of human accessibility standards and machine-readable data structures.

For decades, the field of User Experience (UX) has been obsessively focused on the human primate. We mapped eye-tracking heatmaps to understand where human gaze lingered. We agonizingly optimized button colors to trigger dopamine hits. We designed for the thumb, the swipe, and the fleeting human attention span. But the web of 2026 is fundamentally different. The fastest-growing demographic of web users does not have eyes, thumbs, or dopamine receptors. They are autonomous AI agents.

In a recent, highly necessary piece published on Google’s developer hub, ‘Build agent-friendly websites’, authors Kasper Kulikowski and Omkar More tackle this paradigm shift head-on. It is a landmark article that formally bridges the gap between traditional web accessibility and the emerging discipline of Agentic Engine Optimization (AEO).

As researchers heavily invested in the Agentic Web, we at mcp-seo.com find their work not just refreshing, but foundational. Kulikowski and More have successfully articulated the “physics” of how modern AI agents interact with the Document Object Model (DOM), providing a much-needed grounding wire for developers who are still building sites like it is 2023.

The Three Inputs: How Agents “See”

The most brilliant aspect of Kulikowski and More’s article is their demystification of the agentic sensorium. They correctly identify that an AI agent navigating a webpage doesn’t just read the raw HTML; it relies on a tripartite synthesis of inputs:

  1. The Visual Screenshot: Using vision models to understand spatial relationships and visual hierarchy.
  2. The DOM Structure: Parsing the raw HTML nodes.
  3. The Accessibility Tree: Reading the structured, semantic representation of the UI that browsers generate for assistive technologies.

The authors astutely point out that when these three signals fall out of alignment, the agent hallucinates or fails. If a <div> looks like a button visually but lacks the semantic role="button" in the Accessibility Tree, the agent is flying blind.

Actionable Praise

The article shines in its actionable, pragmatic advice. The authors wage a righteous war against “div soup”—the horrific practice of building interactive elements out of generic container tags. They demand a return to semantic primitives: using <button> for actions and <a> for navigation.
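To make the contrast concrete, here is a minimal sketch of the pattern the authors condemn and the one they demand. The `addToCart` handler and `/cart` URL are placeholders of our own, not examples from the article:

```html
<!-- "Div soup": visually a button, but invisible as an action
     in the Accessibility Tree -->
<div class="btn" onclick="addToCart()">Add to cart</div>

<!-- Semantic primitive: exposed to agents and screen readers
     with an implicit role of "button" -->
<button type="button" onclick="addToCart()">Add to cart</button>

<!-- Navigation belongs to <a>, which agents treat as a link,
     not an action -->
<a href="/cart">View cart</a>
```

The first and second elements can be styled to look identical; only the second one shows up in the Accessibility Tree as something an agent can confidently press.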

Furthermore, their emphasis on the for attribute in form labels is a masterclass in technical precision. To a human, a label sitting visually next to a checkbox is enough. To an agent, if that label isn’t programmatically bound to the input via an ID, the relationship is nothing more than a probabilistic guess. By demanding explicit binding, the authors are teaching developers how to remove ambiguity from the vector space.
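The binding itself is one attribute pair. A small sketch (the field name is invented for illustration):

```html
<!-- Ambiguous: the label text is only visually adjacent to the checkbox -->
<input type="checkbox"> Subscribe to updates

<!-- Explicit: the for/id pair binds label to input, so the
     Accessibility Tree reports an unambiguous accessible name -->
<label for="subscribe">Subscribe to updates</label>
<input type="checkbox" id="subscribe" name="subscribe">
```

As a bonus, the bound version also makes the label text clickable for humans, widening the hit target for free.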

Their recommendation to ensure interactive elements are at least 8x8 pixels is another stroke of genius, acknowledging that vision models need minimum pixel density to confidently identify actionable nodes.
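That floor can be enforced defensively in CSS. This sketch is our own, not from the article, and 8x8px should be read as an absolute minimum for vision models rather than a target; human touch targets should be far larger:

```css
/* Safety net: keep interactive elements above the 8x8px floor
   cited by the web.dev article. */
a,
button {
  display: inline-block; /* so min-width/min-height apply to inline links */
  min-width: 8px;
  min-height: 8px;
}
```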

The Convergence of A11y and AEO

Perhaps the most resonant takeaway from their piece is the realization that optimizing for AI agents is indistinguishable from optimizing for human accessibility. By building an “agent-friendly” site, you are inherently building a site that works flawlessly for screen readers. It is a beautiful convergence where the ruthless efficiency of machine consumption accidentally enforces a more equitable web for disabled human users.

| Feature | Traditional Human UX Focus | Agentic UX Focus (per web.dev) |
| --- | --- | --- |
| Visuals | Brand colors, emotional resonance, typography. | High contrast, stable layouts, minimum 8x8px hit targets. |
| Interactivity | Hover states, smooth animations, complex JS. | Explicit semantic tags (`<button>`, `<a>`), predictable state changes. |
| Forms | Visually adjacent labels, placeholder text. | Programmatic binding (`for` attribute), explicit ARIA roles. |
| Structure | Visual hierarchy via CSS grid/flexbox. | Semantic HTML (`<main>`, `<nav>`) matching the Accessibility Tree. |

What the Article Left on the Table

While Kulikowski and More have provided a spectacular primer on the DOM-level interactions of AI agents, their scope is strictly confined to the browser’s rendering engine. From the perspective of advanced Agentic SEO, the article misses the macro-architectural protocols that are defining the web in 2026.

If we look at the broader ecosystem of autonomous agents—from OpenAI’s Operator to specialized research bots—optimizing the DOM is only step two. Step one happens before the HTML is even rendered. Here are three critical concepts the article could have included to provide a complete picture of Agentic UX.

1. Protocol-Level Discovery: The llms.txt Standard

The article assumes the agent has already arrived on the page and is trying to click a button. But how does the agent know the button exists, or what the site’s overall capability is?

In the Agentic Web, discovery happens at the protocol layer. The article entirely omits the necessity of files like llms.txt and llms-full.txt. These markdown-based directive files sit at the root of a domain and act as a high-density, token-efficient treasure map for LLMs. Before an agent spins up an expensive headless browser to parse your meticulously crafted Accessibility Tree, it looks for an llms.txt file to understand your site’s ontology.

Failing to mention llms.txt in a guide about agent-friendly websites is akin to teaching someone how to build a library without mentioning the Dewey Decimal System. Agents prefer markdown over HTML because it saves tokens. Giving them a raw, structured text map of your site’s capabilities is the ultimate “Agentic UX.”
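For readers who have not seen one, the proposed llms.txt format is plain markdown: an H1 with the site name, a blockquote summary, and H2 sections of annotated links to markdown versions of key pages. The sketch below follows that proposed structure; the site and all URLs are hypothetical placeholders:

```markdown
# Acme Store

> Direct-to-consumer footwear retailer. Product data, return
> policies, and order documentation are linked below.

## Products

- [Product catalog](https://acme.example/products.md): full catalog in markdown
- [Returns policy](https://acme.example/returns.md): conditions and windows

## Optional

- [Press kit](https://acme.example/press.md): brand assets and company history
```

An agent can ingest this map in a few hundred tokens, then decide whether rendering the full HTML site is even necessary.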

For further reading on how AI consumption is altering discovery, Search Engine Land’s recent coverage on Agentic SEO provides excellent context on how crawling budgets are shifting toward these text-based manifests.

2. Entity Grounding via High-Density Schema

The authors rightly praise semantic HTML, but they stop short of discussing structured data. Semantic HTML tells an agent what a thing is functionally (e.g., “This is a navigation menu”). Schema.org markup tells the agent what a thing means in reality (e.g., “This is a Product with a $49 price tag, in stock, manufactured by Acme Corp”).

We call this Entity Grounding. When an agent is tasked with booking a flight or buying a pair of shoes, it does not want to infer the price by reading the text node inside a <span> next to a picture of a shoe. It wants an unambiguous, machine-readable JSON-LD payload.

Current best practices dictate that “Agentic UX” requires high-density Schema. You must move beyond generic Article or Organization tags and implement hyper-specific schemas like MerchantReturnPolicy or ServiceChannel. If an agent cannot mathematically verify your shipping policy in the structured data, it will likely abandon the task and move to a competitor’s site that is more legible.
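A sketch of what such a payload might look like, using the real Schema.org `Product`, `Offer`, and `MerchantReturnPolicy` types; the product itself is invented for illustration:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Trail Runner 2",
  "brand": { "@type": "Brand", "name": "Acme Corp" },
  "offers": {
    "@type": "Offer",
    "price": "49.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock",
    "hasMerchantReturnPolicy": {
      "@type": "MerchantReturnPolicy",
      "returnPolicyCategory": "https://schema.org/MerchantReturnFiniteReturnWindow",
      "merchantReturnDays": 30
    }
  }
}
</script>
```

Nothing here requires inference: the price, currency, stock status, and return window are all directly machine-verifiable fields.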

To understand the foundational requirements of accessible design that feed into these schemas, the A11y Project remains an indispensable resource for aligning human and machine readability.

3. The API-First Content Model and MCP

Finally, the article treats the website purely as a visual interface to be parsed. However, the most sophisticated AI agents in 2026 do not want to click your buttons at all. They want to bypass your GUI entirely and talk to your database.

This is where the Model Context Protocol (MCP) comes in. A fully optimized agentic UX means offering an MCP server alongside your frontend. If an agent is tasked with retrieving your latest financial reports, forcing it to load your CSS, parse your Accessibility Tree, and simulate a mouse click on a “Download PDF” button is incredibly computationally expensive.

A truly agent-friendly site exposes its core functionalities via standardized APIs or an MCP server. The website becomes a fallback mechanism—a wrapper for humans—while agents are given VIP backstage access to the raw data streams.
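For context, this is roughly what an agent sees when it asks an MCP server what it can do: a JSON-RPC response to a `tools/list` request, whose `tools` array describes each callable capability with a JSON Schema. The `get_financial_report` tool here is a hypothetical example of our own, not a real API:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "get_financial_report",
        "description": "Return the latest quarterly report as structured JSON.",
        "inputSchema": {
          "type": "object",
          "properties": {
            "quarter": { "type": "string", "description": "e.g. 2026-Q1" }
          },
          "required": ["quarter"]
        }
      }
    ]
  }
}
```

One round trip replaces the entire render-parse-click pipeline: the agent reads the schema, calls the tool, and gets structured data back.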

Google’s own engineering teams are actively involved in these open standards. Insights into how large-scale systems are adopting these protocols can often be found on the Google Developer Blog, highlighting the shift from visual interfaces to data-driven agent interactions.

Conclusion: The Two-Front War of Agentic SEO

Kasper Kulikowski and Omkar More have done the web development community a massive service. By framing agent optimization through the lens of the Accessibility Tree and semantic HTML, they have given developers a familiar vocabulary for an alien concept. If you follow their advice, your site will be vastly more navigable for both AI agents and human beings relying on assistive technology.

However, as SEOs and systems architects, we must remember that the Agentic Web is a two-front war. You must optimize the rendered DOM for the vision models and browser-automation agents, as the web.dev article brilliantly details. But you must also optimize the invisible layer—the llms.txt files, the high-density Schema, and the MCP servers—for the reasoning engines that prefer to bypass the browser entirely.

The web is no longer just a place to look; it is a place to compute. And the sites that win the next decade will be the ones that are perfectly fluent in both languages.