Effect of Nofollow on LLM Training

In the traditional world of SEO, the rel="nofollow" attribute was a simple, binary instruction. It told Googlebot: “Don’t follow this link, and certainly don’t pass any PageRank through it.” It was the specific tool we used to sculpt authority, manage crawl budgets, and declare paid relationships.
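As a quick refresher, the attribute sits on the anchor tag itself; Google later added the related rel="sponsored" and rel="ugc" hints for paid and user-generated links:

```html
<!-- A link Googlebot is asked not to follow or pass PageRank through -->
<a href="https://example.com/partner-page" rel="nofollow">Partner site</a>

<!-- Later, more specific variants of the same idea -->
<a href="https://example.com/ad" rel="sponsored">Sponsored link</a>
<a href="https://example.com/comment-link" rel="ugc">User-submitted link</a>
```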

But the Agentic Web does not run on PageRank alone. It runs on Tokens.

As we transition from optimization for retrieval (search engines) to optimization for inference (LLMs), the rules of the nofollow attribute are being rewritten. The comfortable assumption that a nofollow link protects you from the “bad neighborhood” or prevents a competitor from benefiting from your content is dangerously outdated.

Read more →

Semantic HTML is LLM Training Fuel: Why 'Div Soup' Poisons Models

In the early days of the web, we were told to use Semantic HTML for accessibility. We were told it allowed screen readers to navigate our content, providing a better experience for the visually impaired. We were told it might help SEO, though Google’s engineers were always famously coy about whether an `<article>` tag carried significantly more weight than a well-placed `<div>`.
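To make the contrast concrete, here is the same content marked up both ways (class names are illustrative):

```html
<!-- "Div soup": the structure lives only in class names the parser can't trust -->
<div class="post">
  <div class="title">Effect of Nofollow on LLM Training</div>
  <div class="body">…</div>
</div>

<!-- Semantic HTML: the roles are explicit in the tags themselves -->
<article>
  <h1>Effect of Nofollow on LLM Training</h1>
  <p>…</p>
</article>
```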

In 2025, that game has changed entirely. We are no longer just optimizing for screen readers or the ten blue links on a search results page. We are optimizing for the training sets of Large Language Models (LLMs).

Read more →