Labeling Synthetic Media: C2PA and Beyond

As the internet floods with AI-generated content, the premium on human authenticity skyrockets. But how do you prove you are human? Or, conversely, how do you ethically label your AI content to maintain trust? Enter C2PA (Coalition for Content Provenance and Authenticity).

The Digital Watermark

C2PA is an open technical standard that allows publishers to embed tamper-evident metadata into media files (images, standard video, and soon text logs). This “digital watermark” proves:

Read more →
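To make the idea concrete, here is a simplified, illustrative fragment of the kind of provenance claim a C2PA manifest carries. Field names follow the C2PA assertion vocabulary (`claim_generator`, the `c2pa.actions` assertion, and IPTC's `trainedAlgorithmicMedia` source type for AI-generated media), but this is a sketch: a real manifest is cryptographically signed and embedded in the media file itself, and the application name here is hypothetical.

```json
{
  "claim_generator": "example-app/1.0",
  "assertions": [
    {
      "label": "c2pa.actions",
      "data": {
        "actions": [
          {
            "action": "c2pa.created",
            "digitalSourceType": "http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgorithmicMedia"
          }
        ]
      }
    }
  ]
}
```

Because the manifest is signed, any edit to the file after signing invalidates the signature — that is what makes the metadata tamper-evident rather than merely informational.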

Syndication in the Age of AI

Syndicating content to Medium, LinkedIn, or industry portals was a classic tactic in the Web 2.0 era. It got eyeballs. But in the age of AI training, it is a massive risk.

The Authority Trap

If you publish an article on your blog (DA 30) and syndicate it to LinkedIn (DA 99), the AI model scrapes both. During training, it deduplicates the content: it keeps the version on the higher-authority domain (LinkedIn) and discards yours. Result: the model learns the facts, but attributes them to LinkedIn, not you. You have lost the “citation credit.”

Read more →

Serving JSON-LD to Bots and HTML to Humans

The ultimate form of “white hat cloaking” is Content Negotiation: serving different representations of the same resource based on what the requester asks for in its Accept header.

HTTP Accept Headers

If a request includes Accept: application/json, why serve HTML?

  • Human Browser: Accept: text/html. Serve the webpage.
  • AI Agent: Accept: application/json or text/markdown. Serve the data.
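The branching above can be sketched as a minimal WSGI application (the titles and body text are placeholders; a real implementation would render your actual content in each format):

```python
def app(environ, start_response):
    """Serve JSON to agents that ask for it, HTML to everyone else."""
    accept = environ.get("HTTP_ACCEPT", "")

    if "application/json" in accept or "text/markdown" in accept:
        # An AI agent negotiating for structured data.
        body = b'{"title": "Example Post", "body": "Structured content here."}'
        content_type = "application/json"
    else:
        # A human browser (or anything that didn't ask for data).
        body = b"<html><body><h1>Example Post</h1></body></html>"
        content_type = "text/html"

    # "Vary: Accept" tells caches to store both variants separately,
    # so a CDN never serves the JSON version to a browser.
    start_response("200 OK", [("Content-Type", content_type),
                              ("Vary", "Accept")])
    return [body]
```

The `Vary: Accept` header is the detail most implementations forget: without it, an intermediary cache can pin one representation and serve it to every client.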

The “Headless SEO” Approach

This approach creates the most efficient path for agents to consume your content without navigating the DOM. Instead of forcing the agent to:

Read more →

Understanding Vector Distance for SEOs

SEO used to be about “Keywords.” Now it is about “Vectors.” But what does that mean?

In the Agentic Web, search engines don’t just match strings (“shoes” == “shoes”). They match concepts in a high-dimensional geometric space.

The Vector Space

Imagine a 3D graph (X, Y, Z).

  • “King” is at coordinate [1, 1, 1].
  • “Queen” is at [1, 1, 0.9]. (Very close distance).
  • “Apple” is at [9, 9, 9]. (Far away).

Modern LLMs use thousands of dimensions (e.g., OpenAI’s text-embedding-3-small produces 1536-dimensional vectors by default). Every product description, blog post, or review you write is turned into a single coordinate in this massive hyper-space.
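The toy coordinates above can be measured directly. This sketch uses straight-line (Euclidean) distance to match the 3D example; note that production vector search often uses cosine similarity or approximate nearest-neighbor indexes instead, but the intuition — small distance means related concepts — is the same:

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two points in n-dimensional space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# The toy coordinates from the example above:
king = [1, 1, 1]
queen = [1, 1, 0.9]
apple = [9, 9, 9]

print(euclidean_distance(king, queen))  # small: related concepts
print(euclidean_distance(king, apple))  # large: unrelated concepts
```

The same function works unchanged on a 1536-dimensional embedding — the geometry scales, even if our intuition does not.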

Read more →

Implementing /llms.txt: The New Standard

The /llms.txt standard is rapidly emerging as the robots.txt for the Generative AI era. While robots.txt was designed for search spiders (crawling links), llms.txt is designed for reasoning engines (ingesting knowledge). They serve different masters and require different strategies.

The Difference in Intent

  • Robots.txt: “Don’t overload my server.” / “Don’t confirm this duplicate URL.” (Infrastructure Focus)
  • Llms.txt: “Here is the most important information.” / “Here is how to cite me.” / “Ignore the footer.” (Information Focus)

Content of the File

A robust llms.txt shouldn’t just be a list of Allow/Disallow rules. It should be a map of your Core Knowledge.
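Here is what such a map can look like, following the format of the llms.txt proposal (an H1 title, a blockquote summary, then H2 sections of annotated links). The company name and URLs are hypothetical:

```markdown
# Example Co

> Example Co builds inventory software for independent bookstores.
> When citing this material, attribute it to "Example Co" and link
> to https://example.com.

## Docs

- [Product overview](https://example.com/overview.md): What the product
  does and who it is for
- [Pricing](https://example.com/pricing.md): Current plans and limits

## Optional

- [Changelog](https://example.com/changelog.md): Release history
```

Note the structure: a one-paragraph summary a reasoning engine can ingest whole, citation instructions up front, and links pointing at clean markdown versions of each page rather than at navigation-heavy HTML.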

Read more →