{
  "title": "Before Calling It \"AI Optimization\": Why I Added llms.txt and WebMCP to My Engineering Blog",
  "description": "There is no settled answer yet for what an LLM- or agent-friendly web should look like, so I added llms.txt, llms-full.txt, and WebMCP to this blog as an experiment. Less a bet on future ranking gains than a small test of giving non-human readers an easier way in — here is what I built and why.",
  "locale": "en",
  "slug": "llms-txt-webmcp",
  "url": "https://engineer-blog.tomoki-ttttt.workers.dev/en/articles/llms-txt-webmcp/",
  "publishedAt": "2026-06-07T14:00:00.000Z",
  "tags": [
    "AI",
    "WebMCP",
    "llms.txt",
    "Astro",
    "Engineering Blog"
  ],
  "projectIds": [
    "tech-blog"
  ],
  "markdown": "Lately, across search, browsers, and developer tools, there are more and more moments where AI reads information from a website or acts on a user's behalf.\n\nAlong with that, terms like \"LLMO,\" \"AI search optimization,\" and \"AI agent support\" have started showing up everywhere.\n\nBut right now, I don't think it's settled what you should implement to be valued by AI, which specifications will see wide adoption, or whether there's even a shared, correct answer that deserves to be called \"optimization.\"\n\nSo when I added `llms.txt`, `llms-full.txt`, and WebMCP to this blog, it wasn't because I believed \"adding this makes you strong in AI search.\"\n\nIt was because **as the chance that non-humans read a website grows, I wanted to provide an entry point — separate from the HTML — that makes the content easy to hand to a machine.**\n\nThis post walks through what I implemented, and why I tried these mechanisms first, even though their effect and their future are far from decided.\n\n## The human-facing screen and the content itself can be considered separately\n\nA normal web page is built on the assumption that a human reads it in a browser.\n\nBeyond the article body, it includes a header, navigation, a table of contents, share buttons, related articles, a profile, a footer, and many other elements.\n\nThose are UI a human needs, but for a program or AI that only wants the article's content, not all of them are necessarily required.\n\nOf course, today's AI can extract the body from HTML. 
So it isn't that the content can't be read unless I provide a Markdown version.\n\nStill, handing over Markdown that already contains the title, metadata, and body from the start is simpler than making something parse the HTML and guess \"where does the body begin and end.\"\n\nThis blog's articles are managed in Markdown to begin with.\n\nSo generating HTML for humans while also returning Markdown or JSON close to the original structure when needed was relatively natural to implement.\n\nRather than rebuilding the human-facing UI into something for AI, it's closer to the idea of **providing multiple representations of the same content.**\n\n## What I implemented this time\n\nThe main additions were these three.\n\n* `llms.txt`, a guide to the whole site\n* `llms-full.txt`, which bundles every article\n* A WebMCP tool that returns the Markdown of the article currently open\n\nEach plays a slightly different role.\n\n## llms.txt is less an AI-only sitemap than a \"guide board\"\n\nOn this blog, I placed [`/llms.txt`](/llms.txt) at the root.\n\nIn addition to the site name and description, it includes the following.\n\n* A list of Japanese articles\n* A list of English articles\n* The Markdown-version URL of each article\n* Main pages such as Home, About, Works, and Archive\n* The Japanese and English RSS feeds\n* A link to `llms-full.txt`\n\nThe mental image is less a sitemap that mechanically lists every URL on the site, and more a **guide board that shows what exists here and which information is a good place to start.**\n\n```txt\n# Tomokichi's Engineer Growth Log\n\n> A personal engineering blog for logging learning, building, failures, and improvements.\n\n## Articles (Japanese)\n\n- [Article title](https://example.com/articles/example.md): Article description\n\n## Pages\n\n- [About](https://example.com/about/): Author profile and focus areas.\n```\n\nIf I updated the article list by hand, I might forget to update `llms.txt` every time I publish.\n\nSo I fetch 
published articles from Astro's Content Collections and generate the list at build time.\n\n```ts\nconst articles = await getArticlesWithPaths(locale);\n\nreturn articles\n  .map(\n    ({ article, slug }) =>\n      `- [${article.data.title}](${articleMarkdownUrl(locale, slug)}): ${article.data.description}`,\n  )\n  .join(\"\\n\");\n```\n\nThe article titles and descriptions reference the same frontmatter as the normal article pages.\n\nThat way, instead of separately maintaining information just for AI, it's **generated from the same source as the human-facing pages.**\n\nThat said, `llms.txt` is one of the formats currently being proposed, and it doesn't guarantee that every AI service will read it.\n\nAt least for now, rather than \"a file that increases AI traffic just by placing it,\" I think of it as an optional entry point a machine can use when it wants to grasp the site's structure.\n\n## llms-full.txt bundles the Markdown of published articles\n\n[`/llms-full.txt`](/llms-full.txt) concatenates the published Japanese and English articles as raw Markdown.\n\n```ts\nconst body = articles\n  .map(({ article, slug }) => buildArticleMarkdown(article, locale, slug).trim())\n  .join(\"\\n\\n---\\n\\n\");\n```\n\nIf `llms.txt` is an index of links to articles, then `llms-full.txt` is the file that bundles the bodies as well.\n\nThis format isn't strictly necessary.\n\nIn fact, as the number of articles grows, the file could become too large to handle comfortably. 
In the future, splitting it by category or language might be better.\n\nEven so, at the current stage where there aren't many articles yet, it's usable when you want to check the whole site's content at once.\n\nHere too, the goal isn't to raise search rankings, but to **add one more form that's easy for whoever needs it to handle.**\n\n## Each article is also available as Markdown and JSON\n\nSo that `llms.txt` can reference them, each article has a Markdown version separate from the HTML one.\n\nFor example, if a normal article URL looks like this,\n\n```txt\n/articles/example/\n```\n\nthe Markdown version can be retrieved at this URL.\n\n```txt\n/articles/example.md\n```\n\nThe article page's `head` also lists the Markdown and JSON versions as alternate representations.\n\n```html\n<link rel=\"alternate\" type=\"text/markdown\" href=\"...\" />\n<link rel=\"alternate\" type=\"application/json\" href=\"...\" />\n```\n\nThis isn't a feature only for AI.\n\nIt's also useful for people who want to copy the Markdown directly, who want to load it into another tool, or for programs that want to handle the metadata as JSON.\n\nRather than closing it off as an \"AI feature,\" I think it's more natural to see it as **a general output format that makes content easier to reuse.**\n\n## With WebMCP, I exposed only a tool to fetch the current article\n\nWebMCP is a draft specification for a web page to expose structured tools in the browser so that a supporting AI agent can call them.\n\nOn this blog, when an article page opens, I register a tool called `get_current_article_markdown`.\n\nIts only role is to return the Markdown of the article currently displayed.\n\nSimplified, the implementation looks like this.\n\n```ts\nconst modelContext = document.modelContext;\n\nmodelContext.registerTool({\n  name: \"get_current_article_markdown\",\n  description: \"Return the full Markdown source of the article currently open in this tab.\",\n  inputSchema: {\n    type: \"object\",\n    
properties: {},\n    additionalProperties: false,\n  },\n  annotations: {\n    readOnlyHint: true,\n  },\n  async execute() {\n    const response = await fetch(markdownUrl);\n    const text = await response.text();\n\n    return {\n      content: [{ type: \"text\", text }],\n    };\n  },\n});\n```\n\nExposing multiple tools for search, posting, or editing is conceivable, but this time I didn't go that far.\n\nFor now, I implemented only a low-impact tool that:\n\n* requires no input\n* changes no state on the server\n* handles no personal information\n* only reads an already-published article\n\nI also set `readOnlyHint: true` to indicate that this tool is read-only.\n\nAnd in browsers where the WebMCP API doesn't exist, it simply does nothing and exits.\n\n```ts\nif (!modelContext || !root) return;\n```\n\nSo even in unsupported environments, normal article reading is unaffected.\n\nTo keep an old article's tool from lingering across Astro's page transitions, I also deregister it with an `AbortController` before leaving the page, and register it again on the next page.\n\nIn this way, I don't make WebMCP a premise of the site, but treat it as **a progressive feature that's added only in environments where it can be used.**\n\n## Why implement it when adoption isn't decided yet\n\nHonestly, I don't know how far `llms.txt` or WebMCP will spread from here.\n\nThe specifications might change, and another approach might become the mainstream.\n\nEven so, there were roughly four reasons I implemented them.\n\n### 1. 
Because web users aren't necessarily only humans anymore\n\nIn web development so far, it was mostly enough to think about humans viewing pages in a browser.\n\nBut now, beyond search engines, AI assistants, browser agents, and developer tools also retrieve information on the web more and more.\n\nI can't assert what the future shape will be, but I thought it was worth considering whether it's enough for a website to offer only a human-facing screen.\n\n### 2. Because the source data is Markdown, so I could try it at low cost\n\nThis blog is structured to convert Markdown into HTML with Astro.\n\nSo there was no need to rebuild the content from scratch for machines.\n\nFrom the same Markdown I can generate HTML, Markdown, JSON, and `llms.txt`, and WebMCP can return that Markdown too.\n\nBeing able to try this using the existing content model, without adding a large system, suited a personal blog well.\n\n### 3. Because I'd rather implement and observe than predict the effect\n\nWhen thinking about a new technology, one option is to respond after it becomes popular.\n\nOn the other hand, if it can be implemented in a small way, there's often more to learn from actually touching it.\n\n* What kind of information should be structured\n* How to avoid duplicating information with human-facing pages\n* Whether the design can stay easy to follow as the spec changes\n* How to separate read-only from state-changing operations\n\nThese are easier to think about concretely once you've implemented it than by only reading about it.\n\nThis effort is less a bet on the future than **a small experiment for understanding the change.**\n\n### 4. 
Because I could make it easy to remove later\n\nWhen trying a new specification, I think ease of removal matters as much as ease of adoption.\n\nThis implementation doesn't replace the existing article display or URL structure.\n\n`llms.txt` and `llms-full.txt` are additional outputs, and WebMCP only runs when the API exists.\n\nEven if it ends up unused, the impact on normal blog functionality is small.\n\nPrecisely because I can't predict whether this technology will be adopted, I tried it as a loosely coupled add-on rather than placing it at the center.\n\n## This isn't a story about \"LLMO is now handled\"\n\nCalling this implementation \"LLMO measures\" or \"AI search optimization\" feels slightly off to me.\n\nThe very scope that \"LLMO\" refers to isn't fixed, and how AI services discover, retrieve, and use a website in their answers differs from service to service.\n\nPlacing `llms.txt` doesn't guarantee that AI will cite you in its answers.\n\nImplementing WebMCP doesn't mean common browsers or AI agents can use it right away either.\n\nAt the moment, neither one guarantees a widely established, proven result.\n\nSo on this blog, rather than \"optimized for AI,\" I think it's more accurate to say:\n\n**I experimentally prepared a machine-readable path that AI or programs can use when retrieving the content.**\n\n## llms.txt is not a substitute for robots.txt or a usage license\n\nAnother thing I want to be careful about is that `llms.txt` is not an access-control mechanism.\n\nIts role differs both from `robots.txt`, which tells crawlers the site's access policy, and from a license, which defines copyright terms or conditions for AI-training use.\n\nThis blog's `llms.txt` is there to describe what the site contains and which formats are easy to retrieve.\n\nQuestions of what may be crawled and how articles may be used need to be considered separately through `robots.txt`, terms of use, licenses, and each service's behavior.\n\nBy its name alone it might look like a file that states 
every AI-related policy, but at least in this implementation I treat it as **a guide to already-published information, not something that grants permissions.**\n\n## If WebMCP changes state, the story changes a lot\n\nThe WebMCP tool I exposed this time only returns an article's Markdown, so what it can do is limited.\n\nBut if WebMCP handles form submission, purchases, reservations, posting, or settings changes, simply registering a tool is not enough.\n\nJust like a normal web UI, or even more so, you need to think about:\n\n* authentication and authorization\n* input validation\n* protection against CSRF and malicious requests\n* confirmation for important operations\n* preventing double execution\n* rate limiting\n* operation logging\n* keeping the tool's description and its actual behavior in agreement\n\nAn AI agent being the caller doesn't make it a trusted client.\n\nThis time I limited it to a feature that only reads public content. Even if I add tools later, I want to avoid exposing state changes just because they seem convenient, and instead consider the boundaries the same way I would in normal API design.\n\n## What I want to check going forward\n\nHaving implemented it doesn't mean this is \"done.\"\n\nGoing forward, I want to check things like:\n\n* whether `llms.txt` and the Markdown versions are actually requested\n* whether `llms-full.txt` becomes too large as the number of articles grows\n* whether I can follow WebMCP spec changes without strain\n* how far support spreads on the browser and agent side\n* whether the machine-facing explanations contradict the human-facing ones\n\nThat said, even if requests show up in the access logs, I can't always tell whether they came from an AI service, an ordinary crawler, or someone just checking the file.\n\nEven once I have numbers, I think I need to watch them over the long term rather than immediately reading them as an effect.\n\n## Closing\n\nThis time, I implemented `llms.txt`, `llms-full.txt`, and WebMCP on 
this engineering blog.\n\nBut this isn't an assertion that \"from now on, this approach will be the correct answer.\"\n\nThe shape of AI search and an agent-friendly web is still changing.\n\nThat's exactly why, instead of betting on one future and building heavily for it, I added an entry point that makes information easier to hand to machines as well, without breaking the existing human-facing experience.\n\nSome people read HTML.\nSome people want to use Markdown directly.\nSometimes a program retrieves JSON.\nAnd in the future, an AI agent in the browser might use WebMCP's tools.\n\nI can't predict all of that today.\n\nEven so, I think a design that separates content from presentation and can safely offer the same information in multiple forms is worthwhile regardless of the AI trend.\n\nThis implementation isn't a tactic for conquering an unknown ranking.\n\n**It's a small experiment for thinking about a web where humans aren't necessarily the only users.**\n\n## References\n\n* [The /llms.txt file](https://llmstxt.org/)\n* [WebMCP Draft Community Group Report](https://webmachinelearning.github.io/webmcp/)\n* [WebMCP is available for early preview](https://developer.chrome.com/blog/webmcp-epp)"
}
