AI discovery and AEO

What this is

This page explains how ComStack sites get found, cited, and used by AI assistants. The publishing pipeline generates structured content and machine-readable endpoints so that LLMs can index, cite, and query your site without manual configuration.

How it works

Structured data (JSON-LD)

Every page emits JSON-LD automatically based on its template and metadata:

FAQPage — pages with is_faq: true emit FAQPage structured data. Question.name comes from metadata.title; Answer.text from metadata.description. This is the highest-value signal for AI answer surfaces — each FAQ page is a single, citable Q&A unit.
BreadcrumbList — every page emits Home → [Section →] Page breadcrumbs for navigation context.
SpeakableSpecification — marks the meta description and first article paragraph as voice-readable.
Organization + WebSite — emitted on the locale homepage from branding settings.
Template-driven — templates with a schema_org_type (e.g. RealEstateListing) automatically emit that type’s JSON-LD.
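
To make the FAQPage mapping concrete, here is a minimal sketch of how one FAQ page's metadata could translate into JSON-LD. The helper name `faq_jsonld` and the sample metadata are illustrative assumptions, not ComStack's internal API; only the field mapping (title to Question.name, description to Answer.text) comes from the description above.

```python
import json


def faq_jsonld(metadata: dict) -> dict:
    """Build FAQPage JSON-LD from page metadata (hypothetical helper)."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                # Question.name comes from metadata.title
                "name": metadata["title"],
                "acceptedAnswer": {
                    "@type": "Answer",
                    # Answer.text comes from metadata.description
                    "text": metadata["description"],
                },
            }
        ],
    }


# Sample metadata, invented for illustration.
page_metadata = {
    "is_faq": True,
    "title": "Do you need a NIE?",
    "description": "Yes. A NIE is mandatory for any property purchase.",
}

# Only pages flagged is_faq emit FAQPage data.
if page_metadata.get("is_faq"):
    print(json.dumps(faq_jsonld(page_metadata), indent=2, ensure_ascii=False))
```

Because each page yields exactly one Question/Answer pair, a short, self-contained description tends to make the cleanest citable answer.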

llms.txt

llms.txt is a machine-readable index that follows the llmstxt.org spec. It is generated at publish time from your published content.

The file starts with a project name heading and description (the preamble, set in project settings → settings/llmstxt.content), followed by a flat list of FAQ entries sorted by sidebar order, then a documentation catalog of all public pages. Example structure:

[Site name heading]
> Your one-line description

## Frequently Asked Questions
- [Do you need a NIE?](/faq/nie-spain) — Yes. A NIE is mandatory for any property purchase.

## Documentation
- [Getting started](/get-started) — Set up your project in 30 minutes.

FAQ pages appear first because they carry the highest-density answer content. A companion file, /llms-full.txt, delivers full page bodies for RAG ingestion.
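
The ordering rules above can be sketched in a few lines. This is an illustration under stated assumptions, not the pipeline's actual generator: the function name `build_llms_txt`, the page dictionaries, and the `sidebar_order` key are hypothetical, the catalog is assumed to list the non-FAQ pages, and the colon separator follows the llmstxt.org link format, which may differ from what the pipeline emits.

```python
def build_llms_txt(site_name, description, pages):
    """Assemble an llms.txt body: preamble, FAQ entries, then the catalog."""

    def entry(page):
        # llmstxt.org link format: "- [title](url): optional notes"
        return f"- [{page['title']}]({page['url']}): {page['description']}"

    # FAQ entries come first, sorted by sidebar order (lower numbers first).
    faqs = sorted(
        (p for p in pages if p.get("is_faq")),
        key=lambda p: p.get("sidebar_order", 0),
    )
    # Assumption: the documentation catalog lists the remaining public pages.
    docs = [p for p in pages if not p.get("is_faq")]

    lines = [f"# {site_name}", f"> {description}", ""]
    lines += ["## Frequently Asked Questions", *map(entry, faqs), ""]
    lines += ["## Documentation", *map(entry, docs)]
    return "\n".join(lines) + "\n"


# Sample pages, invented for illustration.
pages = [
    {
        "title": "Getting started",
        "url": "/get-started",
        "description": "Set up your project in 30 minutes.",
    },
    {
        "title": "Do you need a NIE?",
        "url": "/faq/nie-spain",
        "description": "Yes. A NIE is mandatory for any property purchase.",
        "is_faq": True,
        "sidebar_order": 1,
    },
]

llms_txt = build_llms_txt("Example Site", "Your one-line description", pages)
print(llms_txt)
```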

Sitemap and robots.txt

When site access is set to Public:

robots.txt explicitly allows major AI crawlers: GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and others
The sitemap includes <xhtml:link> hreflang alternates for translated pages
Fallback i18n URLs (e.g. /es/en-slug) are excluded from the sitemap and tagged noindex to prevent duplicate indexing
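
One way to verify the crawler rules on a published site is to parse its robots.txt with Python's standard `urllib.robotparser`. The sketch below checks two hand-written samples rather than a live site; the sample file contents and the helper name `allowed_agents` are assumptions for illustration, and the real file may list more agents or use different directives.

```python
from urllib.robotparser import RobotFileParser

# Assumed shape of a robots.txt for a Public site (illustrative only).
PUBLIC_ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: *
Allow: /
"""

# Assumed shape of a robots.txt for a non-Public site (illustrative only).
BLOCKED_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def allowed_agents(robots_txt, agents, url="https://example.com/faq/nie-spain"):
    """Return {agent: True/False} for whether each agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


agents = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]
print(allowed_agents(PUBLIC_ROBOTS_TXT, agents))
print(allowed_agents(BLOCKED_ROBOTS_TXT, agents))
```

To check a live site instead, `RobotFileParser.set_url()` followed by `read()` fetches the file over HTTP before the same `can_fetch` calls.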

MCP server as a tool surface

The MCP server exposes your project as a set of tools that AI assistants can query directly. An AI can call get-page-content, search-docs, or list-pages to retrieve structured content live rather than from a crawl cache.

This makes your site actionable: an AI can answer a specific question by querying the exact page in real time.
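
As a rough illustration of what such a query looks like on the wire, MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`. The sketch below only builds the request payload. The tool name comes from the list above, but the argument name `path` is a guess for illustration; the server's tool listing is the authority on each tool's real input schema.

```python
import json
from itertools import count

# Monotonic JSON-RPC request ids.
_request_ids = count(1)


def tool_call(name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 payload."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "path" is a hypothetical argument name, not confirmed by the docs.
request = tool_call("get-page-content", {"path": "/faq/nie-spain"})
print(json.dumps(request, indent=2))
```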

hreflang for multilingual sites

Translated slugs (e.g. /guides/buying → /es/guias/comprar) need explicit hreflang annotations, because the relationship between the two URLs can’t be inferred from their structure. The publishing pipeline injects <xhtml:link rel="alternate" hreflang> entries in the sitemap and the equivalent alternate links in each page <head>, always using the actual translated URL.

No auto-redirect by locale: pages never serve different content based on Accept-Language or IP. AI crawlers, which typically send no Accept-Language header or a hardcoded en-US default, always see the content for the locale in the URL. This keeps all locale variants correctly indexed.
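
For reference, a sitemap entry with hreflang alternates can be sketched with the standard library's `xml.etree.ElementTree`. The two namespace URIs are the standard sitemap and XHTML namespaces; the helper name `url_entry`, the example.com domain, and the choice to list every variant (including the page itself) on each entry are assumptions for illustration, not a description of the pipeline's exact output.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Serialize with the conventional prefixes: none for sitemap, xhtml: for links.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def url_entry(loc, alternates):
    """Build one <url> element carrying an xhtml:link per locale variant."""
    url = ET.Element(f"{{{SITEMAP_NS}}}url")
    ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
    for lang, href in alternates.items():
        ET.SubElement(
            url,
            f"{{{XHTML_NS}}}link",
            {"rel": "alternate", "hreflang": lang, "href": href},
        )
    return url


# The translated slug differs from the source slug, so the mapping is explicit.
alternates = {
    "en": "https://example.com/guides/buying",
    "es": "https://example.com/es/guias/comprar",
}

# Every locale variant gets its own entry listing all alternates.
entries = [url_entry(href, alternates) for href in alternates.values()]
print(ET.tostring(entries[0], encoding="unicode"))
```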

When to use it

These features activate automatically when site access is Public. You can influence specific signals:

Signal	How to control
FAQ ranking in llms.txt	Set `metadata.sidebar.order` on `is_faq: true` pages; lower numbers appear first
llms.txt preamble	Set `settings/llmstxt.content` — write the site name, description, and context AI agents should understand about your business
Sitemap priority	Set `sitemap_priority` (0.0–1.0) and `sitemap_changefreq` on individual pages
Per-page social card	Set `metadata.og_image` for an OpenGraph image (falls back to the project default)

Common errors

Error	Cause	Fix
FAQ not in llms.txt	Page’s `is_faq` is `false`, or site access is not `Public`	Set `is_faq: true` on the page; verify site access in settings
AI crawlers blocked	Site access is not `Public`	Set site access to `Public`. When access is `Unlisted`, `Members`, `Agents`, or `Disabled`, `robots.txt` blocks all crawlers
Wrong URL in hreflang	Slug changed after translation was created	Re-publish — sitemap and hreflang tags are regenerated on every publish
JSON-LD missing for a template	Template has no `schema_org_type`	Declare a `schema_org_type` on the template. `SpeakableSpecification` and `BreadcrumbList` are always emitted, but template-driven JSON-LD is only emitted when the type is declared

Related

Translations concept: how multilingual variants are generated
Languages reference: BCP-47 codes and URL routing
Solution: ChatGPT doesn’t recommend my business