AI discovery and AEO

What this is

This page explains how ComStack sites get found, cited, and used by AI assistants. The publishing pipeline generates structured content and machine-readable endpoints so LLMs can index, cite, and query your site without manual configuration.

How it works

Structured data (JSON-LD)

Every page emits JSON-LD automatically based on its template and metadata:

  • FAQPage — pages with is_faq: true emit FAQPage structured data. Question.name comes from metadata.title; Answer.text from metadata.description. This is the highest-value signal for AI answer surfaces — each FAQ page is a single, citable Q&A unit.
  • BreadcrumbList — every page emits Home → [Section →] Page breadcrumbs for navigation context.
  • SpeakableSpecification — marks the meta description and first article paragraph as voice-readable.
  • Organization + WebSite — emitted on the locale homepage from branding settings.
  • Template-driven — templates with a schema_org_type (e.g. RealEstateListing) automatically emit that type’s JSON-LD.
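
As an illustration of the FAQPage and BreadcrumbList mappings listed above, here is a minimal Python sketch. It is not the pipeline's actual code: the function names, the flat metadata dict, and the example.com URLs are assumptions made for this example. Only the schema.org property names and the title/description mapping come from the text above.

```python
import json


def faq_jsonld(metadata: dict) -> dict:
    """Build FAQPage JSON-LD for a page whose metadata has is_faq: true."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                # Question.name comes from metadata.title
                "name": metadata["title"],
                "acceptedAnswer": {
                    "@type": "Answer",
                    # Answer.text comes from metadata.description
                    "text": metadata["description"],
                },
            }
        ],
    }


def breadcrumb_jsonld(trail: list) -> dict:
    """Build BreadcrumbList JSON-LD from (name, url) pairs, Home first."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(trail, start=1)
        ],
    }


if __name__ == "__main__":
    # Hypothetical page metadata, reusing the FAQ example from this page.
    page = {
        "is_faq": True,
        "title": "Do you need a NIE?",
        "description": "Yes. A NIE is mandatory for any property purchase.",
    }
    print(json.dumps(faq_jsonld(page), indent=2))
    print(json.dumps(breadcrumb_jsonld([
        ("Home", "https://example.com/"),
        ("FAQ", "https://example.com/faq"),
        ("Do you need a NIE?", "https://example.com/faq/nie-spain"),
    ]), indent=2))
```

Each FAQ page yields exactly one Question/Answer pair, which matches the "single, citable Q&A unit" described above.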

llms.txt

A machine-readable index following the llmstxt.org spec. Generated at publish time from your published content.

The file starts with a project name heading and description (the preamble, set in project settings → settings/llmstxt.content), followed by a flat list of FAQ entries sorted by sidebar order, then a documentation catalog of all public pages. Example structure:

[Site name heading]
> Your one-line description
## Frequently Asked Questions
- [Do you need a NIE?](/faq/nie-spain) — Yes. A NIE is mandatory for any property purchase.
## Documentation
- [Getting started](/get-started) — Set up your project in 30 minutes.

FAQ pages appear first because they carry the highest-density answer content. A companion file, /llms-full.txt, delivers full page bodies for RAG ingestion.
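
The ordering described in this section can be sketched in Python as follows. This is a hedged illustration, not the generator itself: the flat page dicts and their keys (title, url, description, is_faq, sidebar_order) are hypothetical stand-ins for the real page metadata, the "# " heading follows the llmstxt.org convention of an H1 project name, and leaving FAQ pages out of the Documentation section is an assumption.

```python
# Dash separator copied from the example entries above (written as an escape).
SEP = " \u2014 "


def entry(page: dict) -> str:
    """Format one page as an llms.txt list item."""
    return f"- [{page['title']}]({page['url']}){SEP}{page['description']}"


def render_llms_txt(site_name: str, description: str, pages: list) -> str:
    """Render preamble, then FAQs by sidebar order, then the docs catalog."""
    faqs = sorted(
        (p for p in pages if p.get("is_faq")),
        key=lambda p: p.get("sidebar_order", 0),  # lower numbers appear first
    )
    # Assumption: FAQ pages are not repeated in the Documentation section.
    docs = [p for p in pages if not p.get("is_faq")]

    lines = [f"# {site_name}", f"> {description}"]
    lines.append("## Frequently Asked Questions")
    lines.extend(entry(p) for p in faqs)
    lines.append("## Documentation")
    lines.extend(entry(p) for p in docs)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    pages = [
        {"title": "Getting started", "url": "/get-started",
         "description": "Set up your project in 30 minutes."},
        {"title": "Do you need a NIE?", "url": "/faq/nie-spain", "is_faq": True,
         "sidebar_order": 1,
         "description": "Yes. A NIE is mandatory for any property purchase."},
    ]
    print(render_llms_txt("Example Site", "Your one-line description", pages))
```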

Sitemap and robots.txt

When site access is set to Public:

  • robots.txt explicitly allows major AI crawlers: GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and others
  • The sitemap includes <xhtml:link> hreflang alternates for translated pages
  • Fallback i18n URLs (e.g. /es/en-slug) are excluded from the sitemap and tagged noindex to prevent duplicate indexing
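
To make the access-dependent behaviour concrete, here is a small Python sketch of how a robots.txt could be rendered. The function name, the exact directives, and the sitemap URL are assumptions for illustration; the crawler names and the rule that non-Public access blocks all crawlers come from this page.

```python
from typing import Optional

# Crawlers named on this page; the real list includes others.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def render_robots_txt(site_access: str, sitemap_url: Optional[str] = None) -> str:
    """Return robots.txt text for the given site access level."""
    if site_access != "Public":
        # Unlisted, Members, Agents, Disabled: block every crawler.
        return "User-agent: *\nDisallow: /\n"

    # Public: one explicit allow group per AI crawler, then a default group.
    groups = [f"User-agent: {bot}\nAllow: /\n" for bot in AI_CRAWLERS]
    groups.append("User-agent: *\nAllow: /\n")
    text = "\n".join(groups)
    if sitemap_url:
        text += f"\nSitemap: {sitemap_url}\n"
    return text


if __name__ == "__main__":
    print(render_robots_txt("Public", "https://example.com/sitemap.xml"))
    print(render_robots_txt("Members"))
```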

MCP server as a tool surface

Your project is exposed as a tool AI assistants can query directly via the MCP server. An AI can call get-page-content, search-docs, or list-pages to retrieve structured content live — not from a crawl cache.

This makes your site actionable: an AI can answer a specific question by querying the exact page in real time.
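
For orientation, MCP clients invoke tools with a JSON-RPC 2.0 request whose method is tools/call. The Python sketch below only builds such request payloads. The tool names are the ones listed above, but the argument names (slug, query) are hypothetical; check the tool schemas your MCP server advertises for the real parameters.

```python
import itertools
import json

_request_ids = itertools.count(1)  # JSON-RPC requests need a unique id each


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    # Argument names below are assumptions, not the documented tool schema.
    print(json.dumps(tool_call("get-page-content", {"slug": "/faq/nie-spain"}), indent=2))
    print(json.dumps(tool_call("search-docs", {"query": "NIE"}), indent=2))
```

In practice an MCP client library handles transport and session setup; the payload shape is shown here only to clarify what "querying the exact page" looks like on the wire.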

hreflang for multilingual sites

With translated slugs (e.g. /guides/buying → /es/guias/comprar), the correct hreflang targets can’t be inferred from URL structure. The publishing pipeline therefore injects <xhtml:link rel="alternate" hreflang> tags into both the sitemap and the page <head>, using the actual translated URL.

No auto-redirect by locale: pages never serve different content based on Accept-Language or IP. AI crawlers — which send no Accept-Language or a hardcoded en-US default — always see the URL’s actual locale content. This ensures all locale variants stay correctly indexed.
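
A sitemap entry with hreflang alternates could be produced along these lines. This is a sketch under assumptions: the function name and the example.com URLs are invented for illustration, and the enclosing <urlset> is assumed to declare xmlns:xhtml="http://www.w3.org/1999/xhtml".

```python
from xml.sax.saxutils import escape, quoteattr


def sitemap_url_entry(loc: str, alternates: dict) -> str:
    """Render one <url> element with an xhtml:link per locale variant.

    alternates maps a locale code to that locale's actual translated URL.
    """
    lines = ["<url>", f"  <loc>{escape(loc)}</loc>"]
    for lang, href in sorted(alternates.items()):
        # quoteattr escapes the value and adds the surrounding quotes.
        lines.append(
            f'  <xhtml:link rel="alternate" hreflang={quoteattr(lang)} '
            f"href={quoteattr(href)}/>"
        )
    lines.append("</url>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(sitemap_url_entry(
        "https://example.com/guides/buying",
        {
            "en": "https://example.com/guides/buying",
            "es": "https://example.com/es/guias/comprar",
        },
    ))
```

Each locale variant lists every alternate, including itself, so the translated slug is stated explicitly instead of being guessed from the path.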

When to use it

These features activate automatically when site access is Public. You can influence specific signals:

| Signal | How to control |
| --- | --- |
| FAQ ranking in llms.txt | Set metadata.sidebar.order on is_faq: true pages; lower numbers appear first |
| llms.txt preamble | Set settings/llmstxt.content: write the site name, description, and context AI agents should understand about your business |
| Sitemap priority | Set sitemap_priority (0.0–1.0) and sitemap_changefreq on individual pages |
| Per-page social card | Set metadata.og_image for an OpenGraph image (falls back to the project default) |

Common errors

| Error | Cause | Fix |
| --- | --- | --- |
| FAQ not in llms.txt | Page’s is_faq is false, or site access is not Public | Set is_faq: true on the page; verify site access in settings |
| AI crawlers blocked | Site access is not Public | Set site access to Public. When access is Unlisted, Members, Agents, or Disabled, robots.txt blocks all crawlers |
| Wrong URL in hreflang | Slug changed after translation was created | Re-publish. Sitemap and hreflang tags are regenerated on every publish |
| JSON-LD missing for a template | Template has no schema_org_type | Declare a schema_org_type on the template. SpeakableSpecification and BreadcrumbList are always emitted; template-driven JSON-LD requires a declared schema_org_type |
