AI discovery and AEO
What this is
How ComStack sites get found, cited, and used by AI assistants. The publishing pipeline generates structured content and machine-readable endpoints so LLMs can index, cite, and query your site without manual configuration.
How it works
Structured data (JSON-LD)
Every page emits JSON-LD automatically based on its template and metadata:
- FAQPage: pages with is_faq: true emit FAQPage structured data. Question.name comes from metadata.title; Answer.text comes from metadata.description. This is the highest-value signal for AI answer surfaces because each FAQ page is a single, citable Q&A unit.
- BreadcrumbList: every page emits Home → [Section →] Page breadcrumbs for navigation context.
- SpeakableSpecification: marks the meta description and first article paragraph as voice-readable.
- Organization + WebSite: emitted on the locale homepage from branding settings.
- Template-driven: templates with a schema_org_type (e.g. RealEstateListing) automatically emit that type's JSON-LD.
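
As an illustration of the first two entries, the sketch below builds FAQPage and BreadcrumbList objects in the standard schema.org shape. The function names, the example.com URLs, and the input shapes are assumptions made for this example; the markup ComStack actually emits may carry additional properties.

```python
import json


def faq_jsonld(metadata: dict) -> dict:
    """Build FAQPage JSON-LD using the mapping described above:
    Question.name <- metadata.title, Answer.text <- metadata.description."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": metadata["title"],
                "acceptedAnswer": {
                    "@type": "Answer",
                    "text": metadata["description"],
                },
            }
        ],
    }


def breadcrumb_jsonld(trail: list[tuple[str, str]]) -> dict:
    """Build BreadcrumbList JSON-LD from an ordered list of (name, url) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


# Hypothetical page metadata and breadcrumb trail (placeholder domain).
page = {
    "title": "Do you need a NIE?",
    "description": "Yes. A NIE is mandatory for any property purchase.",
}
trail = [
    ("Home", "https://example.com/"),
    ("FAQ", "https://example.com/faq"),
    ("Do you need a NIE?", "https://example.com/faq/nie-spain"),
]

print(json.dumps(faq_jsonld(page), indent=2, ensure_ascii=False))
print(json.dumps(breadcrumb_jsonld(trail), indent=2, ensure_ascii=False))
```

Each printed object is what would sit inside a script tag of type application/ld+json in the page head.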
llms.txt
A machine-readable index following the llmstxt.org spec. Generated at publish time from your published content.
The file starts with a project name heading and description (the preamble, set in project settings → settings/llmstxt.content), followed by a flat list of FAQ entries sorted by sidebar order, then a documentation catalog of all public pages. Example structure:
    # [Site name]
    > Your one-line description

    ## Frequently Asked Questions
    - [Do you need a NIE?](/faq/nie-spain) — Yes. A NIE is mandatory for any property purchase.

    ## Documentation
    - [Getting started](/get-started) — Set up your project in 30 minutes.

FAQ pages appear first: they are the highest-density answer content. /llms-full.txt delivers full page bodies for RAG ingestion.
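
The layout described above can be sketched as a small renderer. This is a minimal illustration, not the pipeline's code: the Page class, its field names, and render_llms_txt are hypothetical, and whether FAQ pages are repeated in the documentation catalog is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Page:
    # Hypothetical stand-in for a published page record.
    title: str
    url: str
    description: str
    is_faq: bool = False
    sidebar_order: int = 0  # stand-in for metadata.sidebar.order


def render_llms_txt(site_name: str, summary: str, pages: list[Page]) -> str:
    """Render the preamble, then FAQs sorted by sidebar order, then the catalog."""

    def entry(page: Page) -> str:
        # The separator is copied from the sample structure shown above.
        return f"- [{page.title}]({page.url}) — {page.description}"

    faqs = sorted((p for p in pages if p.is_faq), key=lambda p: p.sidebar_order)

    lines = [f"# {site_name}", f"> {summary}", "", "## Frequently Asked Questions"]
    lines += [entry(p) for p in faqs]
    lines += ["", "## Documentation"]
    # Assumption: the catalog lists every public page, FAQ pages included.
    lines += [entry(p) for p in pages]
    return "\n".join(lines) + "\n"


pages = [
    Page("Getting started", "/get-started", "Set up your project in 30 minutes."),
    Page("Do you need a NIE?", "/faq/nie-spain",
         "Yes. A NIE is mandatory for any property purchase.",
         is_faq=True, sidebar_order=1),
]
print(render_llms_txt("Example Site", "Your one-line description", pages))
```

Sorting on the sidebar order is what puts lower-numbered FAQ pages first in the generated file.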
Sitemap and robots.txt
When site access is set to Public:
- robots.txt explicitly allows major AI crawlers: GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and others
- The sitemap includes <xhtml:link> hreflang alternates for translated pages
- Fallback i18n URLs (e.g. /es/en-slug) are excluded and tagged noindex to prevent duplicate indexing
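
To make the robots.txt behaviour concrete, the sketch below renders a file that names the crawlers listed above and then checks it with Python's standard robots.txt parser. The render_robots_txt and is_allowed helpers and the example.com URLs are hypothetical, and the exact directives ComStack writes are an assumption; only the allow-when-Public, block-otherwise behaviour comes from this page.

```python
from urllib.robotparser import RobotFileParser

# Crawlers named on this page; the real file allows others as well.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def render_robots_txt(access: str) -> str:
    """Allow crawlers when access is Public, block everything otherwise."""
    if access != "Public":
        return "User-agent: *\nDisallow: /\n"
    blocks = [f"User-agent: {bot}\nAllow: /\n" for bot in AI_CRAWLERS]
    blocks.append("User-agent: *\nAllow: /\n")
    return "\n".join(blocks)


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Check a URL against robots.txt text using the standard-library parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


public = render_robots_txt("Public")
unlisted = render_robots_txt("Unlisted")
print(public)
print(is_allowed(public, "GPTBot", "https://example.com/faq/nie-spain"))
print(is_allowed(unlisted, "GPTBot", "https://example.com/faq/nie-spain"))
```

The same is_allowed check can be pointed at the text of a live robots.txt to confirm that a given crawler is not blocked.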
MCP server as a tool surface
Your project is exposed as a tool AI assistants can query directly via the MCP server. An AI can call get-page-content, search-docs, or list-pages to retrieve structured content live — not from a crawl cache.
This makes your site actionable: an AI can answer a specific question by querying the exact page in real time.
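
For orientation, this is roughly what a tool invocation looks like on the wire. MCP messages are JSON-RPC 2.0, and tools are invoked with the tools/call method; the tool names come from this page, but the argument names (slug, query) are hypothetical, and the transport and initialization handshake are not shown.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests each need a unique id


def tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
    )


# Argument names below are assumptions, not the server's documented schema.
print(tool_call("get-page-content", {"slug": "/faq/nie-spain"}))
print(tool_call("search-docs", {"query": "NIE requirements"}))
```

An MCP client would normally discover the real argument schema for each tool by listing the server's tools before sending a request like this.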
hreflang for multilingual sites
Translated slugs (e.g. /guides/buying → /es/guias/comprar) require correct hreflang that can’t be inferred from URL structure. The publishing pipeline injects correct <xhtml:link rel="alternate" hreflang> tags in the sitemap and page <head>, using the actual translated URL.
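
A sitemap entry with translated alternates might be assembled as in the sketch below. The sitemap_url_entry function and the example.com domain are hypothetical; the sketch assumes the common sitemap convention that each URL entry lists every language variant, itself included, and that the enclosing urlset declares the xhtml namespace.

```python
from xml.sax.saxutils import escape, quoteattr


def sitemap_url_entry(loc: str, alternates: dict[str, str]) -> str:
    """Render one sitemap <url> element with hreflang alternates.

    `alternates` maps a language code to the actual translated URL, so the
    tags never depend on guessing a translation from the URL structure.
    The enclosing <urlset> must declare
    xmlns:xhtml="http://www.w3.org/1999/xhtml" for these tags to be valid.
    """
    lines = ["<url>", f"  <loc>{escape(loc)}</loc>"]
    for lang, href in sorted(alternates.items()):
        # quoteattr escapes the value and adds the surrounding quotes.
        lines.append(
            f'  <xhtml:link rel="alternate" hreflang={quoteattr(lang)} '
            f"href={quoteattr(href)}/>"
        )
    lines.append("</url>")
    return "\n".join(lines)


alternates = {
    "en": "https://example.com/guides/buying",
    "es": "https://example.com/es/guias/comprar",
}
print(sitemap_url_entry(alternates["en"], alternates))
```

Because the mapping is keyed on the real translated URLs, a renamed slug only needs an updated entry in the mapping for the regenerated tags to be correct.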
No auto-redirect by locale: pages never serve different content based on Accept-Language or IP. AI crawlers — which send no Accept-Language or a hardcoded en-US default — always see the URL’s actual locale content. This ensures all locale variants stay correctly indexed.
When to use it
These features activate automatically when site access is Public. You can influence specific signals:
| Signal | How to control |
|---|---|
| FAQ ranking in llms.txt | Set metadata.sidebar.order on is_faq: true pages; lower numbers appear first |
| llms.txt preamble | Set settings/llmstxt.content — write the site name, description, and context AI agents should understand about your business |
| Sitemap priority | Set sitemap_priority (0.0–1.0) and sitemap_changefreq on individual pages |
| Per-page social card | Set metadata.og_image for an OpenGraph image (falls back to the project default) |
Common errors
| Error | Cause | Fix |
|---|---|---|
| FAQ not in llms.txt | Page’s is_faq is false, or site access is not Public | Set is_faq: true on the page; verify site access in settings |
| AI crawlers blocked | Site access is not Public | Set site access to Public. When access is Unlisted, Members, Agents, or Disabled, robots.txt blocks all crawlers |
| Wrong URL in hreflang | Slug changed after translation was created | Re-publish — sitemap and hreflang tags are regenerated on every publish |
| JSON-LD missing for a template | Template has no schema_org_type | Declare a schema_org_type on the template. SpeakableSpecification and BreadcrumbList are always emitted, but template-driven JSON-LD requires a declared schema_org_type |
Related
- Translations concept — how multilingual variants are generated
- Languages reference — BCP-47 codes and URL routing
- Solution: ChatGPT doesn’t recommend my business