The complete guide to GEO & AEO.
How to make your website citable by ChatGPT, Claude, Perplexity, Gemini, and Google AI Overviews — built on the peer-reviewed Princeton KDD 2024 paper. Eleven chapters. Eight thousand five hundred words. One source of truth.
AI search is the new search. 25.11% of Google queries trigger AI Overviews; 87% of all AI referral traffic flows through ChatGPT alone; AI traffic converts at 5× the rate of Google organic. If your site isn't cited by AI engines, you're invisible to a large and growing share of high-intent users.
The good news: it's measurable and fixable. The peer-reviewed Princeton KDD 2024 paper tested 9 optimization tactics across 10,000 generative engine queries. Three dominated: emphasize sources (+115%), add expert quotes (+41%), add statistics (+40%). Together they more than double citation likelihood.
This guide teaches the rest. Eleven chapters covering technical foundations (robots.txt, AI bots, schema), content tactics, entity signals, per-engine optimization (ChatGPT, Claude, Perplexity, Gemini, AI Overviews), measurement, common mistakes, and a week-by-week 30-day action plan. Run a free GEO audit on your site to see where you stand before you start.
Chapter 1: The Stakes — Why AI search matters now
For the first thirty years of the web, search meant Google. You typed a query, saw a list of ten blue links, picked one, clicked through. The site that ranked higher got more traffic. Optimize for keywords, build backlinks, ship a sitemap — that was SEO.
That model is breaking. The break happened slowly between 2022 (ChatGPT's launch) and 2025, then accelerated through 2026. Three numbers tell the story:
- 25.11% of Google queries now trigger AI Overviews.
- 87% of all AI referral traffic flows through ChatGPT alone.
- AI traffic converts at 5× the rate of Google organic.
Together these numbers say: AI search is no longer experimental. It's where a large and growing share of high-intent traffic comes from, with 5× better conversion than the channel it's replacing. The companies that get cited by AI engines win the next decade.
What "being cited" actually means
When a user types a question into ChatGPT or Perplexity or Google's AI Overviews, the engine doesn't return ten links. It returns one synthesized answer, drawn from a set of cited sources. Citation looks like this:
"GEO market is projected to reach $33.7B by 2034, growing at a CAGR of 50.5% [1]. The largest single contributor is enterprise content marketing, where AI Overview citation has become the primary source of brand visibility [2]." Hypothetical Perplexity response · sources [1] [2] embedded
Those [1] and [2] are clickable. They link back to the source. Being a source means being the link.
Across all five major engines (ChatGPT, Claude, Perplexity, Gemini, Google AI Overviews), a citation in the answer is worth roughly what a #1 organic search result was in 2015 — the highest-trust, highest-conversion form of inbound visibility available. Yet most websites are not optimized to be cited. They're optimized to rank in lists of links, which is a different problem.
The shift from ranking to citation
Traditional SEO optimizes for ranking. The question is: among ten possible links, will yours appear higher? GEO optimizes for citation. The question is: among the entire indexable web, will the AI engine choose your content as a source for its synthesized answer?
These are different problems with different mechanics. Ranking rewards backlinks, domain authority, keyword targeting. Citation rewards citability — content that is structurally easy to lift, attribute, and quote. The Princeton paper formalized this distinction. The rest of this guide teaches it.
Chapter 2: Definitions — GEO, AEO, and how they relate to SEO
Three acronyms. Easy to confuse. Worth getting right.
SEO — Search Engine Optimization
SEO is what we've done since the late 1990s. It's optimization for traditional, list-of-links search engines: Google, Bing, DuckDuckGo, Yandex, Baidu. The goal is to rank — to appear higher in the list of ten or twenty results returned for a query. Tactics include keyword targeting, backlink building, technical site health, page speed, mobile-friendliness, and structured data.
AEO — Answer Engine Optimization
AEO targets answer surfaces — places where the engine extracts and presents a direct answer to the user's question, not a list of links. Featured snippets in Google. Voice assistants like Alexa and Siri. Google's AI Overviews. Bing's instant answers. AEO emphasizes direct-answer formatting: clear questions and answers, FAQPage schema, concise factual statements, and structured data that makes content extractable.
GEO — Generative Engine Optimization
GEO targets generative AI search — engines that synthesize a response rather than extract a single answer. ChatGPT. Claude. Perplexity. Google Gemini. Generative engines combine information from multiple sources into a fluent, multi-paragraph response, with citations linking back to the sources. GEO emphasizes citability — content that is quotable, attributable, and likely to be selected as one of those sources.
How they relate
Imagine a Venn diagram with three overlapping circles. The overlap is large.
- Most technical tactics overlap all three: HTTPS, mobile-friendliness, fast page load, valid HTML, accessible content. SEO requires these. AEO requires these. GEO requires these.
- Schema markup matters for all three but with different priorities. SEO benefits from any valid markup. AEO especially rewards FAQPage and HowTo. GEO especially rewards Article (with author markup) and Speakable.
- Backlinks matter heavily for SEO, moderately for AEO, and surprisingly little for GEO compared to content-quality signals.
- Content tactics diverge most. SEO rewards keyword density and topic depth. AEO rewards direct-answer formatting. GEO rewards citability — emphasized sources, expert quotes, statistics density.
For most sites, all three matter. The Princeton paper showed that some GEO tactics (source emphasis, expert quotes) are also good SEO tactics — they signal authority and trust regardless of which engine type is consuming the page. That's lucky: doing GEO well typically improves SEO and AEO too.
For more, see our GEO vs AEO vs SEO disambiguation page.
Chapter 3: How AI engines decide what to cite
For most queries, generative engines don't browse the web in real time. They've already crawled, indexed, and partially summarized the web. When you ask a question, the engine performs three steps:
- Retrieval. Find candidate sources from the index that are relevant to the query. This is similar to traditional search ranking — relevance + authority signals decide which 10-50 sources are candidates.
- Synthesis. Read the candidate sources and write a single coherent answer. This is where the LLM does the work. It draws information from multiple sources, paraphrases, combines, and produces a response.
- Attribution. Decide which sources to cite. The engine surfaces a subset of the candidates as the citations shown to the user.
GEO is mostly about steps 2 and 3. Step 1 is largely the same as SEO: be indexable, be relevant, be authoritative. Steps 2 and 3 are where citability comes in.
Why some sources get cited and others don't
The Princeton 2024 paper investigated this question empirically. The team built GEO-bench, a benchmark of 10,000 queries, and tested 9 candidate "optimization tactics" — content modifications applied to source material before re-running each query. They measured which tactics caused the modified source to receive more visibility in the synthesized response.
The findings were sometimes intuitive (citing your sources helps) and sometimes counterintuitive (keyword stuffing actively hurts):
| # | Tactic | Visibility impact |
|---|---|---|
| 1 | Source emphasis | +115% |
| 2 | Expert quotes | +41% |
| 3 | Statistics | +40% |
| 4 | Inline citations | +30% |
| 5 | Authority signaling | +25% |
| 6 | Improved fluency | +15% |
| 7 | Easy-to-read | +12% |
| 8 | Topic relevance | +10% |
| 9 | Keyword stuffing | −22% |
The headline finding: source emphasis — the simple act of explicitly bolding, citing, and attributing your sources — increases citation likelihood by +115%. This is the strongest single tactic. It costs nothing beyond formatting.
For a deeper read on the methodology, see our research foundation page.
Chapter 4: Technical foundations — can AI bots reach your content?
The cheapest, most foundational fix is also the most overlooked: are you allowing AI bots to crawl your site?
The Princeton paper makes this explicit:
"If crawlers can't find and parse your content, prose optimization doesn't matter." (Aggarwal et al., 2024)
The 27 AI bots you should explicitly allow
As of 2026, there are 27 distinct AI crawler user-agents you should know about. Most are operated by major LLM providers; some are operated by emerging engines. We track them in best-aeo-skill:
```
User-agent: GPTBot             # OpenAI training + ChatGPT browse
User-agent: ChatGPT-User       # ChatGPT real-time fetch
User-agent: OAI-SearchBot      # SearchGPT
User-agent: ClaudeBot          # Anthropic search/browse
User-agent: anthropic-ai       # Anthropic web crawler
User-agent: Claude-Web         # Claude.ai user fetches
User-agent: Claude-User        # Claude tool use
User-agent: Claude-SearchBot   # Claude search index
User-agent: PerplexityBot      # Perplexity index
User-agent: Perplexity-User    # Perplexity user fetches
User-agent: Google-Extended    # Bard/Gemini training
User-agent: GoogleOther        # AI Overviews + SGE
User-agent: Applebot           # Spotlight + Siri
User-agent: Applebot-Extended  # Apple Intelligence
User-agent: FacebookBot        # Meta AI
User-agent: Meta-ExternalAgent # Meta tools
User-agent: YouBot             # You.com
User-agent: cohere-ai          # Cohere
User-agent: MistralAI-User     # Mistral fetch
User-agent: CCBot              # Common Crawl (LLM training)
User-agent: Bytespider         # ByteDance / Doubao
User-agent: Diffbot            # Diffbot Knowledge Graph
User-agent: Amazonbot          # Amazon AI
User-agent: DuckDuckBot        # DuckDuckGo
User-agent: YandexBot          # Yandex
User-agent: Bingbot            # Bing index (Copilot)
User-agent: Googlebot          # Standard Google search
Allow: /
```
Block any of these and you cut yourself off from the corresponding engine. Some sites do this thinking they're "protecting their content." In practice, they're invisible to a major share of high-intent users who now ask AI engines instead of typing keywords.
Verify, don't trust
Listing a User-agent in your robots.txt is necessary but not sufficient. Two common failure modes:
- CDN bot management. Cloudflare, Akamai, and other CDNs run their own bot-detection rules independently of robots.txt. As of Q3 2024, Cloudflare added a dedicated "AI Bots" management category that some site owners enable by default — blocking even bots their robots.txt explicitly allows. Check Cloudflare → Security → Bots → AI bots.
- JavaScript rendering. AI bots have inconsistent JavaScript execution. A pure SPA (single-page app) that renders content client-side may appear empty to many bots. Use server-side rendering (SSR) or static generation for content-bearing pages.
The fix: test fetch-as-bot for the engines you care about. Curl with the User-agent header, render the response with a basic HTML parser, confirm your content is in the markup. best-aeo-skill's ai_bot_access evidence collector does this automatically across all 27 agents.
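The fetch-as-bot check can be scripted. A minimal Python sketch; the bot User-Agent values below are abbreviated stand-ins (substitute the full strings each vendor publishes), and the URL and marker phrase are yours to supply:

```python
import urllib.request

# Illustrative stand-ins for the real published User-Agent strings.
AI_BOTS = {
    "GPTBot": "GPTBot/1.0",
    "ClaudeBot": "ClaudeBot/1.0",
    "PerplexityBot": "PerplexityBot/1.0",
}

def fetch_as_bot(url: str, user_agent: str) -> str:
    """Fetch a URL while presenting the given bot User-Agent."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")

def content_visible(html: str, marker: str) -> bool:
    """Naive check: does the marker phrase appear in the served markup?"""
    return marker.lower() in html.lower()

def audit(url: str, marker: str) -> dict:
    """Report, per bot, whether the marker survives in the raw response."""
    return {name: content_visible(fetch_as_bot(url, ua), marker)
            for name, ua in AI_BOTS.items()}
```

If a bot sees an empty shell while a browser sees content, the gap is usually client-side rendering or CDN bot rules, not robots.txt.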
Sitemap, hreflang, and other classics
Standard SEO foundations remain important: a current XML sitemap, valid hreflang tags for multilingual sites, clean 200/301/404 status codes, mobile-friendliness, fast page load. AI engines reuse most of the indexing pipeline traditional search engines built. None of these tactics are GEO-specific, but they're prerequisites.
Chapter 5: Content citability — the tactics that matter most
Once your site is technically accessible, content quality determines whether AI engines actually cite you. This is where the Princeton paper's strongest findings live.
Source emphasis (+115%)
The single highest-leverage tactic. Princeton found that pages which explicitly emphasize their sources — through inline citations, links to primary references, or visible attribution — are 2.15× more likely to be cited by generative engines than equivalent pages without emphasis.
Practical implementation:
- Every numeric claim should link to a primary source at the point of claim, not just at the bottom.
- Use bold or italics on cited entities ("according to Princeton's KDD 2024 paper...").
- Maintain a visible reference list at the bottom of long articles.
- For research-style content, use footnote-style numbered references like [1], [2].
- Cross-link related articles internally — this signals topical authority.
Expert quotes (+41%)
Adding 2-4 attributed quotations per ~1000 words raises citation likelihood by 41%. AI engines treat quoted passages as anchor evidence when synthesizing responses — they're easier to lift verbatim into a synthesized answer.
Two requirements for quotes to count:
- Speaker name and credential. "According to Dr. Jane Smith, professor of computer science at Stanford, ..." — not "according to one expert" or "anonymous source." Anonymous attribution patterns reduce citation rate.
- Quotation marks. Use proper quotation marks or HTML <blockquote> elements. Engines look for these signals.
Statistics density (+40%)
Pages with at least one numeric claim per 200 words receive 40% more citations than thin-statistic pages. The reason: AI engines often need a number to anchor a synthesized answer. Pages that supply the number with a source become natural citations.
Practical guideline: aim for one statistic per 200 words. For a 1500-word article, that's 7-8 numeric claims. For a 600-word post, three. Pair each statistic with a citation to its primary source — then you double-dip on tactic 1 (source emphasis) and tactic 3 (statistics).
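The one-statistic-per-200-words guideline can be self-checked mechanically. A crude sketch that counts digit-bearing tokens as a proxy for numeric claims (it overcounts dates and version numbers, so treat the result as a rough signal):

```python
import re

def statistic_density(text: str) -> float:
    """Return approximate numeric claims per 200 words (target: >= 1.0).

    A 'numeric claim' is approximated as any token containing a digit,
    which is a heuristic, not the engines' actual definition.
    """
    words = text.split()
    if not words:
        return 0.0
    numeric = sum(1 for w in words if re.search(r"\d", w))
    return numeric / (len(words) / 200)
```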
Freshness
Empirical data from large-scale citation analysis shows content under 30 days old receives 3.2× more citations than equivalent older content. Three implications:
- Add a machine-readable dateModified field to your JSON-LD on every content page.
- Update major articles every 30-90 days — actual updates, not just changing the date.
- For evergreen topics, schedule quarterly refreshes to keep the freshness signal alive.
Readability
Princeton found a Flesch-Kincaid grade level around 8-10 performs best for general AI search visibility. Higher (academic-grade text) reduces citations from general engines. Lower (oversimplified, listicle-style) reduces authority signaling.
Mix sentence lengths. Use short sentences (under 15 words) and medium sentences (15-25 words) in roughly equal proportion. Monotonic sentence length is a signal of AI-generated content, which engines increasingly de-cite.
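A quick way to audit the short/medium mix is to bucket sentence lengths. A sketch with naive sentence splitting on terminal punctuation (abbreviations and decimals will confuse it):

```python
import re

def sentence_length_mix(text: str) -> dict:
    """Bucket sentences as short (<15 words), medium (15-25), long (>25).

    Target: roughly equal short and medium shares. A monotone length
    distribution is one tell of machine-generated prose.
    """
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    counts = {"short": 0, "medium": 0, "long": 0}
    for s in sentences:
        n = len(s.split())
        if n < 15:
            counts["short"] += 1
        elif n <= 25:
            counts["medium"] += 1
        else:
            counts["long"] += 1
    return counts
```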
Avoid AI-rewrite tells
Modern detection systems flag patterns common in machine-generated content. Avoid:
- "It's important to note" — overused by LLMs as a transition.
- "In conclusion" — formulaic; rewrite the closing paragraph.
- "Furthermore" — almost never needed in human writing.
- "Delve into" — heavily flagged.
- Every paragraph starting with the same connector ("Moreover... Additionally... Finally...").
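A simple phrase scan catches the worst offenders before publish. The list below contains only the four tells named above; extend it with your own:

```python
# The four boilerplate phrases flagged above; extend as needed.
AI_TELLS = [
    "it's important to note",
    "in conclusion",
    "furthermore",
    "delve into",
]

def find_ai_tells(text: str) -> list:
    """Return which flagged phrases appear in the text (case-insensitive)."""
    lowered = text.lower()
    return [phrase for phrase in AI_TELLS if phrase in lowered]
```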
Chapter 6: Structured data — making your content parseable
Structured data (JSON-LD via Schema.org) is how you tell engines what your content is. AI search engines lean on it heavily — much more so than traditional Google ever did.
FAQPage — the highest-leverage single signal
Across all schema types, FAQPage produces the highest single-signal AI citation rate. Why: AI engines often surface answers to user questions by extracting Q&A pairs from FAQPage schema directly. If your page has FAQ markup, you become a candidate citation for every related question.
Best practices:
- 5-10 question-answer pairs per page is ideal. Below 3 reduces signal weight; above 15 dilutes.
- Each question must be a real question — phrased with a question mark.
- Each answer should be 30-200 words. Too short looks unauthoritative; too long becomes unextractable.
- Use real user questions (from search console, support tickets, sales calls) — not synthetic Q&A. AI engines detect mass-generated FAQ content and de-cite it.
```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is generative engine optimization?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Generative engine optimization (GEO) is the practice of structuring web content to maximize..."
      }
    }
  ]
}
</script>
```
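The best-practice thresholds above (5-10 pairs, real questions, 30-200-word answers) can be linted mechanically before deploy. A rough sketch that complements, not replaces, the Schema.org and Rich Results validators:

```python
import json

def lint_faq_schema(jsonld: str) -> list:
    """Rough lint of a FAQPage JSON-LD block against the guidelines above.

    Returns a list of human-readable issues; empty means it passes
    these heuristic checks (not a guarantee of valid schema).
    """
    issues = []
    data = json.loads(jsonld)
    pairs = data.get("mainEntity", [])
    if not 5 <= len(pairs) <= 10:
        issues.append(f"{len(pairs)} Q&A pairs (ideal: 5-10)")
    for pair in pairs:
        name = pair.get("name", "")
        if not name.endswith("?"):
            issues.append(f"not a question: {name!r}")
        answer = pair.get("acceptedAnswer", {}).get("text", "")
        n = len(answer.split())
        if not 30 <= n <= 200:
            issues.append(f"answer is {n} words (ideal: 30-200)")
    return issues
```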
Article + author markup
Every content page should have Article (or BlogPosting or NewsArticle) markup with these required fields:
- headline — page title
- datePublished + dateModified — ISO-8601 timestamps
- author — Person markup with name, jobTitle, sameAs links
- publisher — Organization markup with logo
Anonymous authorship reduces citation rate by ~60%. If your articles don't have a named author with credentials, fix this first. The Person markup should include sameAs URLs that resolve — LinkedIn, Twitter/X, GitHub, ORCID for academics, Wikipedia if applicable.
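A minimal Article block covering the four required fields might look like the following sketch; every name, date, and URL is a placeholder to be replaced with your own:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "The Complete Guide to GEO & AEO",
  "datePublished": "2026-01-15T09:00:00Z",
  "dateModified": "2026-02-01T09:00:00Z",
  "author": {
    "@type": "Person",
    "name": "Jane Smith",
    "jobTitle": "Head of Content",
    "sameAs": [
      "https://www.linkedin.com/in/example",
      "https://x.com/example"
    ]
  },
  "publisher": {
    "@type": "Organization",
    "name": "Example Co",
    "logo": { "@type": "ImageObject", "url": "https://example.com/logo.png" }
  }
}
</script>
```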
Organization with sameAs
Your site's Organization markup should include sameAs URLs that link to authoritative entities for the same organization. Wikidata is the most achievable; Wikipedia is the most prestigious. Crunchbase, LinkedIn, and major industry registries also count. The more sameAs links, the stronger the Knowledge Graph trust signal.
HowTo, Speakable, Product, BreadcrumbList
Other schema types worth adding where applicable:
- HowTo — for tutorials and step-by-step guides. Required fields: name, step list with HowToStep entities, optional images per step.
- Speakable — Speakable Specification markup tells voice-AI surfaces (Google Assistant, Alexa) which sentences/headings to read aloud. Critical for AI Overview voice variants.
- Product + AggregateRating + offers — for e-commerce. Without these, you can't be cited in AI buying-guide responses.
- BreadcrumbList — on every non-home page. Improves contextual citation by clarifying page hierarchy.
Always validate before deploy. Schema.org's validator and Google's Rich Results test catch most errors. best-aeo-skill's schema_validate evidence collector runs both.
Chapter 7: Entity and brand signals — do you exist as an authority?
Sustained citation requires entity presence. A page can have perfect technical setup, perfect schema, perfect content — and still get cited rarely if the underlying organization or author is unknown to the AI engines' Knowledge Graph.
Author bios with credentials
Author bios should include:
- Full name
- Role and organization
- Years of experience
- 2+ credentials (PhD, MD, certifications, named publications, awards)
- Links to verified profiles (LinkedIn, ORCID, Twitter)
- A photo (real, not stock)
Author profile pages with full Person schema increase article citation rate by ~20%.
Wikidata before Wikipedia
A Wikipedia entry is hard to get and easy to lose. A Wikidata entity (a Q-number) is much more achievable and provides similar trust-signal value. Wikidata is open to community editing with relaxed notability requirements; if you have any verifiable third-party coverage, you can be in Wikidata.
Once your organization has a Q-number, link to it from your Organization schema's sameAs field. This creates a verifiable Knowledge Graph trust path that AI engines can follow.
NAP consistency for local
For local businesses, Name / Address / Phone (NAP) consistency across surfaces is non-negotiable:
- Website footer
- Google Business Profile (GBP)
- Apple Maps profile
- Yelp, Tripadvisor, industry directories
- LocalBusiness JSON-LD on the website
Inconsistent NAP reduces local AI Overview citation by ~40%. The fix is mechanical: write down the canonical NAP once, audit every surface, fix mismatches.
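"Mechanical" can be literal: normalize each surface's NAP and diff. A sketch in which the normalization rules (lowercase, strip punctuation, digits-only phone numbers) are our assumption of what counts as a cosmetic difference:

```python
import re

def normalize_nap(name: str, address: str, phone: str) -> tuple:
    """Normalize a Name/Address/Phone triple so cosmetic differences
    (case, punctuation, phone formatting) don't count as mismatches."""
    def norm(s: str) -> str:
        return re.sub(r"[^a-z0-9 ]", "", s.lower()).strip()
    return (norm(name), norm(address), re.sub(r"\D", "", phone))

def nap_consistent(listings: list) -> bool:
    """True if every surface's normalized NAP matches the first one."""
    normalized = [normalize_nap(*listing) for listing in listings]
    return all(n == normalized[0] for n in normalized)
```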
Brand mentions and external signals
AI engines can detect when a brand is widely mentioned across the web — through citations, social, news coverage, podcast guest appearances, conference speaker activity. These are slow-moving signals; you can't fake them. They build over years through consistent activity.
Two practical accelerators:
- Original research / annual reports. A recurring data study (e.g., "State of X 2026") that other publications cite is the highest-leverage single move for entity authority.
- Podcast appearances. Each verified guest spot adds an entity signal. Document them with sameAs links from your Person schema to the episode pages.
Chapter 8: Multi-engine optimization — each AI search surface is different
Different engines weight signals differently. Optimizing only for ChatGPT is leaving Perplexity, Claude, and AI Overviews on the table. Here's how each major engine prefers to be served:
ChatGPT
- ~87% of all AI referral traffic, by far the largest. Prioritize this.
- Favors authoritative long-form (1500+ words), consensus-based content, with explicit attribution.
- Prefers content with clear "this is the answer" framing. Hedged content ("it depends," "may," "might") gets cited less.
- Surfaces source links in the response when browsing is enabled (most queries in 2026).
Claude (Anthropic)
- Rewards precise attribution and high factual density.
- Most likely of the major engines to cite multiple sources for a single claim — making your page useful even if you're one of three sources, not the only one.
- Tends to prefer sources with a dateModified within the last 90 days.
- Officially supports llms.txt for ClaudeBot — though only a small share of crawls use it.
Perplexity
- Favors academic and news sources, heavy citation density, fresh content.
- Citation extraction prefers explicit numbered references ([1] [2] [3]) or footnote-style attribution.
- Best engine to optimize for if you write data-rich, citation-heavy content.
Google AI Overviews
- Triggers on ~25.11% of all Google searches in 2026 (up from 13.14% in March 2025). For local queries, the rate is ~38%.
- Leans on traditional SEO best practices PLUS direct-answer formatting.
- Rewards FAQPage schema heavily. If your AI Overview presence is low, add FAQ schema first.
- Speakable schema increases voice-variant inclusion.
Gemini
- Blends traditional Google ranking signals with AI-specific signals.
- Optimizing for both Google search AND AI Overviews tends to lift Gemini citation rates as a side effect.
- Less standalone optimization needed than ChatGPT/Perplexity.
The practical takeaway: optimize broadly. Most signals overlap. The specific differences between engines matter most for sites doing serious volume — for early-stage optimization, get the basics right across the board, then differentiate.
Chapter 9: Measurement — tracking GEO performance over time
You can't improve what you can't measure. Three layers of measurement, ordered by sophistication:
Level 1: Composite GEO Score (weekly)
Run an audit (we recommend our free tool) on your top 10-20 pages weekly. Track the composite score over time. The goal isn't to hit 100 — it's to detect regressions.
A score that drifts down 5+ points in a week is a signal something broke: your CDN started blocking GPTBot, a deploy stripped your JSON-LD, a content update removed expert quotes. Catch it within a week instead of finding out months later when traffic is gone.
Level 2: AI traffic attribution (monthly)
In your analytics platform, segment traffic by AI referrer:
- chat.openai.com
- perplexity.ai
- claude.ai
- gemini.google.com
- Google with parameter udm=14 (AI mode)
Track sessions, conversions, and conversion rate per AI source. Compare against organic search and direct traffic. If your AI traffic is converting at significantly better rates (it usually does — typically 3-5×), invest more.
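If your analytics tool exports raw referrers, a small classifier covers the list above. A sketch; the hostnames are the commonly observed ones and may change over time:

```python
from urllib.parse import urlparse, parse_qs

# Referrer hostnames observed for each AI source; verify against your logs.
AI_REFERRERS = {
    "chat.openai.com": "ChatGPT",
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "www.perplexity.ai": "Perplexity",
    "claude.ai": "Claude",
    "gemini.google.com": "Gemini",
}

def classify_ai_referrer(referrer_url: str):
    """Map a raw referrer URL to an AI source label, or None.

    Google visits carrying udm=14 are counted as Google AI mode."""
    parsed = urlparse(referrer_url)
    host = parsed.netloc.lower()
    if host in AI_REFERRERS:
        return AI_REFERRERS[host]
    if "google." in host and parse_qs(parsed.query).get("udm") == ["14"]:
        return "Google AI mode"
    return None
```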
Level 3: Brand mention tracking (quarterly)
Tools like OtterlyAI and Brand24 monitor mentions of your brand in AI search responses. They track which queries trigger your brand to appear, in which engines, with what context. This is the most expensive layer but the most strategic — it tells you where you're winning or losing share-of-voice in AI search.
Content decay detection
Articles older than 90 days with declining citation rates are decaying. Don't ignore them. Refresh signals:
- Add new statistics from the past 90 days.
- Update dateModified in JSON-LD.
- Add new expert quotes.
- Re-link to current primary sources.
- Don't just change the date — actually update the content.
best-aeo-skill's monitor sub-skill flags articles showing decay automatically.
Chapter 10: Pitfalls — 10 anti-patterns that destroy citation rates
- Keyword stuffing. The Princeton paper measured a -22% effect. AI engines penalize stuffing more aggressively than Google does.
- Synthetic FAQ schema. Mass-generated Q&A that doesn't match real user questions is detected and de-cited.
- AI-generated boilerplate. Detection is increasingly accurate. Once flagged, content gets de-cited even if facts are correct.
- Blocking AI bots while expecting AI citations. Some sites block GPTBot then complain about no ChatGPT visibility. You can't have both.
- Anonymous authorship. Reduces citation rate by ~60%. Always name authors with credentials.
- Removing URLs without 301s. Citation links break, citation history is lost. Always 301 redirect.
- Hiding content behind cookie banners or modals at first paint. AI bots see what's rendered initially. If your modal blocks the content, it's invisible.
- Pure SPAs without SSR. JavaScript-only rendering loses many AI bots. Use SSR or static generation for content pages.
- Stale content with no dateModified. 3.2× citation drop after 30 days for unrefreshed content.
- Optimizing for one engine in isolation. Don't lose Perplexity to win ChatGPT. Most optimizations work across engines.
Chapter 11: The 30-day action plan
Theory only matters if you ship. Here's a concrete week-by-week plan to take a site from "no GEO" to "actively cited":
Week 1: Technical foundations
- Run an audit on your homepage and top 10 pages. Establish baseline GEO score.
- Patch robots.txt to explicitly Allow all 27 AI bots listed in Chapter 4.
- Verify CDN bot management isn't overriding your robots.txt rules.
- Submit your XML sitemap to Google Search Console and Bing Webmaster.
- Generate /llms.txt at root.
- Generate /.well-known/ai.txt.
Week 2: Schema deployment
- Add FAQPage schema to top 5 pages. Use real user questions.
- Add Article + Person author markup to all blog posts.
- Add Organization + sameAs links to Wikidata and LinkedIn.
- Validate every schema block with the Schema.org validator and Google Rich Results test.
Week 3: Content citability
- Pick 5 highest-traffic pages.
- Add 2-4 expert quotes per 1000 words. Use real attributions, not anonymous.
- Add inline citations to every numeric claim.
- Bold or emphasize source mentions where they appear.
- Update dateModified to today's date.
Week 4: Entity and brand
- Create or update author bios. Add 2+ credentials. Link to LinkedIn/ORCID.
- If your organization isn't on Wikidata, create a Q-number entry.
- Audit NAP consistency across all directories (for local businesses).
- Start tracking AI referral traffic in your analytics.
- Re-run the audit. Compare to Week 1 baseline.
Realistic results from this 30-day plan: composite GEO score lift of 15-30 points (e.g., 60 → 80), AI referral traffic up 2-4× within 60-90 days as the engines re-crawl. The best-case studies (HubSpot's documented 6× AI-trial lift) come from sustained 6-12 month investment, not 30 days. But 30 days gets you visible.
Reference: Glossary
Twenty-five terms used throughout this guide.
- AEO
- Answer Engine Optimization. Targeting answer surfaces (featured snippets, voice, AI Overviews) where the engine returns a direct answer.
- AI Overviews
- Google's AI-generated answer summaries that appear above traditional search results. Replaced "Search Generative Experience" (SGE) in 2024.
- BreadcrumbList
- Schema.org type for representing page hierarchy. Improves contextual citation.
- ClaudeBot
- Anthropic's web crawler user-agent for Claude search and browse functionality.
- CCBot
- Common Crawl's user-agent. Many LLMs train on Common Crawl, so blocking CCBot reduces training-data presence.
- Composite GEO Score
- A 0-100 score that weights Technical, Citability, Schema, and Entity signals into a single number.
- Confidence label
- One of Confirmed, Likely, or Hypothesis — anti-hallucination markers we attach to every audit finding.
- FAQPage
- Schema.org type for question-and-answer content. Highest single-signal AI citation rate.
- GEO
- Generative Engine Optimization. Targeting generative AI engines (ChatGPT, Claude, Perplexity, Gemini).
- GEO-bench
- The 10,000-query benchmark Princeton built to measure GEO tactics.
- GPTBot
- OpenAI's web crawler user-agent for training and ChatGPT browse.
- HowTo
- Schema.org type for step-by-step tutorials.
- JSON-LD
- JSON for Linked Data. The recommended format for Schema.org structured data.
- llms.txt
- An emerging standard at llmstxt.org for site-level AI catalogs. Officially honored by Anthropic for ClaudeBot.
- NAP
- Name / Address / Phone. Consistency across surfaces is critical for local AI search.
- PAWC
- Position-Adjusted Word Count. Princeton's primary visibility metric in their KDD 2024 paper.
- Perplexity
- An AI search engine that emphasizes citation-heavy, academic-style responses.
- PerplexityBot
- Perplexity's index crawler. Distinct from Perplexity-User which handles real-time user fetches.
- Schema.org
- The shared vocabulary for structured data on the web. Used via JSON-LD.
- sameAs
- A Schema.org property linking an entity to its representations elsewhere (Wikidata, Wikipedia, LinkedIn).
- SARIF
- Static Analysis Results Interchange Format. Used for CI/CD integration of code-scanning results.
- SEO
- Search Engine Optimization. Targeting traditional list-of-links search engines.
- SGE
- Search Generative Experience. Google's experimental AI search, renamed to AI Overviews in 2024.
- Speakable
- Schema.org property marking sentences that should be read aloud by voice AI.
- Wikidata
- Open knowledge base of entities. Q-numbers are unique identifiers. Often achievable when Wikipedia is not.
If this guide was useful — start with our free tool. Audit your site in 60 seconds, see your composite GEO score and ranked findings, then install best-aeo-skill to apply fixes with one command.
Run free audit → Install bestaeo → Research foundation → Read SKILL.md →