What is Generative Engine Optimization (GEO)?
GEO is the practice of structuring web content so that generative AI search engines (ChatGPT, Claude, Perplexity, Gemini) find your content quotable and worth citing in their synthesized responses. Formalized by a Princeton-led research team in a late-2023 preprint and published at KDD 2024, it's the natural successor to SEO for an era when search returns synthesized answers, not lists of links.
Generative Engine Optimization (GEO) is the discipline of structuring web content to maximize visibility and citation in responses generated by AI search engines that synthesize answers across multiple sources, rather than returning lists of links.
Where traditional SEO optimizes for ranking (will my page appear in the top 10 blue links?), GEO optimizes for citation (will an AI engine choose my content as a source when synthesizing an answer to a user's question?).
GEO addresses a fundamental shift in search behavior: in 2026, 25.11% of all Google queries trigger AI Overviews; 87% of AI referral traffic flows through ChatGPT alone; and AI traffic converts at 5× the rate of traditional Google organic. The companies that get cited by generative engines win the next decade of search.
The term "Generative Engine Optimization" was coined in November 2023 with the arXiv preprint of "GEO: Generative Engine Optimization" by Aggarwal, Murahari, Rajpurohit, Kalyan, Narasimhan, and Deshpande — a Princeton-led team. Before that paper, optimization advice for AI-generated answers existed only as practitioner blog posts and vendor whitepapers, with no peer-reviewed validation.
The paper was formally presented at KDD 2024, the Association for Computing Machinery's premier conference on knowledge discovery and data mining. It introduced three formalisms that became the foundation of the field:
- GEO-bench: a 10,000-query benchmark spanning 9 domains, used to measure citation visibility experimentally.
- Position-Adjusted Word Count (PAWC): a metric that scores how much of a synthesized response is sourced from a given page, weighted by where in the response it appears (sketched after this list).
- Subjective Impression: a complementary judges' rating of how prominently a source is featured.
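To make the PAWC idea concrete, here is a minimal Python sketch. It is an illustration of the concept only, not the paper's exact formula: the exponential position decay below is an assumption, and the real metric's weighting differs in detail.

```python
import math

def position_adjusted_word_count(response_sentences, source_id):
    """PAWC-style score: sum the word counts of sentences attributed
    to a given source, down-weighting sentences that appear later in
    the response. Exponential decay is assumed for illustration."""
    total = len(response_sentences)
    score = 0.0
    for pos, (sentence, cited_source) in enumerate(response_sentences):
        if cited_source == source_id:
            # Earlier sentences get weight near 1; later ones decay toward 1/e.
            weight = math.exp(-pos / total)
            score += len(sentence.split()) * weight
    return score

# Usage: each synthesized sentence paired with the source it cites.
response = [
    ("GEO was formalized by a Princeton-led team.", "example.com/geo"),
    ("AI Overviews now appear on a quarter of queries.", "other.org/stats"),
]
print(position_adjusted_word_count(response, "example.com/geo"))
```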
Two years later, the paper has accumulated hundreds of citations and remains the only widely cited peer-reviewed work that quantifies which tactics actually move the needle. Every other "GEO study" you'll see, whether from agencies, vendors, or commentators, either cites this paper or makes uncalibrated claims.
For a deeper read on the methodology, see our research foundation page.
Generative engines don't browse the web in real time for every query. They've already crawled, indexed, and partially summarized the web. When a user asks a question, the engine performs three steps:
- Retrieval. Find candidate sources from the index that are relevant to the query. This is similar to traditional search ranking — relevance + authority signals decide which 10–50 sources are candidates.
- Synthesis. Read the candidate sources and write a single coherent answer. The LLM draws information from multiple sources, paraphrases, combines, and produces a response.
- Attribution. Decide which sources to cite. The engine surfaces a subset of the candidates as the citations shown to the user.
GEO is mostly about steps 2 and 3. Step 1 is largely the same problem as SEO: be indexable, be relevant, be authoritative. Steps 2 and 3 are where citability comes in — content structured to be quotable, attributable, and likely to be selected as a source.
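A minimal sketch of that three-step flow, with hypothetical function and object names standing in for each engine's internals (no real engine exposes this interface):

```python
def answer_query(query, index, llm, max_candidates=30, max_citations=8):
    """Illustrative retrieval -> synthesis -> attribution pipeline.
    `index`, `llm`, and their methods are hypothetical stand-ins."""
    # Step 1: Retrieval. Classic relevance and authority ranking, as in SEO.
    candidates = index.search(query, top_k=max_candidates)

    # Step 2: Synthesis. One coherent answer drawn from many sources.
    answer = llm.generate(
        prompt=(
            "Answer the question using the sources below.\n"
            f"Question: {query}\n"
            f"Sources: {candidates}"
        )
    )

    # Step 3: Attribution. Only a subset of candidates surfaces as citations;
    # quotable, clearly attributed content is likelier to make the cut.
    citations = llm.select_citations(answer, candidates)[:max_citations]
    return answer, citations
```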
Princeton's 2024 paper tested 9 candidate optimization tactics by applying each to source content and re-running the 10,000 GEO-bench queries. The authors measured the change in visibility (PAWC) of the modified source in the new synthesized response. Findings:
| # | Tactic | Visibility impact (change in PAWC) |
|---|---|---|
| 1 | Source emphasis (bold citations, prominent attribution) | +115% |
| 2 | Expert quotes (2–4 attributed quotations per 1000 words) | +41% |
| 3 | Statistics density (1 numeric claim per 200 words) | +40% |
| 4 | Inline citations (links at point of claim) | +30% |
| 5 | Authority signaling (credentials, named contributors) | +25% |
| 6 | Improved fluency (natural language, varied sentences) | +15% |
| 7 | Easy-to-read (Flesch-Kincaid grade 8–10) | +12% |
| 8 | Topic relevance (one primary topic per page) | +10% |
| 9 | Keyword stuffing | −22% |
The headline finding: source emphasis alone increases citation likelihood by 115%, a 2.15× lift achievable through formatting changes only, with no new content needed. Most existing content on the internet under-emphasizes its sources, which helps explain why so many sites are dissatisfied with their AI search performance.
"Source emphasis was the strongest single effect of any tactic tested." (Aggarwal et al., 2024, Section 5.2)
The paper also identified tactics that reduce visibility. Keyword stuffing was the most prominent, confirming that the tactic that hurts modern Google rankings hurts generative engines even more aggressively: a 22% drop in visibility.
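To gauge roughly where an existing page stands against tactics 2 and 3 (expert quotes per 1,000 words, numeric claims per 200 words), a crude self-check sketch follows. The regexes are illustrative heuristics, not the paper's measurement method:

```python
import re

def tactic_density_check(text):
    """Crude heuristics for two of the quantified tactics:
    quote density (target: 2-4 per 1,000 words) and
    statistics density (target: ~1 numeric claim per 200 words)."""
    words = len(text.split())
    quotes = len(re.findall(r'"[^"]{20,}"', text))       # quoted passages of 20+ chars
    numbers = len(re.findall(r'\d+(?:\.\d+)?%?', text))  # bare numbers and percentages
    return {
        "quotes_per_1000_words": quotes / words * 1000 if words else 0,
        "stats_per_200_words": numbers / words * 200 if words else 0,
    }

sample = ('In 2026, 25.11% of queries trigger AI Overviews. "Source emphasis '
          'was the strongest single effect," the authors note.')
print(tactic_density_check(sample))
```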
If your business depends on inbound web traffic, you need GEO. The relevant question is which tactics matter most for your specific surface:
- SaaS / B2B: Be cited when ChatGPT users ask "what's the best tool for X." Focus on FAQPage schema, founder/author authority, and product comparison content with sourced statistics.
- E-commerce: Win product comparison queries in AI Overviews and Perplexity buying guides. Focus on Product schema, AggregateRating markup, and review-based content (example markup after this list).
- Publishers / news: Maximize citation frequency in Perplexity, ChatGPT, and AI Overviews. Focus on Article schema with author credentials, freshness, and high citation density.
- Local businesses: Win "near me" queries in AI Overviews. Focus on LocalBusiness schema, NAP consistency, and Knowledge Graph linking.
- DevTools / API docs: Get cited when developers ask Claude "how to do X with Y library." Focus on HowTo and Speakable schema, code-example density, and SDK documentation citability.
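For the e-commerce case above, a minimal Product plus AggregateRating markup sketch. Every value is a placeholder; the property names come from the schema.org vocabulary:

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Widget",
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.6",
    "reviewCount": "128"
  },
  "review": {
    "@type": "Review",
    "author": { "@type": "Person", "name": "Jane Doe" },
    "reviewBody": "Placeholder review text."
  }
}
```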
Across all use cases, the common foundation is Princeton's 9 tactics — specifically source emphasis, expert quotes, and statistics, which together raise citation likelihood by up to 200%.
The fastest way to begin is to measure where you stand. Run a free GEO audit on your homepage or top-traffic pages. The audit returns a 0–100 composite GEO Score across 4 vectors (Technical, Citability, Schema, Entity), confidence-labeled findings, and a ranked list of fixes with projected score impact.
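As a sketch of how a 0–100 composite might blend the four vector scores (the equal weights below are an assumption for illustration, not the audit's actual formula):

```python
def composite_geo_score(technical, citability, schema, entity):
    """Weighted blend of four 0-100 vector scores into one 0-100 composite.
    Equal weights are assumed here purely for illustration."""
    weights = {"technical": 0.25, "citability": 0.25,
               "schema": 0.25, "entity": 0.25}
    return round(technical * weights["technical"]
                 + citability * weights["citability"]
                 + schema * weights["schema"]
                 + entity * weights["entity"])

print(composite_geo_score(72, 45, 30, 58))  # -> 51
```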
Most sites fall into the Foundation band (36–67) on first audit. The first 30 days of work typically lift composite scores by 15–30 points. The biggest leverage points, in order (illustrative snippets follow the list):
- Add FAQPage schema — the highest single-signal AI citation surface.
- Allow all 27 AI bots in your robots.txt explicitly, and verify your CDN doesn't override those rules.
- Add author markup with credentials — anonymous authorship reduces citation rate by ~60%.
- Emphasize sources in existing content — Princeton's strongest finding, free to apply.
- Generate /llms.txt — Anthropic honors it for ClaudeBot, and the cost is near-zero.
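Illustrative snippets for the leverage points above. First, FAQPage markup with an attributed author (points 1 and 3 together); every name and value is a placeholder:

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is Generative Engine Optimization?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "GEO structures web content so AI search engines cite it when synthesizing answers."
    }
  }],
  "author": {
    "@type": "Person",
    "name": "Jane Doe",
    "jobTitle": "Head of Search",
    "sameAs": "https://www.linkedin.com/in/example"
  }
}
```

Next, a robots.txt fragment allowing a few of the well-known AI crawlers explicitly (a partial list for illustration, not all 27):

```
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /
```

Finally, a skeletal /llms.txt following the format of the llms.txt proposal (a plain markdown file: an H1 title, a blockquote summary, then sections of links); the URLs are placeholders:

```
# Example Co

> One-sentence summary of what the site covers and who it is for.

## Docs

- [Getting started](https://example.com/docs/start): product overview
- [API reference](https://example.com/docs/api): endpoints and authentication
```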
For the complete week-by-week roadmap, see the Ultimate Guide to GEO & AEO, Chapter 11 — the 30-day action plan.