ChatGPT Cites Only 1.93% of Reddit Pages — What 1.4M Prompts Reveal About AI Citation Mechanics
Ahrefs analyzed 1.4 million ChatGPT prompts and found Reddit is retrieved constantly but almost never cited. Plus: IAB data shows social media ads overtaking search for the first time at $117B vs $114B.
A new study from Ahrefs, published April 15, 2026, offers the most granular look yet at how ChatGPT decides which sources to cite — and which to silently consume. The research analyzed 1.4 million ChatGPT 5.2 desktop prompts from February 2025, tracking every URL retrieved and whether it ultimately received a citation.
The headline finding: ChatGPT's overall citation rate sits at almost exactly 50/50, with 49.98% of retrieved URLs cited and 50.02% not. But that average masks dramatic disparities by source type, with Reddit sitting at the extreme end of the "used but rarely credited" spectrum.
1. The Ahrefs Study: Methodology and Scale
The study, authored by Louise Linehan and Xibeijia Guan, used cosine similarity scores computed from open-source embeddings to approximate ChatGPT's internal semantic matching process. This allowed the researchers to reverse-engineer how the model evaluates title-to-query relevance when deciding which retrieved pages to cite.
Across 1.4 million prompts, ChatGPT retrieved an average of roughly 16.6 URLs per prompt, nearly identical for both cited pages (16.57) and non-cited pages (16.58). The size of the retrieval set therefore does not separate cited from non-cited pages; the discrimination happens downstream, during the citation selection phase.
2. Citation Rates by Source Type — The Reddit Anomaly
The most striking finding is the citation rate breakdown by source type. Standard web search results are cited at an 88.46% rate, meaning nearly 9 in 10 retrieved search pages make it into the response. Reddit, by contrast, is cited only 1.93% of the time, despite accounting for roughly a third of all retrieved URLs (about 16.2 million).
| Source Type | Citation Rate | Total Retrieved URLs |
|---|---|---|
| Web Search | 88.46% | 25,563,589 |
| News | 12.01% | 3,940,537 |
| Reddit | 1.93% | 16,182,976 |
| YouTube | 0.51% | 953,693 |
| Academia | 0.40% | 185,337 |
The distribution of non-cited URLs makes Reddit's role even starker: more than two-thirds of all uncited URLs in ChatGPT responses (67.8%) come from Reddit. Reddit functions as a massive context reservoir. ChatGPT reads it voraciously to understand sentiment, user experiences, and conversational knowledge, but then cites more "authoritative" web search results instead.
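The aggregate figures can be cross-checked against the table. The sketch below is a back-of-the-envelope recomputation and not the study's own pipeline: it assumes each citation rate applies to the full retrieved-URL count in the same row, and the names `SOURCES` and `summarize` are made up for illustration.

```python
# Figures copied from the citation-rate table:
# source type -> (citation rate in percent, total retrieved URLs)
SOURCES = {
    "Web Search": (88.46, 25_563_589),
    "News": (12.01, 3_940_537),
    "Reddit": (1.93, 16_182_976),
    "YouTube": (0.51, 953_693),
    "Academia": (0.40, 185_337),
}


def summarize(sources):
    """Return (overall citation rate %, {source: share of non-cited URLs %})."""
    # Estimated cited and non-cited URL counts per source type.
    cited = {name: total * rate / 100 for name, (rate, total) in sources.items()}
    uncited = {name: sources[name][1] - cited[name] for name in sources}

    total_urls = sum(total for _, total in sources.values())
    total_uncited = sum(uncited.values())

    overall_rate = 100 * sum(cited.values()) / total_urls
    uncited_share = {name: 100 * n / total_uncited for name, n in uncited.items()}
    return overall_rate, uncited_share


overall_rate, uncited_share = summarize(SOURCES)
print(f"Overall citation rate: {overall_rate:.2f}%")  # 49.98%
print(f"Reddit share of non-cited URLs: {uncited_share['Reddit']:.1f}%")  # 67.8%
```

With the table's numbers, this reproduces both the 49.98% overall citation rate and Reddit's 67.8% share of non-cited URLs, so the per-source rows and the headline figures are internally consistent.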
3. What Actually Gets Cited: URLs, Titles, and Fanout Queries
Beyond source type, three specific factors significantly predict citation probability.
URL Structure: 8.67 Percentage Point Advantage
Pages with natural language, descriptive URL slugs (e.g., /how-to-optimize-meta-descriptions) achieve an 89.78% citation rate compared to 81.11% for opaque or non-semantic URLs (e.g., /article/58291). That 8.67 percentage point gap represents a meaningful optimization lever — something our URL Slug Generator can help with directly.
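As a rough illustration of the difference between the two URL styles, here is a minimal sketch that builds a descriptive slug from a title and flags opaque paths. The `looks_descriptive` heuristic (at least two alphabetic words in the last path segment) is an assumption made for this example; the study's own definition of a "natural language" URL is not given here.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a page title into a lowercase, hyphen-separated URL slug."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


def looks_descriptive(path: str, min_words: int = 2) -> bool:
    """Heuristic only: the last path segment has at least `min_words` alphabetic words."""
    segment = path.rstrip("/").rsplit("/", 1)[-1]
    words = [w for w in re.split(r"[-_]+", segment) if w.isalpha()]
    return len(words) >= min_words


print(slugify("How to Optimize Meta Descriptions"))  # how-to-optimize-meta-descriptions
print(looks_descriptive("/how-to-optimize-meta-descriptions"))  # True
print(looks_descriptive("/article/58291"))  # False
```

A check like this could run in a CMS publish hook or a site audit script to catch numeric or hash-based paths before they go live.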
Title-to-Query Semantic Alignment
Cosine similarity scores between page titles and queries tell a clear story:
| Comparison | Cosine Similarity |
|---|---|
| User prompt vs. cited URL title | 0.602 |
| User prompt vs. non-cited URL title | 0.484 |
| Fanout sub-query vs. cited title (best match) | 0.656 |
The Fanout Query Mechanism — The Most Actionable Finding
ChatGPT does not simply match pages against the user's original prompt. It generates internal "fanout" sub-queries — decomposing the user's question into specific information needs — and then matches pages against these narrower sub-questions. The highest citation probability goes to pages whose titles align with these granular sub-queries (0.656 cosine similarity) rather than the broad original prompt (0.602). This finding aligns with the agentic search patterns we've been tracking, where AI systems decompose queries before acting.
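To make the comparison concrete, the sketch below scores one title against a broad prompt and against a set of sub-queries, then keeps the best sub-query match. It rests on several assumptions: the study used open-source embeddings, whereas this example substitutes simple word-count vectors so it runs without a model, and the prompt, sub-queries, and title are invented. The scores are therefore not comparable to the 0.602 and 0.656 figures above.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_fanout_match(title: str, sub_queries: list[str]) -> tuple[str, float]:
    """Return the sub-query most similar to the title, with its score."""
    title_vec = bag_of_words(title)
    scored = [(q, cosine_similarity(title_vec, bag_of_words(q))) for q in sub_queries]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical prompt and fanout sub-queries, invented for illustration.
prompt = "how do I improve my site's SEO"
sub_queries = [
    "how to write meta descriptions",
    "how to improve page speed",
]
title = "How to Write Meta Descriptions That Get Clicks"

prompt_score = cosine_similarity(bag_of_words(title), bag_of_words(prompt))
best_query, best_score = best_fanout_match(title, sub_queries)
print(f"prompt vs. title:         {prompt_score:.3f}")
print(f"best sub-query vs. title: {best_score:.3f} ({best_query!r})")
```

In this toy case the title shares only one word with the broad prompt but most of its words with one sub-query, which mirrors the pattern the study describes: a specific title scores higher against a narrow sub-question than against the original prompt. Swapping `bag_of_words` for a real embedding model would give scores closer in spirit to the study's.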
4. Page Age and Authority Signals
The study reveals a strong preference for established content. Cited pages in the search category have a median age of approximately 500 days (~1.4 years), with cited pages observed as old as 2,700+ days (7.4 years). Non-cited pages tend to be significantly younger.
This contrasts with earlier research: a previous Ahrefs study from July 2025 found a median cited page age of 958 days. The shift toward newer-but-still-established pages may reflect ChatGPT's evolving retrieval calibration, but the core pattern holds — fresh-off-the-press content rarely gets cited. This age-bias compounds the broader citation dynamics we analyzed in an earlier 815K-page study.
For news content specifically, the pattern inverts slightly: cited news pages have a median age of ~200 days while non-cited news pages skew older at ~300 days, suggesting recency matters more within the news category where timeliness is intrinsic to value.
What This Tells Us About AI Authority Signals
ChatGPT appears to use page age as a proxy for content that has been validated over time — pages that have accumulated backlinks, user engagement, and indexing history. For evergreen content strategies targeting AI citations, the implication is clear: building authoritative, lasting content pays compounding dividends in the AI era, just as it does in traditional SEO. If you're working on optimizing for AI citation likelihood, our LLM Citation Checker can help score your content across ChatGPT, Perplexity, and Gemini.
5. Actionable SEO Implications for AI Citation Optimization
The Ahrefs study translates into a concrete optimization checklist for sites that want to be cited by AI systems:
Optimize titles for sub-questions, not just topics. Think about the specific fanout queries a user's broad question might decompose into. Structure content to answer these narrow, specific sub-questions explicitly — and reflect that specificity in your <title> tags and H1s.
Use descriptive, semantic URL slugs. The 8.67 percentage point gap is significant. Avoid numeric IDs, hash-based paths, or parameter-heavy URLs. Use human-readable slugs that describe the content.
Prioritize web search retrieval over social/platform channels. The 88.46% vs. 1.93% gap between search and Reddit means that being retrievable via standard web search is overwhelmingly more valuable for citation purposes than appearing through platform-specific channels.
Build content with longevity in mind. Pages aged 1-2 years outperform both brand-new and extremely old content. Create evergreen resources designed to accumulate authority over time, then refresh them periodically rather than publishing net-new pages.
Include structured metadata — but know its limits. Having snippets available correlated with higher citation rates (2.52% of cited search pages had snippets vs. 0.09% of non-cited), but the effect is secondary to title relevance and URL quality. Our AI Overview Optimizer can help assess your content's readiness for AI-driven search features.
6. IAB 2025 Report: Social Media Overtakes Search in Ad Revenue
The Interactive Advertising Bureau's annual report reveals a watershed moment: social media advertising ($117 billion) has overtaken search advertising ($114 billion) as the largest digital ad category in the United States.
The numbers tell a story of momentum shifting: search ad growth decelerated from 15% in 2024 to 11% in 2025, while social surged 32% (a $29 billion year-over-year increase) and digital video accelerated from 19% to 25% growth. This revenue rebalancing adds financial pressure to the AI-generated content arms race already straining organic search quality.
| Category | 2025 Revenue | YoY Growth |
|---|---|---|
| Social Media | $117 billion | +32% |
| Search | $114 billion | +11% (down from 15%) |
| Digital Video | $78 billion | +25% (up from 19%) |
| Commerce Media | $63 billion | +18% |
| Creator Economy | $37 billion | Not stated (projected $44B in 2026) |
| Programmatic (total) | $162 billion | +20% |
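The growth rates in the table also imply what each category earned a year earlier. The sketch below backs out those figures; it works from the rounded numbers in the table, so the results are approximate, and the helper name `implied_prior_year` is made up for this example.

```python
# 2025 revenue (billions of USD) and year-over-year growth, from the table.
CATEGORIES = {
    "Social Media": (117, 0.32),
    "Search": (114, 0.11),
    "Digital Video": (78, 0.25),
}


def implied_prior_year(revenue: float, growth: float) -> float:
    """Back out last year's revenue from this year's revenue and growth rate."""
    return revenue / (1 + growth)


for name, (revenue, growth) in CATEGORIES.items():
    prior = implied_prior_year(revenue, growth)
    print(f"{name}: 2024 ~ ${prior:.1f}B, added ~ ${revenue - prior:.1f}B")
```

On these rounded inputs, social media's implied 2024 revenue is roughly $89B against roughly $103B for search, so the crossover happened within a single year. The implied social increase of about $28B is close to the $29 billion cited above, with the gap attributable to rounding.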
What This Means for the SEO Industry
Search is not declining — $114 billion is a massive market growing at double digits. But the growth momentum has clearly shifted. The combination of AI-driven search fragmentation (fewer traditional clicks), social commerce maturation (TikTok Shop, Instagram Checkout), and the explosion of retail media networks is redirecting marginal ad dollars away from search. This ad revenue shift compounds the CTR collapse from AI Overviews we covered recently — organic visibility is being squeezed from multiple directions simultaneously.
For SEO practitioners, this reinforces the need to think beyond traditional organic search. AI citation optimization (as the Ahrefs study illustrates), social search optimization, and video SEO represent growth vectors where organic visibility is expanding rather than contracting.
7. Chrome AI Mode Gets Side-by-Side Browsing
Google announced on April 16 that Chrome's AI Mode on desktop now supports side-by-side browsing — clicking a link in AI Mode opens the webpage in a side panel rather than navigating away from the AI interface. The update, announced by Google Search VP Robby Stein and Chrome VP Mike Torres, is currently available in the U.S. with international rollout to follow.
Additional features include a new "plus menu" on Chrome's New Tab page and within AI Mode, allowing users to attach open browser tabs, images, and PDFs as context for their AI searches. Users can now combine multiple sources in a single AI Mode query and access canvas and image creation tools directly.
Frequently Asked Questions
What percentage of Reddit pages does ChatGPT actually cite?
According to Ahrefs' study of 1.4 million ChatGPT 5.2 prompts, Reddit pages are cited only 1.93% of the time they are retrieved. This is despite Reddit comprising 67.8% of all non-cited URLs in ChatGPT's retrieval pool, meaning the AI heavily uses Reddit content for context but almost never attributes it.
What is the overall citation rate for pages retrieved by ChatGPT?
The overall citation rate across all source types is approximately 49.98%. However, this varies dramatically by source type: standard web search results are cited 88.46% of the time, news content 12.01%, Reddit 1.93%, YouTube 0.51%, and academic sources only 0.40%.
Do URL structures affect whether ChatGPT cites a page?
Yes. Pages with natural language, descriptive URL slugs achieve an 89.78% citation rate compared to 81.11% for pages with opaque or non-semantic URLs, an 8.67 percentage point advantage. The study shows a correlation rather than a confirmed mechanism, but readable URLs appear to act as an additional relevance signal when ChatGPT selects citations.
How much did search advertising grow in 2025 according to the IAB?
Search advertising revenue reached $114 billion in 2025, growing 11% year-over-year — down from 15% growth in 2024. Meanwhile, social media advertising surged 32% to $117 billion, overtaking search as the largest digital ad category for the first time.
What are ChatGPT's fanout queries and why do they matter for SEO?
Fanout queries are internal sub-questions that ChatGPT generates from a user's original prompt. Pages whose titles closely match these sub-queries (cosine similarity 0.656) are significantly more likely to be cited than pages matching only the broad original prompt (0.602). Optimizing for specific, granular questions increases your chances of being cited by AI.
How does page age affect ChatGPT citation probability?
Cited pages tend to be older and more established. The median age of cited search pages is approximately 500 days (about 1.4 years), with cited pages observed up to 2,700+ days old (7.4 years). ChatGPT favors pages with established authority and indexing history over newer content.
What is the total size of the digital advertising market in 2025?
According to the IAB's annual report, total U.S. digital advertising revenue reached $294 billion in 2025, a 13% increase year-over-year. The market is now led by social media ($117B), followed by search ($114B), digital video ($78B), commerce media ($63B), and creator advertising ($37B, projected to reach $44B in 2026).
