Tier 1 vs Tier 2 Query Citation Gap: Experimental Evidence

By Artur Ferreira · The GEO Lab · Published: 21 April 2026 · Version 1.0

25 Tier 2 checks. Zero citations. 25 Tier 1 checks. 20 citations. Same site, same content, five consecutive days. The gap didn’t close once.

TL;DR

E016 ran 30 citation checks per day for five days with no content changes. Perplexity cited thegeolab.net on exactly 4 of the 5 Tier 1 queries every single day: an 80% Tier 1 citation rate (40% of the full 10-query set), with zero variance. On all 5 Tier 2 queries across all 5 days, the citation rate was 0%, again with zero variance. The split is not noise. It is the domain authority gate operating as expected, and the first published measurement of the gap between proprietary concept queries and category queries on the same domain.

Key GEO Takeaway

Five days, fifty Perplexity checks, one result that held without exception: Tier 1 proprietary queries produced an 80% citation rate; Tier 2 category queries produced zero. The gap is not a content quality problem. It is a domain authority gate, and it does not close until the site has the authority to compete at the category level.

The Tier 1 vs Tier 2 Citation Gap

The GEO Stack framework distinguishes two types of query based on who else is competing for the answer.

The Tier 1/Tier 2 distinction is built into the GEO Stack measurement framework — Tier 1 queries test the site’s authority for its own branded concepts; Tier 2 queries test competitive category rankings where domain authority is the gating factor.

Tier 1 queries are proprietary concept queries — questions about terms, frameworks, or methodologies that a specific site has coined or is uniquely associated with. Nobody was writing about the GEO Stack, Retrieval Probability, Extractability, or System Memory before The GEO Lab defined them. When an AI platform retrieves an answer to “What is the GEO Stack?”, there is one authoritative source. Domain authority competition is low to zero.

Tier 2 queries are category or commercial queries — questions about a topic class that hundreds of sites address. “Best generative engine optimisation tools” or “how to optimise content for AI search” are questions where Semrush, Search Engine Journal, HubSpot, and a hundred other high-authority domains have published comprehensive answers. An AI platform retrieving content for these queries has no shortage of high-authority options to choose from.

The Tier 1/Tier 2 model predicts that citation rate will be high for proprietary queries and near-zero for category queries — not because the content is better or worse, but because the competition for retrieval is structurally different.

E016 is the first experiment that measures this prediction directly, with a fixed query set and a controlled baseline.

The E016 Experiment Design

E016 was designed to establish the noise floor for AI citation measurement — the baseline citation rate variance that exists independent of any content change. Without this baseline, no content variable experiment is interpretable: any observed change in citation rate could be platform noise rather than a content effect.

The probe query set used in E016 — ten types across both tiers — is documented in the ten probe query types post, which explains why the Tier 1/Tier 2 split was built into the query taxonomy from the start.

The protocol: 10 queries checked once per day for five consecutive days, with no changes made to the site during the measurement window. Same queries, same time each day, same API calls. No new posts, no schema edits, no structural changes.

E016 — Protocol summary

E016:
  hypothesis: "Platform citation variance without content changes is measurable and bounded"
  query_set: 10 queries (5 Tier 1 proprietary, 5 Tier 2 category)
  platforms: Perplexity sonar-pro, ChatGPT gpt-4o (force_web_search=true), Google AIO
  checks_per_day: 30 (10 per platform)
  duration: 5 consecutive days (2026-04-13 to 2026-04-17)
  content_changes: none
  purpose: noise floor measurement, not a content variable test

The query set was split evenly: five Tier 1 queries targeting concepts coined at The GEO Lab (the GEO Stack, Retrieval Probability, Extractability, System Memory, LLM readability), and five Tier 2 queries targeting competitive category terms (generative engine optimisation, AI search optimisation tools, how to get cited in AI Overviews, GEO vs SEO, AI search ranking factors).

This split was deliberate. The hypothesis entering E016 was that the two query types would behave differently — but the magnitude of the gap, and whether it would hold stable across five days, was unknown. That is what the experiment measured.
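The protocol above can be sketched as a fixed grid of checks. This is a minimal illustration, not the tooling used for E016: the `Check` record, the `build_schedule` helper, and the platform keys are hypothetical names, and the strings below are the topic labels listed in this post rather than the exact query wording.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from itertools import product

# Topic labels from the post; the exact query strings are an assumption.
TIER_1 = ["GEO Stack", "Retrieval Probability", "Extractability",
          "System Memory", "LLM readability"]
TIER_2 = ["generative engine optimisation", "AI search optimisation tools",
          "how to get cited in AI Overviews", "GEO vs SEO",
          "AI search ranking factors"]
PLATFORMS = ["perplexity", "chatgpt", "google_aio"]  # hypothetical keys
START = date(2026, 4, 13)
DAYS = 5


@dataclass(frozen=True)
class Check:
    """One citation check: a query asked of one platform on one day."""
    day: date
    platform: str
    tier: int
    query: str


def build_schedule():
    """Return every (day, platform, query) combination in the protocol."""
    queries = [(1, q) for q in TIER_1] + [(2, q) for q in TIER_2]
    days = [START + timedelta(days=i) for i in range(DAYS)]
    return [Check(d, p, t, q) for d, p, (t, q) in product(days, PLATFORMS, queries)]


schedule = build_schedule()
checks_on_day_one = [c for c in schedule if c.day == START]
print(len(checks_on_day_one), len(schedule))
```

Fixing the grid up front means each day's run covers the same 30 checks, which is what makes the day-to-day rates comparable.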

The Five-Day Citation Gap Data

Metric	Value	Basis
Perplexity Tier 1 citation rate (identical on all 5 days)	80% of Tier 1 queries (40% of the full 10-query set)	E016 · 25 checks · 0% variance
Perplexity Tier 2 citation rate (identical on all 5 days)	0%	E016 · 25 checks · 0% variance
Combined citation rate range (all platforms, all queries)	13.3–20%	E016 · 150 checks · noise floor
Tier 1 vs Tier 2 citation gap on Perplexity (held for 5 consecutive days)	+80pp	E016 · confirmed signal, not noise

Per-day breakdown, Perplexity citations out of 10 queries (5 T1 + 5 T2):

Day	Date	T1 citations / 5	T2 citations / 5	Total / 10
Day 1	2026-04-13	4 / 5 (80%)	0 / 5 (0%)	4 / 10
Day 2	2026-04-14	4 / 5 (80%)	0 / 5 (0%)	4 / 10
Day 3	2026-04-15	4 / 5 (80%)	0 / 5 (0%)	4 / 10
Day 4	2026-04-16	4 / 5 (80%)	0 / 5 (0%)	4 / 10
Day 5	2026-04-17	4 / 5 (80%)	0 / 5 (0%)	4 / 10

Four Tier 1 queries cited on every day: GEO Stack, Extractability, System Memory, Retrieval Probability. One Tier 1 query never cited across all five days: LLM readability — the one T1 term with significant semantic overlap with generic readability queries, where competing high-authority content exists. That single data point is itself informative: even within the Tier 1 category, queries that bleed into competitive territory behave like Tier 2.

Tier 2 performance: five queries, five days, 25 checks — zero citations. Not one Perplexity response for a category query included a thegeolab.net link.
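As a quick arithmetic check, the per-tier rates can be recomputed from the per-day Perplexity counts in the table. This is only a sketch: the variable names are illustrative, and the numbers are copied from the table rather than pulled from any API.

```python
# (date, Tier 1 citations, Tier 2 citations), each out of 5 queries.
PERPLEXITY_DAILY = [
    ("2026-04-13", 4, 0),
    ("2026-04-14", 4, 0),
    ("2026-04-15", 4, 0),
    ("2026-04-16", 4, 0),
    ("2026-04-17", 4, 0),
]
QUERIES_PER_TIER = 5

t1_cited = sum(t1 for _, t1, _ in PERPLEXITY_DAILY)
t2_cited = sum(t2 for _, _, t2 in PERPLEXITY_DAILY)
checks_per_tier = QUERIES_PER_TIER * len(PERPLEXITY_DAILY)  # 25 per tier

t1_rate = t1_cited / checks_per_tier                        # Tier 1 only
t2_rate = t2_cited / checks_per_tier                        # Tier 2 only
overall_rate = (t1_cited + t2_cited) / (2 * checks_per_tier)  # all 10 queries
gap_pp = (t1_rate - t2_rate) * 100                          # percentage points

print(f"T1 {t1_rate:.0%}, T2 {t2_rate:.0%}, overall {overall_rate:.0%}, gap {gap_pp:.0f}pp")
```

The 40% figure is the rate across all ten Perplexity queries, while the Tier 1 rate on its own is 80% and the Tier 1 to Tier 2 gap is 80 percentage points.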

The Domain Authority Gate

The Tier 2 zero result is not a content quality failure. The GEO Lab’s posts on generative engine optimisation are detailed, structured, and compliant with every extractability and entity clarity principle the GEO Stack identifies. They do not get cited in category queries because of a domain authority gate — not because they are inadequate.

Setting up a controlled domain authority experiment requires a clean baseline — the Month 0 baseline protocol explains how to run a five-day noise floor measurement that won’t confound the experiment results.

AI retrieval platforms, when constructing responses to competitive category queries, draw from a candidate pool of high-authority pages. The threshold for entry into this pool is not defined in any public documentation — but the behaviour is observable. Semrush, Moz, Search Engine Journal, and similar properties that have years of backlink equity and established domain authority appear in AI responses to GEO category queries. A site that launched in March 2026 does not, regardless of content quality.

ChatGPT’s behaviour in E016 points the same way. Across all five days, ChatGPT cited thegeolab.net on zero Tier 2 queries. On Tier 1 queries, where the domain is the only authoritative source, it produced 0 citations on four of five days and 2 citations on Day 2 (a known non-determinism event). This suggests that ChatGPT’s web search component applies a stricter authority threshold than Perplexity’s retrieval mechanism. On a domain at this stage of authority development, even Tier 1 queries do not reliably produce ChatGPT citations.

The authority gate is platform-specific and tiered by query competitiveness.

Perplexity crosses the gate for T1 queries at this domain authority level. ChatGPT does not, except under non-deterministic conditions. Google AIO does not for any query type at this stage. The same content, cited or not, depending entirely on which platform is asked.

The Noise Floor and Interpretability Threshold

The primary output of E016 is the noise floor — the range within which citation rate can vary without any content change having occurred. This number is the prerequisite for every subsequent experiment: without it, no delta is interpretable.

Platform	Noise floor (min–max)	Interpretability threshold
Perplexity (standalone)	40%–40% (flat — zero variance)	Move to 30% or 50% required for signal
ChatGPT (standalone)	0%–20% (Day 2 outlier)	Too wide — secondary confirmation only
Google AIO (standalone)	0%–0% (assumed)	No signal baseline established yet
Combined / 30	13.3%–20.0%	Must reach 23%+ to claim signal

The ChatGPT noise floor being 0–20% is not a problem with the experiment design — it is a measurement of the platform’s non-determinism at this domain authority level. On four of five days, ChatGPT returned zero citations. On Day 2, it returned 2. The cause was identified: the force_web_search=true parameter in the API call fires different source retrieval arrays on different calls, and on Day 2 two of those arrays happened to include thegeolab.net. That is not a content signal — it is retrieval randomness. Treating it as a signal would produce false conclusions.
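The noise floor and thresholds in the table can be reproduced from the daily citation counts reported in this post. The sketch below assumes the threshold is simply one citation beyond the observed range, which matches the table's figures; the function names are illustrative, and the Google AIO zeros are the assumed values noted in the table.

```python
# Daily citation counts out of 10 checks per platform (Day 1 to Day 5).
DAILY_CITATIONS = {
    "perplexity": [4, 4, 4, 4, 4],
    "chatgpt": [0, 2, 0, 0, 0],      # Day 2 is the non-determinism event
    "google_aio": [0, 0, 0, 0, 0],   # assumed, as in the table
}
CHECKS_PER_PLATFORM = 10


def noise_floor(counts, n_checks):
    """Return the (min, max) citation rate observed with no content change."""
    rates = [c / n_checks for c in counts]
    return min(rates), max(rates)


def thresholds(counts, n_checks):
    """Return the nearest rates outside the observed range: one citation
    fewer than the minimum and one more than the maximum (None if impossible)."""
    lower = (min(counts) - 1) / n_checks if min(counts) > 0 else None
    upper = (max(counts) + 1) / n_checks if max(counts) < n_checks else None
    return lower, upper


# Combined counts per day across all three platforms (30 checks per day).
combined = [sum(day) for day in zip(*DAILY_CITATIONS.values())]
combined_checks = CHECKS_PER_PLATFORM * len(DAILY_CITATIONS)

lo, hi = noise_floor(combined, combined_checks)
_, combined_upper = thresholds(combined, combined_checks)
print(f"combined floor {lo:.1%} to {hi:.1%}, signal at {combined_upper:.1%}")
print(thresholds(DAILY_CITATIONS["perplexity"], CHECKS_PER_PLATFORM))
```

With 30 checks per day, the first achievable rate above the 20.0% maximum is 7 of 30, or 23.3%, which is where the "23%+" threshold comes from. For Perplexity alone, the nearest rates outside the flat 40% are 30% and 50%.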

Perplexity’s zero-variance flat line is the most useful number E016 produced. A platform that returns exactly 40% on the same 10-query set five days in a row, with no content changes, is a reliable measurement instrument. Future experiments using Perplexity as the primary platform can work from an observed noise floor of ±0%: with ten queries the smallest possible move is one query, or 10 percentage points, so any change from 40% falls outside the variance seen in this five-day window.

What This Means for Content Strategy

The Tier 1/Tier 2 split is not an argument against publishing category content. It is an argument against expecting AI citations from category content before the authority gate is passed.

Reading the traffic signals that accompany citation rate movement — and distinguishing them from crawler hits — is covered in the AI referral traffic measurement post.

The strategic implication is sequencing. For a new domain, the path to AI citation runs through Tier 1 first — proprietary frameworks, original research, coined terminology, documented experiments. These are the queries where citation is achievable independent of domain authority, because the domain is the only source. Building citation history on Tier 1 queries while developing the domain authority signals required to compete on Tier 2 queries is the logical sequence.

There is a secondary effect worth noting. Tier 1 citations, over time, contribute to the entity recognition signals that affect Tier 2 retrieval. A domain that has been consistently cited by Perplexity for its proprietary concepts develops an entity association that is different from a domain with zero citation history. Whether this entity association meaningfully lowers the authority gate for Tier 2 queries is a hypothesis, one that the E014 longitudinal baseline and subsequent experiments will begin to test.

What E016 establishes is the baseline: the gap exists, it is 80 percentage points wide on Perplexity (80% on Tier 1 against 0% on Tier 2), and it held flat for five days without a single exception. The gap is the starting position. Closing it is a domain authority problem, not a content problem.

What Comes Next

E016 is complete. The noise floor is established. The freeze on content variable experiments has been lifted as of 18 April 2026.

E003 — the heading format experiment, testing question-form H2s against declarative H2s — is the next scheduled run. The hypothesis: heading format affects the retrieval probability of individual content sections, measurable as a change in per-section citation rate on Tier 1 queries. E016’s flat Perplexity baseline makes this testable: any move from the 40% floor is interpretable as a heading format effect rather than noise.
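One way to apply the baseline when reading a later result is to compare the observed citation count with the range of counts seen during E016. The sketch below is an assumption about how that comparison might be coded, not the analysis planned for E003 (which the post describes at per-section level); `classify` and its labels are hypothetical.

```python
def classify(observed_cited, floor_min_cited, floor_max_cited):
    """Compare an observed citation count with the no-change baseline range.

    All arguments are citation counts over the same number of checks, so the
    comparison uses integers and avoids floating-point edge cases.
    """
    if observed_cited > floor_max_cited:
        return "above floor"
    if observed_cited < floor_min_cited:
        return "below floor"
    return "within noise"


# Baseline ranges from E016, as counts.
PERPLEXITY_FLOOR = (4, 4)   # 4 of 10 on every day
COMBINED_FLOOR = (4, 6)     # 4 to 6 of 30 across all platforms

for cited in (3, 4, 5):
    print("perplexity", cited, classify(cited, *PERPLEXITY_FLOOR))
for cited in (6, 7):
    print("combined", cited, classify(cited, *COMBINED_FLOOR))
```

A result labelled "above floor" or "below floor" only says the count left the range observed in this five-day window. It does not by itself attribute the change to a particular content edit.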

The longer-term question — whether Tier 2 citation rate changes as domain authority develops — requires a longitudinal measurement track rather than a single experiment. That is the purpose of E014, the monthly citation rate baseline, which will record Tier 1 and Tier 2 citation rates separately every 30 days as the domain ages.

E016 — Final result

E016:
  result: COMPLETE
  noise_floor_combined: 13.3%–20.0% (Perplexity + ChatGPT + AIO / 30 queries)
  noise_floor_perplexity: 40.0%–40.0% (flat, zero variance across 5 days)
  tier_1_citation_rate: 80% (4/5 T1 queries cited on every day)
  tier_2_citation_rate: 0% (0/5 T2 queries cited on any day)
  interpretability_threshold: 23%+ combined / 30 for future experiments
  primary_signal_platform: Perplexity (zero variance, reliable instrument)
  secondary_platform: ChatGPT (0–20% range, non-deterministic at this authority level)
  freeze_lifted: 2026-04-18
  next_experiment: E003 (heading format, question vs declarative H2)

Frequently Asked Questions

What is the difference between a Tier 1 and Tier 2 query in GEO?

Tier 1 queries are proprietary concept queries: questions about terms, frameworks, or methodologies that a specific site has coined or is uniquely associated with. Tier 2 queries are category or commercial queries that many competing sites address. In E016, Tier 1 queries produced an 80% Perplexity citation rate (4 of 5 queries) on each of the five days. Tier 2 queries produced 0% on all five days. Same content, same site, same platform; the query type was the only variable.

Why do Tier 2 queries return zero citations for new domains?

AI retrieval platforms apply a domain authority gate to competitive category queries. When multiple high-authority sites address the same query, the platform selects from the strongest signals in its index. A new or low-authority domain does not meet this threshold regardless of content quality. This is not a content problem — it is a domain positioning problem. Tier 2 citation rate cannot be moved by content changes alone until the authority gate is passed.

What is the noise floor established by E016?

The combined citation rate across all 30 daily checks ranged 13.3%–20.0% over five days with no content changes. Perplexity’s standalone rate was flat at 40.0% on all five days — zero variance. ChatGPT ranged 0–20% due to a single Day 2 non-determinism event. The interpretability threshold for future experiments is 23%+ combined — any result at 22% or below is within noise.

How does the Tier 1/Tier 2 split affect content strategy?

New and low-authority domains should publish Tier 1 content first — original frameworks, coined terminology, documented methodologies — because this is the only query category where AI citation is achievable before authority thresholds are met. Tier 2 content will not produce citations until the domain has established competitive authority signals. The sequencing matters: Tier 1 citations build entity recognition that eventually supports Tier 2 retrieval.

Version History

Version 1.0 — 21 April 2026: Initial publication. E016 five-day noise floor data, Tier 1 vs Tier 2 citation gap, interpretability threshold for future experiments.

About the author: The GEO Lab founder Artur Ferreira has 20+ years of experience in SEO and organic growth strategy. He developed the GEO Stack framework and leads research into Generative Engine Optimisation methodologies.