GEO Experiments: Design, Measure & Learn — 2026 Edition
GEO Experiments is a free guide to running structured, evidence-based tests on AI citation — covering the five-phase GEO Experiment Loop, hypothesis writing, control and treatment setup, proxy metric selection, analysis methodology, and ten ready-to-run experiment templates across all five GEO Stack layers. It is Book #3 in the GEO Lab Library, built on the same experimental methodology used in The GEO Lab’s published research.
Most GEO advice is based on what should work. GEO Experiments is about what does work — on your specific site, for your specific queries, measured against real citation data. The scientific method applied to AI search visibility: one variable, one change, one observation window, one conclusion.
What’s Inside GEO Experiments
Why GEO Needs Experiments
The case for evidence over intuition in AI citation optimisation — and why single-variable testing produces insights that general advice cannot.
Live Data: 330-Query Citation Test Results
Updated March 2026 — Real results from The GEO Lab Console running 330 queries across ChatGPT, Gemini, and Perplexity.
Data from GEO Lab Console — AI Visibility OS | Next measurement: March 26, 2026
Chapter 1 — The GEO Experiment Loop
The five-phase framework: Hypothesise → Design → Execute → Measure → Learn. How each phase feeds the next, and why the loop never ends.
Chapter 2 — Writing a GEO Hypothesis
The GEO Hypothesis Template: IF / ON / THEN / BECAUSE / MEASURED BY / OVER. How to write a testable, specific prediction mapped to a single GEO Stack layer.
Chapter 3 — Control and Treatment Setup
How to isolate variables correctly. What counts as a control condition. How to avoid contaminating results with multiple simultaneous changes.
Chapter 4 — Proxy Metrics for GEO
What to measure when direct citation attribution is hard: citation rate sampling, brand mention tracking, query testing protocols across ChatGPT, Perplexity, and Gemini.
Chapter 5 — Running the Experiment
The practical execution checklist: making the change, observing the correct window (7–21 days for AI crawl cycles), and documenting without bias.
Chapter 6 — Analysis and Interpretation
How to read GEO experiment results. What counts as a meaningful change. Common interpretation errors and how to avoid them.
Chapter 7 — Reporting and Documentation
How to write an experiment report that builds a reusable body of evidence — and how to share findings with the GEO community.
10 GEO Experiment Templates
Ready-to-run experiments across all five GEO Stack layers: opening sentence structure, FAQ block addition, author schema implementation, heading format changes, and more. Each template includes hypothesis, variable, observation window, and measurement method.
Get All 10 Experiment Templates
Each template includes: hypothesis, control/treatment setup, observation window, measurement method, and analysis framework. Ready to copy and run.
Download the GEO Experiments Ebook (Free PDF)
No email required. No signup. Direct download.
Recommended Experiment Sequence
The optimal order for running GEO experiments — starting with highest-leverage changes on highest-traffic pages.
Common Pitfalls and Advanced Techniques
The most common GEO experiment mistakes: testing too many variables, observing too short a window, and misattributing results. Advanced techniques for multi-page and longitudinal experiments.
Case Studies from The GEO Lab
Real experiments run by The GEO Lab — with hypotheses, methodology, results, and conclusions documented.
Frequently Asked Questions
How do you measure AI citation rate?
AI citation rate is measured by defining a set of target queries, submitting them to AI engines (ChatGPT, Perplexity, Gemini, and Copilot), and recording how often your content is cited in the responses. Citation rate is calculated as citations received divided by total checks (queries × engines tested), expressed as a percentage. GEO Experiments provides a step-by-step query testing protocol and tracking template.
What is a GEO experiment?
A GEO experiment is a structured test that changes one variable on one or more pages, observes the effect on AI citation rate over a defined window (typically 7–21 days), and draws a conclusion about whether the change increased, decreased, or had no effect on citation probability. The key principle is single-variable discipline — changing only one thing at a time so results are attributable.
What is the GEO Experiment Loop?
The GEO Experiment Loop is a five-phase cycle: Hypothesise (form a specific prediction), Design (set up control and treatment conditions), Execute (make the change and wait), Measure (collect citation data), and Learn and Act (analyse results and feed them into the next hypothesis). The loop is continuous — every experiment generates new questions.
How long does a GEO experiment take?
A typical GEO experiment requires a 7–21 day observation window after making the change, to allow AI crawl cycles to process the updated content. Faster-crawling platforms like Perplexity may show results within a week; training-data-dependent platforms like ChatGPT may require longer observation periods.
What GEO experiments should I run first?
The recommended sequence starts with the highest-leverage changes on your highest-traffic pages: (1) opening sentence structure — rewriting to place a direct answer first, (2) FAQ block addition with FAQ schema, (3) heading format changes to match natural query phrasing. These consistently produce the largest measurable citation impact, and each has a ready-to-run experiment template in this guide.
Continue in the GEO Lab Library
- Apply the results: The GEO Workbook — 30 daily tasks that put experimental insights into systematic action.
- Advanced testing: GEO Authority Playbook — competitive citation intelligence and measurement at scale.
- Browse all: thegeolab.net/ebooks
The only way to know what works is to test systematically, measure with proxy metrics,
and build a body of evidence. This guide shows you exactly how.
What you’ll learn: Hypothesis design for GEO · Control and treatment setup · Proxy metrics for AI citation · Statistical approaches for small samples · Analysis templates · 10 ready-to-run experiment designs
Why GEO Needs Experiments
Part of The GEO Lab Library · thegeolab.net
Traditional SEO has decades of tooling. You can track rankings, impressions, clicks, and conversions for every keyword. GEO has none of that — yet. AI engines don’t report who they cited, when, or why. There’s no “AI Search Console” that shows your citation rate.
This creates a dangerous temptation: to guess. To follow advice without testing it. To assume that what worked for one site will work for yours.
The alternative is to experiment. Run structured tests. Change one variable at a time. Measure with the best proxies available. Build a body of evidence about what gets your content cited — and what doesn’t.
“In GEO, the practitioner who tests
will always outperform the one who follows.”
The GEO Measurement Problem
❌ What GEO Doesn’t Have
- No “AI impressions” metric
- No citation click-through rate
- No official API for citation tracking
- No attribution model for AI traffic
- No equivalent of Google Search Console
✔ What We Can Measure
- Manual citation checks across AI engines
- Structural proxies (extractability scoring)
- Entity signal density and consistency
- Before/after comparisons on changed pages
- Cross-platform citation variance
Who This Guide Is For
This guide is for practitioners who have already read The GEO Pocket Guide or The GEO Field Manual and understand the GEO Stack. It’s for people who want to move beyond theory into evidence — who want to know not just what to change, but whether the change actually worked.
The GEO Experiment Loop
Every GEO experiment follows the same five-phase loop. This is the core methodology used in The GEO Lab’s published experiments and the structure you’ll use for every test in this guide.
1. Hypothesise: Form a specific, testable prediction about what will change AI citation behaviour. Map it to a GEO Stack layer.
2. Design: Set up control and treatment conditions. Define what changes, what stays the same, and how you’ll measure the difference.
3. Execute: Make the change. Publish. Wait the appropriate observation window (typically 7–21 days for AI crawl cycles).
4. Measure: Collect proxy metrics. Run citation checks across ChatGPT, Perplexity, and Gemini. Record results in your tracking template.
5. Learn: Analyse results. Document findings. Apply what works to other pages. Share your evidence with the GEO community.
Single-Variable Discipline
The most important rule in GEO experimentation: change one thing at a time. If you rewrite a page’s opening, add schema, update the author bio, and add FAQ blocks in one session, you’ll never know which change drove the result. Isolate your variable.
Mapping Experiments to the GEO Stack
Every experiment should target a specific layer of the GEO Stack. This keeps your testing focused and your results attributable.
| GEO Stack Layer | What You’re Testing | Example Variable |
|---|---|---|
| Retrieval Probability | Will AI find this content? | Robots.txt rules, sitemap inclusion |
| Extractability | Can AI extract a quotable passage? | Opening sentence structure, heading format |
| Entity Reinforcement | Does AI recognise the author/brand? | Author bio presence, schema types |
| Structural Authority | Does AI trust this content? | Backlinks, brand mentions, citations |
| System Memory | Does AI remember this over time? | Update frequency, freshness signals |
Writing a GEO Hypothesis
A good hypothesis is specific, testable, and tied to a single GEO Stack layer. A bad hypothesis is vague, unmeasurable, and tries to test everything at once.
The GEO Hypothesis Template
IF I change [specific variable]
ON [specific page or set of pages]
THEN [expected measurable outcome]
BECAUSE [reasoning tied to GEO Stack layer]
MEASURED BY [specific proxy metric]
OVER [observation window]
Good vs Bad Hypotheses
❌ Bad Hypothesis
“If I improve my content, AI will cite me more.”
Why it fails: “Improve” is vague. “Content” is unspecific. “More” isn’t measurable. No layer targeted.
✔ Good Hypothesis
“If I rewrite the first two sentences of my top 5 pages to provide direct answers, citation rate for those pages will increase from 10% to 30% within 14 days, because Extractability improves when answers are front-loaded.”
Why it works: Specific variable, specific pages, measurable target, clear reasoning, defined timeline.
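If you keep your experiments in a notebook or script, the template maps cleanly onto a small record. This is a minimal sketch rather than anything the guide prescribes; the class and field names are illustrative, and the example values simply restate the good hypothesis above.

```python
from dataclasses import dataclass

@dataclass
class GeoHypothesis:
    """One GEO hypothesis, one field per slot of the Chapter 2 template."""
    change: str        # IF I change [specific variable]
    pages: str         # ON [specific page or set of pages]
    expected: str      # THEN [expected measurable outcome]
    reasoning: str     # BECAUSE [reasoning tied to GEO Stack layer]
    metric: str        # MEASURED BY [specific proxy metric]
    window_days: int   # OVER [observation window]

# Example: the "good hypothesis" from this chapter, expressed as a record.
exp_001 = GeoHypothesis(
    change="Rewrite the first two sentences to provide direct answers",
    pages="Top 5 pages by traffic",
    expected="Citation rate for those pages rises from 10% to 30%",
    reasoning="Extractability improves when answers are front-loaded",
    metric="Manual citation rate (10 queries x 3 engines)",
    window_days=14,
)
```

Writing the hypothesis down in a fixed structure, whatever the format, is what keeps you from retrofitting an explanation after the results come in.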
10 Hypothesis Starters by GEO Stack Layer
| Layer | Hypothesis Starter |
|---|---|
| Retrieval | “If I submit my sitemap to Google and verify AI crawlers are not blocked…” |
| Retrieval | “If I add internal links from my homepage to my target pages…” |
| Extract. | “If I rewrite opening sentences to answer the H2 question directly…” |
| Extract. | “If I add a Key Takeaway section at the end of each page…” |
| Entity | “If I add Person schema with full credentials to my author profile…” |
| Entity | “If I make my author name consistent across site, LinkedIn, and Medium…” |
| Authority | “If I publish a guest post on an industry blog linking back to my guide…” |
| Authority | “If I earn brand mentions in 3 Reddit threads within my niche…” |
| Memory | “If I update my cornerstone guide weekly with fresh data…” |
| Memory | “If I add a visible ‘Last Updated’ date and modify the content monthly…” |
Control & Treatment Setup
GEO experiments don’t use traditional A/B testing — you can’t serve different content to AI crawlers. Instead, you use sequential testing (before/after on the same page) or matched-pair testing (similar pages, one changed, one not).
Method 1: Sequential Testing (Before/After)
Control: The page’s performance before the change (baseline citation checks).
Treatment: The same page after the change.
Observation window: 7–21 days minimum (AI crawl cycles vary).
1. Baseline: Run 10 citation checks across 3 AI engines. Record results. This is your “before” data.
2. Make the change: Change only the variable you’re testing. Don’t touch anything else on the page.
3. Wait: Allow time for AI engines to re-crawl and re-index. Don’t check too early — patience is data.
4. Re-check: Run the same 10 citation checks with the same queries. Record results. Compare to baseline (a tallying sketch follows below).
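A minimal sketch of that before/after tally, assuming each individual (query, engine) check is logged as cited or not cited; the numbers are illustrative, not published results.

```python
def citation_rate(checks: list[bool]) -> float:
    """Citation rate as a percentage: cited checks / total checks x 100."""
    return 100 * sum(checks) / len(checks)

# One True/False entry per (query, engine) check: 10 queries x 3 engines = 30 checks.
baseline_checks = [True] * 3 + [False] * 27      # 3/30 cited  -> 10.0%
post_change_checks = [True] * 7 + [False] * 23   # 7/30 cited  -> 23.3%

delta = citation_rate(post_change_checks) - citation_rate(baseline_checks)
print(f"Baseline {citation_rate(baseline_checks):.1f}% -> "
      f"post-change {citation_rate(post_change_checks):.1f}% "
      f"({delta:+.1f} percentage points)")
```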
Method 2: Matched-Pair Testing
Control: Page A — no changes. Treatment: Page B — change applied.
Comparison: Both pages measured simultaneously over the same window.
Method 3: Cross-Platform Variance Testing
Setup: The same set of queries run across ChatGPT, Perplexity, and Gemini over the same window, comparing which sources each engine cites.
What you’ll learn: Different engines weight different signals. Perplexity may favour recency. ChatGPT may favour structural clarity. Gemini may favour entity signals. Document the variance.
Choosing Your Method
| Scenario | Best Method | Why |
|---|---|---|
| Testing content rewrites | Sequential | Most pages don’t have matched pairs |
| Testing schema addition | Matched-pair | Schema can be added to one page, not the other |
| Testing what engines prefer | Cross-platform | Same content, different AI responses |
| Testing author signals | Sequential | Author changes apply site-wide |
Proxy Metrics for GEO
Since there’s no official “GEO analytics” dashboard, you need proxy metrics — measurable signals that correlate with AI citation behaviour. The GEO Lab uses five categories of proxy metrics.
The 5 GEO Proxy Metrics
Primary Metric
1. Manual Citation Rate
Search 10 queries across 3 engines. Count citations. Citation rate = citations ÷ total checks × 100. This is your north star metric.
Structural Metric
2. Extractability Score
Score each page section 0–5 on: direct answer opening, question heading, evidence present, self-contained passage, clear attribution.
Entity Metric
3. Entity Signal Density
Count entity signals: author name, schema types, brand mentions, cross-platform consistency, About page links. Score out of 10.
Context Metric
4. Citation Context Analysis
When you ARE cited, what was quoted? Which section? Which sentence? Track the pattern — it reveals what AI finds extractable.
Competitive Metric
5. Competitor Displacement Rate
Track which competitors are cited for your target queries. Over time, are you displacing them? Are new competitors appearing?
How to Calculate Citation Rate
queries_tested = 10
engines_checked = 3  # ChatGPT, Perplexity, Gemini
total_checks = queries_tested * engines_checked  # 30
times_cited = 6  # replace with your own citation count
citation_rate = (times_cited / total_checks) * 100  # 20.0
# Baseline: most sites start at 0–5%
# Good: 15–25% after 30 days of GEO work
# Excellent: 30%+ sustained over 90 days
Extractability Scoring Rubric
| Criterion | 0 Points | 1 Point |
|---|---|---|
| Direct answer in first 2 sentences | Buried or missing | Clear, quotable answer up front |
| Question-format heading (H2/H3) | Statement or keyword heading | Phrased as a question |
| Evidence or statistic present | Claims without support | Verifiable data cited |
| Self-contained passage | Requires context from other sections | Standalone — makes sense in isolation |
| Clear attribution possible | No author, no date, no source | Author, date, and/or source visible |
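The rubric above can also be applied programmatically when auditing many sections at once. A minimal sketch, assuming each criterion is recorded as a simple pass/fail judgement; the criterion keys are shortened paraphrases of the rubric rows.

```python
EXTRACTABILITY_CRITERIA = [
    "direct_answer_first_two_sentences",
    "question_format_heading",
    "evidence_or_statistic_present",
    "self_contained_passage",
    "clear_attribution",
]

def extractability_score(passes: dict[str, bool]) -> int:
    """Score a page section 0-5: one point per rubric criterion met."""
    return sum(1 for criterion in EXTRACTABILITY_CRITERIA if passes.get(criterion, False))

# Example: a section with a direct answer and a question heading, but no evidence.
section = {
    "direct_answer_first_two_sentences": True,
    "question_format_heading": True,
    "evidence_or_statistic_present": False,
    "self_contained_passage": True,
    "clear_attribution": False,
}
print(extractability_score(section))  # 3
```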
Running the Experiment
An experiment is only as good as its execution. This chapter covers the practical workflow: tools, timing, and data collection discipline.
The Experiment Execution Checklist
- Hypothesis written: using the template from Chapter 2
- Baseline recorded: 10 queries × 3 engines before changes
- Variable isolated: only one thing changing per test
- “Before” screenshot taken: visual record of current state
- Change made: only the treatment variable modified
- “After” screenshot taken: visual record of new state
- Observation window scheduled: calendar reminder for Day 7, 14, 21
- Post-change checks run: same 10 queries × 3 engines after window
Observation Windows by Experiment Type
| Type of Change | Min. Window | Recommended | Why |
|---|---|---|---|
| Content rewrite | 7 days | 14 days | AI needs to re-crawl and re-index |
| Schema addition | 7 days | 14 days | Structured data processed in crawl cycle |
| Author signal changes | 14 days | 21 days | Entity signals take longer to propagate |
| Off-page signals | 21 days | 30+ days | Backlinks and mentions need discovery time |
| Freshness/update signals | 7 days | 14 days | AI checks for recent modifications |
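Because premature measurement is a common failure mode, it helps to put the check dates in the calendar the moment you publish the change. A minimal sketch using the Day 7 / 14 / 21 reminders from the execution checklist; the change date is only an example.

```python
from datetime import date, timedelta

def check_dates(change_date: date, days: tuple[int, ...] = (7, 14, 21)) -> list[date]:
    """Return the calendar dates for post-change citation checks."""
    return [change_date + timedelta(days=d) for d in days]

for d in check_dates(date(2026, 3, 12)):
    print(d)  # 2026-03-19, 2026-03-26, 2026-04-02
```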
Tools You’ll Need
→ ChatGPT (free tier works)
→ Perplexity (free tier)
→ Google Gemini (free tier)
→ A spreadsheet for tracking
→ Google Rich Results Test
→ Google PageSpeed Insights
→ Google Search Console
→ Schema.org validator
Analysis & Interpretation
GEO experiments produce small datasets. You won’t have thousands of data points — you’ll have 30 citation checks. The goal isn’t statistical significance in the academic sense. The goal is directional evidence that informs your next action.
The 4-Question Analysis Framework
1. Did the citation rate change? Compare baseline vs post-change. Any change ≥10 percentage points is worth noting.
2. Was the change consistent across engines? Did all 3 engines respond similarly? Cross-engine consistency strengthens evidence (see the per-engine sketch below).
3. What exactly was cited? When you gained citations, which specific passage was quoted? Was it from the section you changed?
4. Does it replicate? Apply the same change to a second page. If the pattern holds, you have a genuine finding.
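For question 2, the same check log can be split by engine to see whether the effect is consistent. A minimal sketch, assuming each check records the engine name and whether the page was cited; the counts here are illustrative.

```python
from collections import defaultdict

# Each post-change check: (engine, cited?)
checks = [("ChatGPT", True), ("ChatGPT", False), ("Perplexity", True),
          ("Perplexity", True), ("Gemini", False), ("Gemini", True)]

totals: dict[str, list[int]] = defaultdict(lambda: [0, 0])  # engine -> [cited, total]
for engine, cited in checks:
    totals[engine][0] += int(cited)
    totals[engine][1] += 1

for engine, (cited, total) in totals.items():
    print(f"{engine}: {cited}/{total} cited ({100 * cited / total:.0f}%)")
```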
Interpreting Small Samples
| Change Observed | Interpretation | Confidence | Next Step |
|---|---|---|---|
| +0% (no change) | No detectable effect | — | Wait longer or test different variable |
| +3–7% (1–2 citations) | Weak signal — may be noise | Low | Replicate on second page |
| +10–20% (3–6 citations) | Meaningful directional evidence | Medium | Apply to more pages, monitor |
| +20%+ (6+ citations) | Strong signal — likely real | High | Roll out broadly, document |
| Negative (citations lost) | Change may have harmed visibility | Check | Consider reverting, investigate |
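Those bands can be applied mechanically once you have the before and after percentages. A minimal sketch whose thresholds and labels simply mirror the table above; treat the boundaries as guidance, not statistics.

```python
def interpret_delta(delta_pp: float) -> str:
    """Map a citation-rate change (in percentage points) to the table's interpretation."""
    if delta_pp < 0:
        return "Negative: change may have harmed visibility; consider reverting"
    if delta_pp == 0:
        return "No detectable effect: wait longer or test a different variable"
    if delta_pp < 10:
        return "Weak signal (may be noise): replicate on a second page"
    if delta_pp < 20:
        return "Meaningful directional evidence: apply to more pages, monitor"
    return "Strong signal (likely real): roll out broadly, document"

print(interpret_delta(23.3 - 10.0))  # Meaningful directional evidence: ...
```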
Confounding Variables to Watch For
- Model version changes can shift citation patterns globally
- A competitor improving their page can displace you
- Even slight query rewording changes AI responses
- Breaking news on your topic can shift what AI surfaces
Reporting & Documentation
Every experiment should produce a written report. This isn’t bureaucracy — it’s how you build a body of evidence. Six months from now, you’ll have a library of what works for your site.
The GEO Experiment Report Template
- Hypothesis: the full IF / ON / THEN / BECAUSE / MEASURED BY / OVER statement from Chapter 2
- Baseline: citation rate before the change, with the exact queries, engines, and dates used
- Result: citation rate after the observation window, and the Δ versus baseline
- Action: Roll out / Revert / Test further / No action
- Next Experiment: What question does this raise next?
Building Your Experiment Log
| ID | Date | Layer | Variable | Baseline | Result | Δ | Action |
|---|---|---|---|---|---|---|---|
| EXP-001 | Mar 2026 | Extract. | Direct answer rewrite | 10% | 23% | +13% | Roll out |
| EXP-002 | Mar 2026 | Entity | Person schema added | 23% | 27% | +4% | Test further |
| EXP-003 | Apr 2026 | Extract. | FAQ section added | 27% | 33% | +6% | Roll out |
10 GEO Experiment Templates
Each template is a complete experiment design you can run immediately. Start with Experiment 1 — it’s the highest-impact, lowest-effort test.
Experiment 1: Direct Answer Opening Rewrite
Hypothesis: Rewriting the first 2 sentences to directly answer the H2 question will increase citation rate.
Variable: Opening sentence structure (background context → direct answer).
Method: Sequential. Baseline 10 queries, rewrite, wait 14 days, re-check same 10 queries.
Expected impact: High. This is consistently the highest-ROI GEO change.
Experiment 2: Question-Format Headings
Hypothesis: Changing H2 headings from statements to questions will increase extractability and citation rate.
Variable: H2 heading format only (e.g. “Schema Markup Benefits” → “What Are the Benefits of Schema Markup?”).
Method: Matched-pair. Two similar pages, one converted, one unchanged.
Expected impact: Medium. Questions map directly to AI query patterns.
Experiment 3: Author Schema (Person)
Hypothesis: Adding Person schema with full credentials will increase citation rate through stronger entity signals.
Variable: Person schema added to author profile (name, title, sameAs links, credentials).
Method: Sequential. Site-wide change, measure across all target queries.
Expected impact: Medium. Entity signals compound over time — measure again at 30 days.
Experiment 4: FAQ Section with FAQ Schema
Hypothesis: Adding a 5-question FAQ section with FAQ schema will increase the number of queries for which AI cites the page.
Variable: FAQ section added to bottom of page with FAQ schema markup.
Method: Sequential. Test 10 queries including FAQ-specific questions.
Expected impact: Medium-high. FAQ questions are direct query matches.
Experiment 5: Evidence and Statistics
Hypothesis: Adding 3+ statistics with cited sources will increase AI citation rate for that page.
Variable: Number of statistics/data points (from 0 to 3+), with source attribution.
Method: Sequential. Baseline, add evidence, wait 14 days.
Expected impact: Medium. AI favours verifiable, data-backed claims.
Experiment Templates (Continued)
Experiment 6: Internal Linking
Hypothesis: Adding 5 internal links from high-traffic pages to a target page will improve retrieval probability and citation rate.
Variable: Internal link count pointing to target page (from current → current + 5).
Method: Sequential. Measure target page citations before and after links added.
Expected impact: Medium. Internal links signal content importance to crawlers.
Experiment 7: Content Freshness Update
Hypothesis: Updating content with new data and a visible “Last Updated” date will increase citation rate for time-sensitive queries.
Variable: Content freshness (old data → new data + visible update date).
Method: Sequential. Best tested on content with “2024” or “2025” data that can be updated to 2026.
Expected impact: Medium-high for time-sensitive topics. Low for evergreen content.
Experiment 8: Cross-Platform Citation Variance
Hypothesis: Different AI engines cite different sources for the same query, revealing engine-specific signal preferences.
Variable: None — this is an observational study, not an intervention.
Method: Cross-platform. Same 20 queries across ChatGPT, Perplexity, and Gemini.
Expected impact: Diagnostic. Reveals which engine to optimise for first.
Experiment 9: Off-Page Brand Mentions
Hypothesis: Generating 10 genuine brand mentions across LinkedIn, Reddit, and forums over 30 days will improve citation rate.
Variable: Off-page brand mention count (baseline → baseline + 10).
Method: Sequential with 30-day window.
Expected impact: Medium. Off-page signals take time but compound with on-page quality.
Experiment 10: Key Takeaway Section
Hypothesis: Adding a structured “Key Takeaway” section at the end provides AI with a pre-packaged summary to cite.
Variable: Presence of a 3–5 bullet “Key Takeaway” section at page bottom.
Method: Matched-pair. Two similar pages — one with takeaway section, one without.
Expected impact: Medium. Provides a highly extractable passage.
Recommended Experiment Sequence
Don’t run all 10 at once. Sequence them for maximum learning with minimum noise.
1. Content structure first: Start with what you can control completely — your own content structure.
2. On-page signals next: Add schema, evidence, and internal linking. These compound the content improvements.
3. Off-page and freshness: Update content, build brand mentions. These require more time but have lasting effects.
4. Diagnostics throughout: Cross-platform analysis and takeaway testing. Run alongside other experiments.
Data Collection Template
Use this spreadsheet structure for every experiment. One tab per experiment.
| Column | What to Record | Example |
|---|---|---|
| Date | Date of check | 2026-03-15 |
| Phase | Baseline / Post-change | Post-change (Day 14) |
| Query | Exact query used | What is schema markup? |
| Engine | AI engine tested | Perplexity |
| Cited? | Y / N | Y |
| Source Cited | URL cited by AI | thegeolab.net/schema |
| Passage Quoted | Text AI cited | “Schema markup is code…” |
| Notes | Anything notable | Full sentence cited |
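If you would rather script the log than maintain it by hand, the same columns translate directly to a CSV. A minimal sketch; the filename is arbitrary and the example row restates the table above.

```python
import csv

COLUMNS = ["Date", "Phase", "Query", "Engine", "Cited?", "Source Cited",
           "Passage Quoted", "Notes"]

rows = [
    {"Date": "2026-03-15", "Phase": "Post-change (Day 14)",
     "Query": "What is schema markup?", "Engine": "Perplexity", "Cited?": "Y",
     "Source Cited": "thegeolab.net/schema",
     "Passage Quoted": "Schema markup is code...", "Notes": "Full sentence cited"},
]

with open("exp-001-checks.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
```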
Common Pitfalls & Advanced Techniques
The 7 Mistakes That Kill GEO Experiments
1. Testing multiple variables at once: Rewriting content, adding schema, AND updating bios in one session. You won’t know which change mattered.
2. Measuring too early: Checking after 2 days instead of 14. AI crawl cycles need time. Premature measurement gives false negatives.
3. Inconsistent queries: “What is schema?” vs “What is schema markup?” yield different results. Use identical prompts every time.
4. Skipping the baseline: Without a “before” snapshot, your “after” data is meaningless. Always baseline first.
5. Over-generalising from one result: A single experiment is a data point, not proof. Replicate before rolling out broadly.
6. Checking only one engine: A change might work on Perplexity but not ChatGPT. Always check all three engines.
7. Not documenting: Running experiments without reports means losing your learnings. Every experiment needs a written report.
Advanced: Compound Testing
Once you’ve established individual variable effects through isolated tests, you can begin compound testing — combining multiple proven changes and measuring the combined effect.
- Phase 1: Test variable A alone (e.g. direct answer rewrite → +13%)
- Phase 2: Test variable B alone (e.g. FAQ section → +6%)
- Phase 3: Apply A + B together → measure the combined effect
- If combined > A + B individually, the changes compound (synergy)
- If combined ≈ A + B, the changes are additive
- If combined < A + B, there may be diminishing returns (see the sketch below)
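A minimal sketch of that synergy check, assuming all three effects are expressed in percentage points; the tolerance for “roughly additive” is an illustrative choice, not a rule from the guide.

```python
def compound_verdict(effect_a: float, effect_b: float, combined: float,
                     tolerance: float = 2.0) -> str:
    """Compare the combined effect (percentage points) against A + B individually."""
    expected = effect_a + effect_b
    if combined > expected + tolerance:
        return "synergy: the changes compound"
    if combined < expected - tolerance:
        return "diminishing returns"
    return "additive: roughly the sum of the individual effects"

# e.g. direct answer rewrite (+13 pp) and FAQ section (+6 pp), combined +17 pp
print(compound_verdict(13, 6, 17))  # additive: roughly the sum of the individual effects
```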
How The GEO Lab Runs Experiments
The GEO Lab publishes controlled experiments at thegeolab.net/log. Here’s the methodology behind our published work — and what we’ve learned so far.
Our Testing Principles
- Pre-registered hypotheses: Every test uses the Experiment Loop. Hypotheses are documented before changes are made. We never retrofit explanations to results.
- Full transparency: All results — including failures — are published in The GEO Log. Negative results are as valuable as positive ones.
- Replication: We don’t declare findings from a single test. Every significant result is replicated on at least one additional page.
- Complete documentation: Every experiment uses the report template from Chapter 7. Exact queries, exact dates, exact results.
What We’ve Learned So Far
After dozens of published experiments, here are the patterns that have held up consistently:
| Finding | Confidence | Impact |
|---|---|---|
| Direct answer openings dramatically increase citation probability | High | ★★★★★ |
| FAQ sections with schema generate citations for long-tail queries | High | ★★★★ |
| Question-format H2 headings improve extractability scores | High | ★★★★ |
| Statistics with sources increase AI’s willingness to cite | Medium-High | ★★★ |
| Author schema improves entity recognition over time | Medium | ★★★ |
| Different AI engines cite different sources for identical queries | High | ★★★ |
| “Last Updated” freshness signals affect time-sensitive queries | Medium | ★★ |
GEO Experiment Quick Reference
“Test everything. Assume nothing.
The data is the strategy.”
📚 The GEO Lab Library
All ebooks free at thegeolab.net/ebooks · By Artur Ferreira · The GEO Lab
“The best answer wins. Not the best-optimised page.”
AI search visibility research, field experiments, and the complete GEO Lab Library — all free.
#2 SEO to GEO: Complete Framework
#3 GEO Experiments ✓
#4 The GEO Workbook
#5 GEO for WordPress
#6 The GEO Glossary
#7 GEO Field Manual
#8 GEO Authority Playbook
#9 AI SEO OS