Original data · 2026-06-12
We asked 4 AI engines the same questions 3 times each. Most brand recommendations didn't survive the repeat.
Everyone screenshots ChatGPT recommending (or snubbing) their product. Almost nobody asks the question twice. We did — systematically. We took one well-known web-analytics product, wrote 25 questions its real buyers ask (“what's the best privacy-friendly analytics tool?”, “alternatives to Google Analytics that don't need cookie banners?”…), and ran every question through ChatGPT, Perplexity, Gemini, and Claude three times. 258 completed answers. Then we counted.
Finding 1: Only 31% of recommended brands appear every time
Across all question-and-engine combinations, the engines named 681 brand-recommendation slots. Just 214 of them — 31.4% — appeared in every repeat run of the same question on the same engine. The other ~69% of brands flickered: on the shortlist in one run, gone in the next.
If you've ever screenshotted an AI answer as proof your brand “is” or “isn't” recommended: fewer than one in three recommended brands held their place in every repeat, so a single screenshot is a weak guide to the next answer.
Finding 2: Even the subject brand itself flickered in 1 of 4 cases
For the product we audited, we looked at every question-engine pair where it was mentioned at all: in 25% of those pairs, the product appeared in some runs of the identical question and vanished in others. Same engine. Same words. Different answer.
Finding 3: The engines disagree with each other — a lot
Same product, same 25 questions, wildly different visibility per engine:
| Engine | Mention rate |
|---|---|
| Claude (Anthropic) | 95% |
| Gemini (Google) | 92% |
| Perplexity | 76% |
| ChatGPT (OpenAI) | 42% |
A brand can be a fixture on Claude and a coin flip on ChatGPT simultaneously. Monitoring one engine, which is what a quick manual check does, tells you almost nothing about the others. We dug into that engine-by-engine gap across four businesses in “Which AI engine actually recommends your business”: there, ChatGPT recommended the audited business least of the four, and its rivals most.
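If you want to build a table like the one above from your own runs, here is a minimal Python sketch. It assumes "mention rate" means the share of an engine's completed answers that name the brand. The function name, the input shape, and the toy data are illustrative assumptions, not the audit's actual pipeline.

```python
from collections import Counter


def mention_rate_by_engine(answers, brand):
    """Share of completed answers per engine that name `brand`.

    `answers` is an iterable of (engine, brands_named) pairs, one per
    completed answer. This input shape is an assumption for illustration.
    """
    completed = Counter()
    mentioned = Counter()
    for engine, brands_named in answers:
        completed[engine] += 1
        if brand in brands_named:
            mentioned[engine] += 1
    # Only engines with at least one completed answer appear in `completed`,
    # so the division below never hits zero.
    return {engine: mentioned[engine] / completed[engine] for engine in completed}


if __name__ == "__main__":
    # Hypothetical toy data: engine names and brands are made up.
    toy_answers = [
        ("engine_a", {"Alpha", "Beta"}),
        ("engine_a", {"Beta"}),
        ("engine_b", {"Alpha"}),
    ]
    for engine, rate in mention_rate_by_engine(toy_answers, "Alpha").items():
        print(f"{engine}: {rate:.0%}")
```

Keeping the counts per engine, rather than pooling them, is the point: a pooled rate would hide exactly the gap the table shows.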
Finding 4: Consistency exists — at the top
It wasn't all noise. The product's two strongest competitors appeared in AI answers more consistently than the product itself — one was named in every single run of several questions. The engines do converge on a stable shortlist for each question; the brands on it benefit from a consensus across directories, comparison articles, and reviews that the engines see everywhere they look. The flicker zone is where everyone else lives — visible enough to appear, not established enough to stay.
That's the actionable part: AI visibility isn't a lottery. It's a frequency you can measure, and the gap between “flickering” and “fixture” is a list of sources you're missing.
Methodology
One subject product (web analytics; anonymized). 25 buyer-intent questions written from real search and forum phrasing. Each question submitted 3 independent times to ChatGPT, Perplexity, Gemini, and Claude via API on 2026-06-12 — 300 scheduled answers, 258 completed (engine rate limits cost some runs, OpenAI's the most). Every answer parsed for brand mentions by a judge model; “appears every time” = named in all completed samples of that question-engine pair (pairs with ≥2 completed samples counted).
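The "appears every time" rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the audit's code: it treats a "slot" as one distinct brand named at least once for a question-engine pair, and the function name and sample data are made up.

```python
from collections import defaultdict


def count_stable_slots(samples, min_samples=2):
    """Return (stable_slots, total_slots) across question-engine pairs.

    `samples` is an iterable of (question, engine, brands_named) tuples,
    one per completed answer. A slot is stable when the brand is named in
    every completed sample of its pair. Pairs with fewer than `min_samples`
    completed samples are skipped.
    """
    runs = defaultdict(list)
    for question, engine, brands_named in samples:
        runs[(question, engine)].append(set(brands_named))

    stable = total = 0
    for answers in runs.values():
        if len(answers) < min_samples:
            continue
        named_at_least_once = set().union(*answers)
        named_every_time = set.intersection(*answers)
        total += len(named_at_least_once)
        stable += len(named_every_time)
    return stable, total


if __name__ == "__main__":
    # Hypothetical toy data, not the study's answers.
    toy = [
        ("q1", "engine_a", ["Alpha", "Beta"]),
        ("q1", "engine_a", ["Alpha", "Gamma"]),
        ("q1", "engine_a", ["Alpha", "Beta"]),
        ("q1", "engine_b", ["Alpha"]),  # single sample, so the pair is skipped
    ]
    stable, total = count_stable_slots(toy)
    print(f"{stable} of {total} slots appeared every time")
```

Applied to the figures reported above, the same ratio is 214 / 681, which rounds to 31.4%.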
What this means if you own a brand
Any verdict about your AI visibility based on one answer — yours or a vendor's — is statistically close to meaningless. Visibility is a mention rate, measured per engine, across repeats. That's exactly what an AskedAbout audit measures: 25 of your buyers' questions × 4 engines × 3 samples, with share-of-voice against every competitor the engines name.
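To see why a single answer says so little, one standard tool is the Wilson score interval for a proportion. The sketch below assumes repeat runs behave like independent draws with a fixed mention probability, which is a simplification. The function name is ours, and the audit itself may report its frequencies differently.

```python
from math import sqrt


def wilson_interval(mentions, runs, z=1.96):
    """Wilson score interval for a mention rate.

    `mentions` is how many of `runs` answers named the brand; z=1.96
    corresponds to roughly 95% confidence. Assumes independent runs.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    p = mentions / runs
    denom = 1 + z * z / runs
    center = (p + z * z / (2 * runs)) / denom
    half_width = z * sqrt(p * (1 - p) / runs + z * z / (4 * runs * runs)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, center - half_width), min(1.0, center + half_width)


if __name__ == "__main__":
    for mentions, runs in [(1, 1), (3, 3)]:
        low, high = wilson_interval(mentions, runs)
        print(f"{mentions}/{runs} mentions: {low:.0%} to {high:.0%}")
```

Under these assumptions, one mention in one run is compatible with a true mention rate anywhere from roughly 21% to 100%. Three mentions in three runs still leave a lower bound near 44%, which is why the rate needs repeats per engine rather than one screenshot.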
Companion reading: “Which AI engine actually recommends your business” breaks this run-to-run data down engine by engine. And if you're weighing a tool to track all of this for you, see “AI visibility tools compared”, which includes the one-time alternative to a monthly subscription.
See your number
A free 60-second check shows what AI says about you.
Method — we query the official APIs of each AI engine, with web search where supported. Answers vary between runs; the full audit repeats every question and reports frequencies, never one-off snapshots.