How ChatGPT picks which lawyers to recommend

Every week a new agency publishes a thread titled "How ChatGPT decides which lawyers to recommend." Most of them are wrong. Not because the authors are sloppy — because the underlying retrieval mechanics aren't widely understood, and the heuristics that work for traditional SEO actively mislead you when applied to AI engines. Here's what's actually happening when ChatGPT names a firm, and what it implies for any law-firm marketer trying to be the named result.

This piece is for marketing leads, agency principals, and partners who've already started buying AI-search work — or are about to — and want to understand the machinery they're paying to influence. We'll cover the retrieval architecture in plain language, the five signals that actually move citation rates for legal queries, the signals everyone fixates on that don't, and what the practical workflow looks like.

The retrieval architecture in two paragraphs

When a user asks ChatGPT "who's the best personal-injury lawyer in San Francisco," the model doesn't look the answer up in its training data the way a search engine queries its index. When web search is active, the current production system (GPT-4o, GPT-5, and the o-series reasoning models served through ChatGPT) runs the query through a retrieval layer that pulls a small set of relevant documents from a curated index, hands those documents to the language model as context, and then has the model generate an answer constrained by what's in that context. This pattern is called retrieval-augmented generation, or RAG. The same architecture, with engine-specific variations, powers Perplexity, Google's AI Overviews, Gemini's grounded mode, and Claude's web-search-enabled answers.

Two consequences fall out of this immediately. First, the firms that get named in the answer are usually the firms cited in the retrieved documents — meaning citation lift is downstream of being in the right source documents, not downstream of writing good content on your own site. Second, what counts as a "retrieved document" varies sharply by engine: ChatGPT's web-enabled mode favors a different corpus than Perplexity, which favors a different corpus than Google AI Mode. A firm that's a strong citation source for one engine can be invisible in another. This isn't a bug; it's how the systems are designed.
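
As a rough illustration of the retrieve-then-generate pattern described above, here is a toy sketch in Python. It is not how any production engine is built: the word-overlap scoring, the Document fields, and the firm names are all hypothetical stand-ins. What it does show is the first consequence: a firm can only be named if it appears in the documents the retrieval step returns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    source: str              # hypothetical source label
    text: str                # content the retriever scores against
    firms: tuple[str, ...]   # firms the document mentions


def tokenize(text: str) -> set[str]:
    return set(text.lower().split())


def retrieve(query: str, corpus: list[Document], k: int = 2) -> list[Document]:
    """Rank documents by naive word overlap with the query and keep the top k."""
    query_words = tokenize(query)
    ranked = sorted(
        corpus,
        key=lambda doc: len(query_words & tokenize(doc.text)),
        reverse=True,
    )
    return ranked[:k]


def candidate_firms(query: str, corpus: list[Document], k: int = 2) -> list[str]:
    """Firms the generation step could name: only those in the retrieved context."""
    firms: list[str] = []
    for doc in retrieve(query, corpus, k):
        for firm in doc.firms:
            if firm not in firms:
                firms.append(firm)
    return firms


# Hypothetical corpus: Firm C exists, but never lands in the retrieved set.
corpus = [
    Document("directory", "personal injury lawyer listings in san francisco", ("Firm A",)),
    Document("local press", "san francisco personal injury settlement news", ("Firm A", "Firm B")),
    Document("own site", "tax attorney in chicago", ("Firm C",)),
]

print(candidate_firms("personal injury lawyer san francisco", corpus))
# prints ['Firm A', 'Firm B']
```

Real engines use learned embeddings and their own indexes rather than word overlap, but the dependency is the same: the answer is built from whatever the retrieval step hands over.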

The five signals that actually move legal-query citations

Across an audit of roughly 400 buyer-intent prompts in legal categories (personal injury, immigration, family law, business litigation, criminal defense, real estate), run on ChatGPT, Perplexity, Google AI Overviews, and Claude, a consistent ordering emerges. Five signals do the heavy lifting. Most of the other variables marketers obsess over are downstream or irrelevant.

1. Entity recognition and consolidation. The signal that correlates most strongly with citation rate is whether the AI engine recognizes the firm as a coherent, well-described entity: a Knowledge Graph node or its equivalent in the engine's internal representation, with attributes (practice areas, locations, attorney names, languages spoken) cleanly resolved and consistent across the sources the engine has seen. Firms whose name shows up across Avvo, Super Lawyers, Justia, Google Business Profile, Wikipedia (where applicable), state bar member directories, court records, JD Supra bylines, and their own site, all with consistent NAP (name, address, phone) data, practice descriptions, and attorney rosters, get treated as a known entity. Firms with fragmented or inconsistent representations get treated as ambiguous text, which AI engines tend to resolve in favor of whichever competing firm is consolidated. This is the single biggest reason "the bones look right but nothing's citing us" happens. The firm's content is fine; its entity isn't consolidated.

2. Citation diversity across the source corpus. Engines weight breadth of citation sources heavily. A firm cited in Super Lawyers, Avvo, the local bar association's referral directory, a JD Supra byline, a Chambers ranking, and a local newspaper interview will outrank a firm with twice the volume of citations from a single source type. The reason is straightforward: the engine's retrieval layer pulls a small set of documents per query, and diverse sources produce more independent retrievals than a clustered citation profile. Firms heavily reliant on a single citation channel — typically just Avvo or just their own site — show up sporadically; firms whose authority is distributed across six or eight independent source types show up reliably.

3. Recency-weighted authority. AI engines prefer recent citations over older ones, but the recency window varies by query type. Buyer-intent prompts ("personal injury lawyer San Francisco") favor sources from the past 18–24 months heavily; informational prompts ("statute of limitations for car accident California") weight older authoritative sources more evenly. The implication for firms: stale citation profiles decay faster than most marketers expect. A firm with a great Avvo presence in 2022 that hasn't refreshed its profile, hasn't generated new third-party coverage, and hasn't added recent attorney content is losing AI-search ground every quarter regardless of how strong the older citations are. Recency isn't just a tiebreaker — it's a multiplier.

4. Intent matching at the sub-query level. The buyer who types "best personal injury lawyer SF" is asking a different question than the buyer who types "personal injury lawyer San Francisco Cantonese speaking" or "SF lawyer for rear-end collision settlement." AI engines decompose the surface query into sub-intents (practice area, location, language, case type, outcome stage, fee model) and retrieve against the sub-intents — then synthesize. Firms that have content addressing the sub-intents directly outperform firms with generic practice-area pages, even when the generic firm has more overall authority. This is why a smaller firm with a specific page titled "Cantonese-speaking personal injury attorney in San Francisco" can get cited above a much larger firm whose top page is "Personal Injury Lawyer SF." The smaller firm matched the sub-intent precisely; the larger firm matched the surface query without depth.

5. Compliance signals. This is the one most agencies miss. AI engines under-cite content that pattern-matches to promotional or non-compliant legal marketing, both because their training with reinforcement learning from human feedback (RLHF) rewards careful, conservative outputs and because they were tuned to avoid generating harmful or misleading legal advice. Pages heavy with superlatives ("best," "top-rated," "preeminent"), predictive outcome language ("we guarantee compensation," "we will get you the result"), and unsubstantiated claims get cited at lower rates than pages with restrained, factual language, even when the promotional pages have stronger traditional SEO signals. In testing, comparable practice-area pages with promotional vs. restrained tone show a roughly 30–45% citation-rate gap in favor of the restrained version on the same engine for the same query. The California Rule of Professional Conduct 7.1 compliance overlay isn't just a regulatory hedge; it's a citation-rate lever. This is the structural finding the PROOF Series #1 study is testing at scale.
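
To make the compliance signal concrete, here is a minimal sketch of a pre-publication tone check in Python. The phrase lists are hypothetical and simply reuse the examples quoted above; they are not a bar-rule checklist, and nothing here reflects how an engine actually scores pages. It only flags copy that a human reviewer should look at.

```python
import re

# Hypothetical phrase lists taken from the examples in the text above.
# A real review would follow the applicable bar rules, not a word list.
SUPERLATIVES = ["best", "top-rated", "preeminent"]
OUTCOME_PROMISES = ["we guarantee", "we will get you"]


def flag_promotional_language(text: str) -> list[str]:
    """Return the flagged phrases found in `text`, in phrase-list order."""
    lowered = text.lower()
    hits = []
    for phrase in SUPERLATIVES + OUTCOME_PROMISES:
        # Word boundaries keep "best" from matching inside "bestow".
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            hits.append(phrase)
    return hits


promotional = "We are the best, top-rated firm and we guarantee compensation."
restrained = "Our attorneys handle rear-end collision claims in San Francisco."

print(flag_promotional_language(promotional))  # prints ['best', 'top-rated', 'we guarantee']
print(flag_promotional_language(restrained))   # prints []
```

A word list like this will miss paraphrased promises and will flag legitimate uses (a quoted award name, for instance), so it works as a first-pass filter ahead of attorney review rather than as a replacement for it.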

What everyone fixates on that doesn't move citations much

It's worth being explicit about what not to spend the bulk of a budget on, because the conventional GEO (generative engine optimization) advice circulating on LinkedIn often majors in the minors:

Generic FAQ schema additions. Adding FAQ schema to existing pages produces modest citation lift for informational queries and almost no lift for buyer-intent queries. It's worth doing but it's not the leverage point. Treating it as the leverage point — which a lot of agency proposals do — is misallocation.

llms.txt files. The "llms.txt" convention popularized by some AI-search commentators does not, as of mid-2026, appear to be honored by any major AI engine. We have seen no evidence that a production retrieval system queries llms.txt. Firms publishing them aren't wrong to do so, but so far there is no observable citation impact. The convention may evolve; today it isn't doing anything.

Keyword density and traditional on-page SEO. AI engines retrieve based on semantic similarity, not keyword frequency. Optimizing a personal-injury page by repeating "personal injury lawyer San Francisco" 14 times produces no citation lift over a page that uses the phrase once and writes naturally around the topic. Some agencies are still selling keyword-density tuning as a GEO tactic. It isn't.

Site speed and Core Web Vitals. These matter for traditional Google ranking and for user experience, but they don't appear to influence AI-engine citation behavior in any of the tests we've run. A slow site won't lose AI-citation share over a fast one; it'll just frustrate the user who clicks through after the engine cited it.

Massive blog volume. Producing 40 blog posts a month is a popular agency tactic. AI engines retrieve a small handful of documents per query — typically 3 to 12 — and the engine's relevance scoring strongly prefers consolidated, authoritative pages over a wide distribution of thin content. Ten well-built practice-area pages outperform 200 blog posts for citation rate in nearly every test. Volume is the wrong axis.

What this means for the workflow

The five signals above are not equally addressable. Some require structural work; some require ongoing effort. The practical sequencing for a firm starting from a near-zero AI-citation baseline:

First, consolidate the entity. Audit every place the firm and each attorney appear across Avvo, Justia, Super Lawyers, Martindale, the relevant state bar directory, Google Business Profile, Wikipedia (only if the firm is genuinely notable — don't fabricate notability), JD Supra (if attorneys contribute), and any other directory the engines treat as authoritative. Clean up every NAP inconsistency, every practice-area mismatch, every attorney roster discrepancy. This is the highest-leverage week-one work and almost no agency does it.
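
One way to start the audit described above is to export each listing's NAP fields (name, address, phone) and compare them after light normalization. The sketch below is a minimal Python version under stated assumptions: the listings, source names, and field names are hypothetical, and the normalization is deliberately simple, so it surfaces variants for a person to reconcile rather than deciding which one is correct.

```python
import re
from collections import defaultdict


def normalize_phone(phone: str) -> str:
    """Keep digits only, dropping a leading US country code."""
    digits = re.sub(r"\D", "", phone)
    return digits[1:] if len(digits) == 11 and digits.startswith("1") else digits


def normalize_text(value: str) -> str:
    """Lowercase, strip punctuation, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", value.lower())
    return " ".join(cleaned.split())


def nap_inconsistencies(listings: dict[str, dict[str, str]]) -> dict[str, dict[str, list[str]]]:
    """Map each field with more than one distinct normalized value
    to {normalized value: [sources that use it]}."""
    normalizers = {"name": normalize_text, "address": normalize_text, "phone": normalize_phone}
    report = {}
    for field, normalize in normalizers.items():
        variants = defaultdict(list)
        for source, record in listings.items():
            variants[normalize(record[field])].append(source)
        if len(variants) > 1:
            report[field] = dict(variants)
    return report


# Hypothetical listings for a made-up firm.
listings = {
    "Avvo": {"name": "Example Law Group, LLP", "address": "100 Main St., Suite 200", "phone": "(415) 555-0100"},
    "Google Business Profile": {"name": "Example Law Group LLP", "address": "100 Main St Suite 200", "phone": "+1 415-555-0100"},
    "State bar directory": {"name": "Example Law Group LLP", "address": "100 Main Street, Ste 200", "phone": "415.555.0100"},
}

report = nap_inconsistencies(listings)
print(sorted(report))  # prints ['address']
```

Here the name and phone variants collapse to a single value once punctuation is removed, while "St" versus "Street" is left as a reported mismatch. That is intentional: abbreviation handling is left out of the sketch so that borderline differences get a human decision.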

Second, build the missing intent pages. Identify the 8–15 sub-intent queries your firm actually wants to win — "Cantonese-speaking landlord-tenant lawyer SF," "EB-5 immigration attorney Orange County," "fee-shifting business litigation Bay Area" — and build one well-constructed page per sub-intent. This isn't about volume; it's about coverage of the queries that map to your firm's positioning.

Third, generate citation diversity. Map the citation source set the engines actually pull from for your practice areas and pursue the missing categories deliberately — JD Supra bylines, state bar association content, local press, podcast appearances, third-party rankings (Best Lawyers, Super Lawyers, Chambers), and where credible, niche industry publications. Don't buy citation packages from link farms; the engines penalize cluster patterns.
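
A simple way to map where a citation profile is thin is to tag each existing citation with a source type and compare against the types you want covered. The Python sketch below assumes a hypothetical taxonomy ("directory", "byline", "ranking", "local press"); the categories and the sample data are illustrative, not a list of what any engine actually weights.

```python
from collections import Counter


def source_type_gaps(citations: list[tuple[str, str]], target_types: set[str]) -> dict:
    """Summarize a citation profile.

    citations: (source_name, source_type) pairs, one per citation.
    target_types: the source types the firm wants to be present in.
    """
    counts = Counter(source_type for _, source_type in citations)
    largest_share = max(counts.values()) / len(citations) if citations else 0.0
    return {
        "distinct_types": len(counts),
        "missing": sorted(target_types - set(counts)),
        "largest_share": largest_share,  # how concentrated the profile is
    }


# Hypothetical profile: four citations, three of them from directories.
citations = [
    ("Avvo", "directory"),
    ("Avvo", "directory"),
    ("Justia", "directory"),
    ("JD Supra", "byline"),
]
targets = {"directory", "byline", "ranking", "local press"}

print(source_type_gaps(citations, targets))
# prints {'distinct_types': 2, 'missing': ['local press', 'ranking'], 'largest_share': 0.75}
```

A high largest_share with a short list of distinct types is the clustered profile the signal above warns about; the missing list is the starting point for outreach.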

Fourth, apply compliance review to every output. Run all of the above (pages, bylines, profile copy, FAQ answers) through a checklist built on ABA Model Rule 7.1 and the relevant state-bar rules before publishing. The compliance overlay is a citation-rate lever, not just a regulatory hedge. The previous piece in this series walks through the checklist in detail.

Fifth, measure citation share-of-voice monthly. Track citation rate by engine, by prompt, over time. Not traffic — citations. Traffic is downstream and gets confounded by paid search and direct visits. Citation share-of-voice on the prompts you care about is the cleanest signal of whether the work is moving the needle.
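
As a sketch of the arithmetic, citation share-of-voice can be computed per engine as the fraction of tracked prompts whose answer cites the firm. That definition is one reasonable assumption rather than an industry standard, and the engines, prompt labels, and firm names below are hypothetical. Collecting the observations (running each prompt and recording which firms were cited) is outside the sketch.

```python
from collections import defaultdict


def share_of_voice(observations: list[tuple[str, str, list[str]]], firm: str) -> dict[str, float]:
    """observations: (engine, prompt, cited_firms) tuples from one measurement run.
    Returns {engine: fraction of that engine's prompts that cite `firm`}."""
    cited = defaultdict(int)
    total = defaultdict(int)
    for engine, _prompt, firms in observations:
        total[engine] += 1
        if firm in firms:
            cited[engine] += 1
    return {engine: cited[engine] / total[engine] for engine in total}


# Hypothetical results for two prompts on two engines.
observations = [
    ("ChatGPT", "prompt 1", ["Firm A", "Firm B"]),
    ("ChatGPT", "prompt 2", ["Firm B"]),
    ("Perplexity", "prompt 1", ["Firm A"]),
    ("Perplexity", "prompt 2", ["Firm A", "Firm C"]),
]

print(share_of_voice(observations, "Firm A"))
# prints {'ChatGPT': 0.5, 'Perplexity': 1.0}
```

Storing one such result per month, keyed by engine, gives the trend line. Because engine answers vary from run to run, repeating each prompt several times per measurement and averaging will give a steadier number than a single pass.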

The honest caveats

Two caveats are worth flagging, because the AI-search-marketing space has too many confident claims and too few qualifications.

The retrieval systems are still evolving. ChatGPT's web-enabled retrieval pipeline shipped its current architecture in 2024; OpenAI has shipped meaningful changes to it three times in the past 14 months. Perplexity's source-weighting heuristics have shifted noticeably twice in that window. Google's AI Overviews is itself a moving target as Google tunes the trade-off between Overview citations and traditional organic. Any agency claiming to have "cracked" the algorithms is overstating the stability of the systems. The five signals above are stable in direction — entity consolidation, citation diversity, recency, intent matching, and compliance — but the relative weights shift quarter over quarter.

The data behind the orderings above is observational, not causal. We test prompts, observe what gets cited, and infer the signals from patterns. This is the same epistemic position every public AI-search analyst is in; nobody outside the engine labs has access to the underlying retrieval mechanics. The PROOF Series #1 study is designed to push one corner of this — the compliance question — into a more rigorous causal frame. The other four signals remain best inferences from observed behavior. Treat them as such.

Bottom line

ChatGPT picks lawyers to recommend the way every modern retrieval-augmented system picks anything: by retrieving from a curated source corpus, weighting on entity consolidation and citation diversity, applying recency and intent decomposition, and rewarding content that doesn't pattern-match to promotional or non-compliant marketing. The firms winning AI-search visibility in 2026 are the ones doing structural work (entity cleanup, sub-intent coverage, citation diversity, compliance review), not the ones publishing 40 blog posts a month or leaning on llms.txt files.

If you want to see where your firm currently stands on the five signals above, the WTT Digital free AI Visibility Audit covers the entity-consolidation diagnostic and the top-50-prompt citation baseline. Sixty seconds, no call required. It's not a substitute for the full VERDICT engagement, but it's the cheapest way to see whether the gap is structural or marginal.