Optimising for Search: the engineering side of AI retrieval

The 3×2 GEO matrix. Each cell is one chapter of the series. You're reading the highlighted cell.

Search mode is not model memory, and it is not a browser reading a page after a user clicks. It is a retrieval step: the AI rewrites the prompt into one or more search queries, asks a search and index layer for candidate pages, optionally fetches the best ones, and then cites the sources that survived that pipeline. Optimising for Search is the engineering side of AI retrieval.

1. What Search mode is: tool-call retrieval and retrieval-time crawling

When an LLM cannot answer a prompt accurately from its internal parametric weights, it uses Search mode. This is RAG-shaped, but operationally it is tool-call retrieval executed on your behalf — the model emits a web_search tool call, the host delegates the query to a search backend, and the results land back in the conversation. In 2026, the vendors expose these capabilities natively in their APIs: Anthropic provides web_search_20260209 and web_fetch_20260209, OpenAI uses web_search within its Responses API, and Gemini relies on google_search and url_context.

Search mode is a brutal funnel. A crawler must be permitted to discover the page, the indexer must rank the candidate URL, the query-rewriter must match the content, and the snippet extractor must pull semantically complete evidence. If you fail at any stage, you are dropped from the candidate set before the synthesiser writes the final answer.
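
The funnel can be sketched as an ordered chain of gates, where the first failure removes the page from the candidate set. This is a minimal illustration, not a vendor's pipeline: the stage names follow the paragraph above, while the `Page` fields and the boolean checks are hypothetical stand-ins for signals you would derive from logs and tests.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Page:
    # Hypothetical per-URL signals; a real audit would derive these from logs and tests.
    url: str
    crawl_allowed: bool
    indexed: bool
    matches_rewritten_query: bool
    snippet_is_complete: bool


# Ordered gates, mirroring the funnel described above.
STAGES: list[tuple[str, Callable[[Page], bool]]] = [
    ("crawl", lambda p: p.crawl_allowed),
    ("index", lambda p: p.indexed),
    ("query match", lambda p: p.matches_rewritten_query),
    ("snippet extraction", lambda p: p.snippet_is_complete),
]


def first_failed_stage(page: Page) -> Optional[str]:
    """Return the name of the first stage the page fails, or None if it survives."""
    for name, passes in STAGES:
        if not passes(page):
            return name
    return None


page = Page("https://example.com/pricing", True, True, False, True)
print(first_failed_stage(page))  # query match
```

Reporting the first failed stage, rather than a single pass/fail flag, tells you which team owns the fix.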

2. Bot UA inventory: understanding the matrix

Do not maintain a single "AI bot" allowlist. You must maintain a purpose matrix. Training crawlers affect future model memory, search crawlers affect retrieval-time candidate selection, and user-fetch agents affect whether a specific URL can be read on demand.

Vendor	Training crawler / control	Search-RAG UA	User-fetch UA
OpenAI	`GPTBot`	`OAI-SearchBot`	`ChatGPT-User`
Anthropic	`ClaudeBot`	`Claude-SearchBot`	`Claude-User`
Perplexity	✗ (None documented)	`PerplexityBot`	`Perplexity-User`
Google	`Google-Extended` (token)	`Googlebot`	User-triggered fetchers vary

Sources verified 2026: OpenAI, Anthropic, Perplexity, Google.
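
One way to keep the matrix honest is to generate robots.txt from it and then parse the result locally. The sketch below uses the user-agent tokens from the table above and Python's standard `urllib.robotparser`. The policy shown (opt out of training, allow search and user fetch) is only an assumed example, and robots.txt is advisory: whether each agent honours it is a matter for the vendor's own documentation.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens from the table above, grouped by purpose.
PURPOSE_MATRIX = {
    "training": ["GPTBot", "ClaudeBot", "Google-Extended"],
    "search": ["OAI-SearchBot", "Claude-SearchBot", "PerplexityBot", "Googlebot"],
    "user_fetch": ["ChatGPT-User", "Claude-User", "Perplexity-User"],
}

# Example policy (an assumption, not a recommendation).
POLICY = {"training": "Disallow", "search": "Allow", "user_fetch": "Allow"}


def build_robots_txt(matrix: dict, policy: dict) -> str:
    """Render one robots.txt group per purpose."""
    groups = []
    for purpose, agents in matrix.items():
        lines = [f"User-agent: {agent}" for agent in agents]
        lines.append(f"{policy[purpose]}: /")
        groups.append("\n".join(lines))
    return "\n\n".join(groups) + "\n"


robots_txt = build_robots_txt(PURPOSE_MATRIX, POLICY)

# Parse the generated file locally to confirm it says what was intended.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

for agent in ("GPTBot", "OAI-SearchBot", "ChatGPT-User"):
    print(agent, parser.can_fetch(agent, "https://example.com/pricing"))
```

With this example policy the check reports GPTBot as disallowed and the other two as allowed, which is the distinction a single "AI bot" allowlist cannot express.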

3. llms.txt: the AI-era robots.txt complement

The llms.txt file is an emerging proposal to help LLMs navigate a website at inference time. It is not a permission system, nor is it a ranking lever. It is a compact, model-readable table of contents detailing what your site is and where the canonical facts live.

Major platforms have already adopted it. Perplexity's own crawler documentation explicitly points agents to https://docs.perplexity.ai/llms.txt. Both Cloudflare and Anthropic publish their own files to guide models traversing their technical docs.

What should you put in it? Start with a short blockquote summary that carries the canonical business description. Add links to category hubs, comparison pages, pricing, policies, and Markdown versions of pages. Most importantly, include a "Do not infer" or "Common confusion" section wherever brand ambiguity is likely. Keep the file short enough to be consumed quickly, and use /llms-full.txt for comprehensive detail.
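
As a rough sketch, the file can be rendered from a small data structure so it stays in sync with the rest of the site. The layout below (H1 title, blockquote summary, H2 sections of Markdown links) follows the llms.txt proposal as we understand it; the "Common confusion" section is this post's suggestion rather than part of the proposal, and the business name, URLs, and notes are hypothetical.

```python
def render_llms_txt(name, summary, sections, confusions):
    """Render an llms.txt body: H1 title, blockquote summary, H2 link sections."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.extend(f"- [{title}]({url}): {note}" for title, url, note in links)
        lines.append("")
    lines.append("## Common confusion")
    lines.extend(f"- {item}" for item in confusions)
    return "\n".join(lines) + "\n"


# Hypothetical content for illustration only.
llms_txt = render_llms_txt(
    name="Example Fibre",
    summary="Example Fibre is a residential fibre broadband provider in Helsinki.",
    sections={
        "Pricing": [
            ("Plans", "https://example.com/pricing.md", "Current plans and prices"),
        ],
        "Policies": [
            ("Returns", "https://example.com/returns.md", "Cancellation and refunds"),
        ],
    },
    confusions=["Example Fibre is not affiliated with Example Mobile."],
)
print(llms_txt)
```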

4. RAG snippet engineering: chunk sizing and semantic completeness

A retrieval system does not cite "your page"; it cites the piece of the page that best matches its rewritten query. Search-mode optimisation is partly chunk design. Every section on your site should be semantically complete enough to stand alone: subject, answer, qualifiers, date, and source link must exist in one paragraph.

Avoid orphan claims. Stating "this is cheaper" without naming "than what", "in which market", and "as of when" leaves the snippet without the context a retriever needs, so it is likely to be passed over in favour of a complete one. Keep repeated facts consistent across your page, schema, and llms.txt. Use tables for comparison facts, but put a plain-text summary above each table so that raw fetchers can parse the context.
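
A crude lint can flag paragraphs that are unlikely to stand alone. The sketch below is a heuristic of our own, not how any retriever scores chunks: the three checks (a year, a link, a comparative with no "than") and the word list are assumptions chosen to mirror the advice above.

```python
import re

YEAR = re.compile(r"\b(?:19|20)\d{2}\b")
LINK = re.compile(r"https?://\S+")
COMPARATIVE = re.compile(r"\b(?:cheaper|faster|better|more|less)\b", re.IGNORECASE)
THAN = re.compile(r"\bthan\b", re.IGNORECASE)


def audit_chunk(text: str) -> list[str]:
    """Return a list of standalone-context gaps found in one paragraph."""
    gaps = []
    if not YEAR.search(text):
        gaps.append("no date")
    if not LINK.search(text):
        gaps.append("no source link")
    if COMPARATIVE.search(text) and not THAN.search(text):
        gaps.append("comparative without 'than'")
    return gaps


print(audit_chunk("This is cheaper."))
print(audit_chunk(
    "As of March 2026, Plan A is cheaper than Plan B in Helsinki "
    "(source: https://example.com/pricing)."
))
```

The first paragraph trips all three checks; the second passes. Treat hits as prompts for an editor, not as verdicts.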

5. Structured data: explicit clues for AI ingestion

Structured data is not magic AI food; it is a consistency layer. If the visible page, JSON-LD, llms.txt, and sitemap all declare the same facts, retrieval systems have fewer chances to confuse your product, price, author, date, or entity relationships.

Use Article or BlogPosting schema for editorial posts. Use FAQPage schema only where the Q&A is visible to the user and genuinely useful—not stuffed with keywords. Crucially, do not hide facts purely in JSON-LD. Google explicitly warns against adding structured data about information not visible to users.
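
The "consistency layer" idea can be checked mechanically: build the JSON-LD, then confirm every declared fact also appears in the visible copy. The sketch below uses real schema.org property names (`headline`, `author`, `datePublished`, `dateModified`), but the page text and values are hypothetical, and exact substring matching is a deliberately naive assumption; real pages often format dates differently.

```python
import json

# Hypothetical visible copy extracted from the rendered page.
visible_text = (
    "Fibre installation times in Helsinki. By Jane Doe. "
    "Published 2026-01-15, updated 2026-03-02."
)

article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Fibre installation times in Helsinki",
    "author": {"@type": "Person", "name": "Jane Doe"},
    "datePublished": "2026-01-15",
    "dateModified": "2026-03-02",
}


def facts_missing_from_page(schema: dict, page_text: str) -> list[str]:
    """List JSON-LD properties whose values do not appear in the visible copy."""
    facts = {
        "headline": schema["headline"],
        "author": schema["author"]["name"],
        "datePublished": schema["datePublished"],
        "dateModified": schema["dateModified"],
    }
    return [key for key, value in facts.items() if value not in page_text]


print(facts_missing_from_page(article, visible_text))  # []
print(json.dumps(article, indent=2))  # body for a <script type="application/ld+json"> tag
```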

6. Content velocity: freshness and recrawl windows

Freshness in Search mode is not about publishing more blog posts. It is about making sure the URLs that answer high-value prompts are recently discoverable, recently fetched, and internally linked. Search-mode systems can be sensitive to freshness because retrieval indexes, crawler access, and source recency change over time.

Both OpenAI and Perplexity note that robots.txt updates can take about 24 hours to take effect for search results. As a practical audit trigger, if high-value commercial URLs have not seen relevant AI or search crawler activity for an extended period, treat it as a signal to investigate. Our server-log playbook uses 30 days as the alert threshold for missing AI visits.
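
The 30-day trigger can be sketched as a small log check. This assumes access logs have already been parsed into `(url, user_agent, timestamp)` tuples; the agent list reuses the search crawler names from section 2, and the sample user-agent strings are simplified placeholders rather than the vendors' exact strings.

```python
from datetime import datetime, timedelta

AI_AGENTS = ("OAI-SearchBot", "Claude-SearchBot", "PerplexityBot")
ALERT_AFTER = timedelta(days=30)  # alert threshold quoted above


def stale_urls(hits, watched_urls, now):
    """Return watched URLs with no AI search crawler hit inside the alert window."""
    last_seen = {}
    for url, user_agent, seen_at in hits:
        if any(agent in user_agent for agent in AI_AGENTS):
            if url not in last_seen or seen_at > last_seen[url]:
                last_seen[url] = seen_at
    return [
        url for url in watched_urls
        if url not in last_seen or now - last_seen[url] > ALERT_AFTER
    ]


now = datetime(2026, 4, 1)
hits = [
    ("/pricing", "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)", datetime(2026, 3, 25)),
    ("/compare", "Mozilla/5.0 (compatible; PerplexityBot/1.0)", datetime(2026, 1, 10)),
    ("/install", "Mozilla/5.0 (compatible; Googlebot/2.1)", datetime(2026, 3, 30)),
]
print(stale_urls(hits, ["/pricing", "/compare", "/install"], now))
# ['/compare', '/install']
```

Here `/install` is flagged even though Googlebot visited it recently, because no AI search crawler has been seen there at all.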

7. The Brave correlation

Do not assume Googlebot success guarantees Claude Search success. As detailed in our server logs post, alternative indexes play a massive role in AI retrieval. Anthropic explicitly documents the Brave Search API for the Claude for Government MCP web search, and Brave positions its own Search API as an agentic/RAG search backend.

We treat Brave visibility as a practical signal, not a universal vendor guarantee. The lesson here is to test whether high-value URLs are crawlable to alternative indexers and are not blocked by WAF rules that allow Googlebot but block everything else.
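
A first-pass check for user-agent-based WAF rules is to request the same URL under several User-Agent headers and compare status codes. The sketch below uses only the standard library; `ExampleIndexBot` is a made-up agent name and `fake_waf` is a stand-in so the example runs offline. Note the limits of the approach: many WAFs also verify crawler IP ranges, so a spoofed header from your own machine will not reproduce every rule.

```python
import urllib.error
import urllib.request


def http_status(url: str, user_agent: str) -> int:
    """Fetch a URL with a given User-Agent header and return the HTTP status."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def blocked_agents(url, agents, fetch=http_status):
    """Return the agents that receive a non-2xx status for the URL."""
    return [agent for agent in agents if not 200 <= fetch(url, agent) < 300]


def fake_waf(url: str, user_agent: str) -> int:
    # Simulates a rule set that only lets Googlebot through.
    return 200 if "Googlebot" in user_agent else 403


agents = ["Googlebot", "ExampleIndexBot", "Claude-SearchBot"]
print(blocked_agents("https://example.com/pricing", agents, fetch=fake_waf))
# ['ExampleIndexBot', 'Claude-SearchBot']
```

Dropping the `fetch=fake_waf` argument runs the same comparison against a live URL.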

8. Query-rewriting models: optimise for what they ask

Users do not search the web directly inside Search mode. The model translates the messy prompt into a cleaner retrieval query. Anthropic's response examples show Claude deciding to search and issuing a separate, highly specific query before returning citations.

You must optimise pages for the query the model is likely to ask ("best fibre broadband Helsinki installation time") rather than only the brand term a human typed. Build a query-rewrite corpus: map user prompts to model search queries, and align your section headings to answer those factual reformulations directly.
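
A query-rewrite corpus can start as a plain mapping plus a coverage score. In the sketch below, the prompts, rewritten queries, and headings are hypothetical, and token overlap is an assumed proxy for "this heading answers that query"; it is not how any vendor ranks results.

```python
import re

# Hypothetical corpus: user prompt -> query a model might issue.
REWRITES = {
    "how long does it take to get fibre installed in helsinki?":
        "best fibre broadband Helsinki installation time",
    "is acme fibre any good":
        "Acme Fibre broadband reviews reliability",
}

HEADINGS = [
    "Fibre broadband installation time in Helsinki",
    "Our story",
]


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_heading(query: str, headings: list[str]) -> tuple[str, float]:
    """Return the heading that covers the largest share of the query's tokens."""
    query_tokens = tokens(query)
    scored = [
        (heading, len(query_tokens & tokens(heading)) / len(query_tokens))
        for heading in headings
    ]
    return max(scored, key=lambda pair: pair[1])


for prompt, query in REWRITES.items():
    heading, coverage = best_heading(query, HEADINGS)
    print(f"{query!r} -> {heading!r} (coverage {coverage:.2f})")
```

Queries whose best coverage is low point to sections that need a heading written for the reformulated question.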

9. Anti-patterns

The most common failure in Search mode is policy drift: your SEO team allows Googlebot, your security team blocks unfamiliar agents, your content team adds FAQ sludge, and your schema drifts away from the visible copy.

Blocking AI search bots while allowing Googlebot: Blocking OAI-SearchBot or Claude-SearchBot can remove you from their candidate paths once the vendor's crawler and robots systems apply the change (OpenAI documents a ~24-hour robots refresh).
Conflating training and search: Treating GPTBot as the ChatGPT Search crawler is a fundamental error. GPTBot is OpenAI's training crawler; OAI-SearchBot is the search crawler.
Treating Google-Extended as an HTTP UA: It is a robots.txt product token controlling training and grounding, not an HTTP user-agent string you can block in a WAF.
Keyword-overloaded headings: Headings that do not answer a concrete, rewritable retrieval query give the retriever nothing to match the chunk against.
Updating dateModified without changing facts: False freshness is not a durable strategy; keep dateModified tied to material factual changes.

10. Measurement is the next step

Optimising Search mode is only half the work. Once the crawl path is clean, measure it. Prove which bots can reach which URLs, which rewritten queries retrieve them, and which citations appear after the crawl.

Read the companion post to turn these fixes into dashboards: bot coverage, citation share, query rewrites, and URL-level retrieval failures.

Paired post: Measuring Search
