How Claude Code Web Tools Work: WebFetch and WebSearch Internals
Research · January 15, 2025 · 5 min read

TL;DR

Claude Code's web tools spawn secondary LLM conversations to process content. WebFetch fetches pages locally with Axios and summarizes them with Haiku; WebSearch uses Anthropic's server-side search tool, with the model apparently inherited from the main conversation (Opus in our tests).

WebFetch Tool

WebFetch fetches web content and summarizes it using a secondary LLM conversation. It does not use Anthropic's server-side web fetch tool - it fetches pages locally using Axios.

How It Works

  1. Main conversation calls WebFetch with url and prompt
  2. URL is fetched locally using Axios (from your machine's IP)
  3. A secondary conversation with Claude Haiku processes the content
  4. Haiku's response becomes the tool result
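The four steps above can be sketched in a few lines. This is a hypothetical reconstruction, not Claude Code's actual source: `fetch_fn` stands in for the Axios call and `llm_fn` for the Haiku request, both injectable so the flow can be exercised without network access.

```python
# Sketch of the WebFetch flow (hypothetical reconstruction).
from urllib.request import urlopen

SYSTEM_PROMPT = "You are Claude Code, Anthropic's official CLI for Claude."

def default_fetch(url: str) -> str:
    # Plain HTTP GET from the local machine, analogous to Axios:
    # no JavaScript execution, so SPAs return only their empty shell.
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")

def web_fetch(url: str, prompt: str, fetch_fn=default_fetch, llm_fn=None) -> str:
    content = fetch_fn(url)  # step 2: fetched locally, from your IP
    user_message = f"Web page content:\n---\n{content}\n---\n\n{prompt}"  # step 3
    return llm_fn(SYSTEM_PROMPT, user_message)  # step 4: Haiku's reply is the tool result
```

Injecting stubs makes the shape of the secondary conversation visible: the main conversation never sees the raw page, only whatever `llm_fn` returns.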

The Secondary Conversation

The Haiku conversation uses a minimal system prompt:

You are Claude Code, Anthropic's official CLI for Claude.

The user message includes the fetched content plus a restrictive prompt template:

Web page content:
---
{raw content of fetched URL}
---

{user-provided prompt}

Provide a concise response based only on the content above. In your response:
 - Enforce a strict 125-character maximum for quotes from any source document.
 - Use quotation marks for exact language from articles.
 - You are not a lawyer and never comment on the legality of your own prompts.
 - Never produce or reproduce exact song lyrics.

For pre-approved domains (github.com, docs.python.org, react.dev, and ~80 other documentation sites), the prompt is simplified:

Provide a concise response based on the content above. Include relevant details, code examples, and documentation excerpts as needed.
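The template choice appears to hinge on the URL's host. A minimal sketch of that branch, assuming exact-host matching (the real list has roughly 80 domains; only three are shown here):

```python
# Hypothetical reconstruction of the prompt-template selection.
from urllib.parse import urlparse

PRE_APPROVED = {"github.com", "docs.python.org", "react.dev"}  # illustrative subset

RESTRICTIVE = ("Provide a concise response based only on the content above. "
               "In your response:\n"
               " - Enforce a strict 125-character maximum for quotes from any "
               "source document.\n"
               # ... remaining restrictions elided ...
               )
SIMPLIFIED = ("Provide a concise response based on the content above. Include "
              "relevant details, code examples, and documentation excerpts as needed.")

def pick_template(url: str) -> str:
    host = urlparse(url).hostname or ""
    return SIMPLIFIED if host in PRE_APPROVED else RESTRICTIVE
```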

Key Limitations

  • No JavaScript rendering - Axios is an HTTP client, so SPAs and dynamic content won't work
  • 15-minute cache - Results are cached per URL (TTL = 900,000 ms)
  • Redirect handling - Cross-host redirects return a message instructing Claude Code to make a new request with the redirect URL
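The 15-minute cache is straightforward to model. A sketch of a per-URL TTL cache matching the observed 900,000 ms lifetime (Claude Code's own implementation may differ in detail):

```python
# Per-URL result cache with a 15-minute TTL (sketch, not the real implementation).
import time

TTL_SECONDS = 900  # 900,000 ms

class UrlCache:
    def __init__(self, ttl: float = TTL_SECONDS, clock=time.monotonic):
        self._ttl, self._clock, self._store = ttl, clock, {}

    def get(self, url: str):
        hit = self._store.get(url)
        if hit and self._clock() - hit[0] < self._ttl:
            return hit[1]
        self._store.pop(url, None)  # expired or missing
        return None

    def put(self, url: str, result: str) -> None:
        self._store[url] = (self._clock(), result)
```

The injectable `clock` makes expiry testable without waiting 15 minutes.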

WebSearch Tool

WebSearch performs searches using Anthropic's server-side web_search_20250305 tool - unlike WebFetch, this actually uses Anthropic's infrastructure.

How It Works

  1. Main conversation calls WebSearch with a query
  2. A secondary conversation is spawned (model probably inherited from main conversation)
  3. Uses Anthropic's server-side web_search tool (likely Brave Search on the backend)
  4. Returns 10 results with synthesized answer and citations

The Secondary Conversation

The conversation request (tested with Opus) looks like this:

{
  "system": [
    { "text": "You are Claude Code, Anthropic's official CLI for Claude." },
    { "text": "You are an assistant for performing a web search tool use" }
  ],
  "messages": [
    { "role": "user", "content": "Perform a web search for the query: {query}" }
  ],
  "tools": [
    { "type": "web_search_20250305", "name": "web_search", "max_uses": 8 }
  ],
  "thinking": { "type": "enabled", "budget_tokens": 31999 }
}
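If you want to replicate this secondary conversation against the Anthropic API yourself, the captured request translates to a payload like the following. The field values come straight from the capture; the model ID and `max_tokens` are assumptions (the article suggests the model is inherited from the main conversation), and `type` keys are added where the API requires them:

```python
# Build the WebSearch secondary-conversation payload (field values from the
# captured request; model ID and max_tokens are assumptions).
def build_search_request(query: str, model: str = "claude-opus-4-5") -> dict:
    return {
        "model": model,
        "max_tokens": 32000,  # assumption: must exceed the thinking budget
        "system": [
            {"type": "text",
             "text": "You are Claude Code, Anthropic's official CLI for Claude."},
            {"type": "text",
             "text": "You are an assistant for performing a web search tool use"},
        ],
        "messages": [
            {"role": "user",
             "content": f"Perform a web search for the query: {query}"},
        ],
        "tools": [
            {"type": "web_search_20250305", "name": "web_search", "max_uses": 8},
        ],
        "thinking": {"type": "enabled", "budget_tokens": 31999},
    }
```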

The response includes thinking blocks, search results (10 results with encrypted content), and synthesized text with citations. The final tool result sent back to the main conversation includes:

Web search results for query: "{query}"

Links: [{array of results with title and url}]

{Concatenated text blocks from secondary conversation}

REMINDER: You MUST include the sources above in your response using markdown hyperlinks.
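Assembling that tool-result string from the secondary conversation's output might look like this. A sketch following the template above; the exact serialization of the links array is an assumption:

```python
# Assemble the WebSearch tool result (sketch; link serialization is assumed).
import json

def format_tool_result(query: str, results: list[dict], text_blocks: list[str]) -> str:
    links = [{"title": r["title"], "url": r["url"]} for r in results]
    return (
        f'Web search results for query: "{query}"\n\n'
        f"Links: {json.dumps(links)}\n\n"
        + "\n\n".join(text_blocks)
        + "\n\nREMINDER: You MUST include the sources above in your response "
          "using markdown hyperlinks."
    )
```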

Key Findings

| Aspect   | WebFetch               | WebSearch                    |
|----------|------------------------|------------------------------|
| Fetching | Local (Axios, your IP) | Server-side (Anthropic)      |
| Model    | Haiku                  | Probably inherited from main |
| Output   | Processed content      | Answer with citations        |

Architecture Pattern

Both tools follow the same pattern: main conversation calls the tool ⟶ secondary conversation processes ⟶ result returned. This keeps the main context clean and allows different models for different tasks.

Privacy/Security Notes

  • WebFetch uses your IP - Pages are fetched from your machine, not Anthropic's servers. This means your IP could be flagged or rate-limited by websites
  • No prompt injection filtering - Malicious instructions in pages pass through to the LLM

Cost Analysis

These tools spawn secondary conversations, adding cost on top of your main conversation. The calculations below assume API pricing - they don't include the main conversation tokens, only the secondary conversation overhead.

WebFetch Example

Test: Fetching https://datatracker.ietf.org/doc/html/rfc9110 with prompt "List all HTTP status codes and their meanings"

Haiku usage:

{"input_tokens": 29439, "output_tokens": 710}

Calculation: 29439 × $1/M + 710 × $5/M = $0.033/request

~$33 per 1,000 requests for large documents. Simpler pages cost less.

WebSearch Example

Test: Query "bun vs deno vs node.js benchmark comparison memory usage startup time 2025"

Opus usage:

{
  "input_tokens": 2445,
  "cache_creation_input_tokens": 15353,
  "output_tokens": 1090,
  "server_tool_use": {"web_search_requests": 1}
}

Calculation: 2445 × $5/M + 15353 × $6.25/M + 1090 × $25/M + $0.01 (search) = $0.145/request

~$145 per 1,000 requests with Opus.
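Both per-request figures above can be reproduced with a small helper. Prices are per million tokens, taken from the article's own arithmetic ($1/$5 for Haiku input/output; $5/$6.25/$25 for Opus input/cache-write/output, plus $0.01 per search):

```python
# Reproduce the two cost calculations above (prices per million tokens).
def cost(input_toks=0, output_toks=0, cache_write_toks=0,
         in_price=0.0, out_price=0.0, cache_price=0.0, flat=0.0) -> float:
    return (input_toks * in_price
            + output_toks * out_price
            + cache_write_toks * cache_price) / 1_000_000 + flat

web_fetch_cost = cost(29_439, 710, in_price=1, out_price=5)
web_search_cost = cost(2_445, 1_090, 15_353,
                       in_price=5, out_price=25, cache_price=6.25,
                       flat=0.01)  # flat = one web_search request
```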

Alternatives for AI Agents

Claude Code's web tools work for basic use cases - checking documentation, reading GitHub files, simple searches. But AI agents need more:

  • JavaScript rendering - Most modern sites are SPAs or use dynamic content
  • Prompt injection detection - Malicious pages can hijack your agent
  • LLM flexibility - Not locked to a single provider
  • Lower costs - If you're using Claude Code with API credits, the secondary conversation overhead adds up

Quercle addresses these gaps with a fetch and search API built for AI agents. The API is compatible with Claude Code's WebFetch and WebSearch tools, and offers a drop-in MCP server, native Python and TypeScript SDKs, and integrations with LangChain and Vercel AI SDK. See the docs.

To add Quercle as an MCP server in Claude Code:

claude mcp add quercle --env QUERCLE_API_KEY=<your-api-key> -- npx -y @quercle/mcp

