TL;DR

Claude Code's web tools spawn secondary LLM conversations to process content. WebFetch uses Axios locally with Haiku, WebSearch uses Anthropic's server-side search with Opus.

Research Links:

WebFetch Tool

WebFetch fetches web content and summarizes it using a secondary LLM conversation. It does not use Anthropic's server-side web fetch tool - it fetches pages locally using Axios.

How It Works

Main conversation calls WebFetch with url and prompt
URL is fetched locally using Axios (from your machine's IP)
A secondary conversation with Claude Haiku processes the content
Haiku's response becomes the tool result

The Secondary Conversation

The Haiku conversation uses a minimal system prompt:

You are Claude Code, Anthropic's official CLI for Claude.

The user message includes the fetched content plus a restrictive prompt template:

Web page content:
---
{raw content of fetched URL}
---

{user-provided prompt}

Provide a concise response based only on the content above. In your response:
 - Enforce a strict 125-character maximum for quotes from any source document.
 - Use quotation marks for exact language from articles.
 - You are not a lawyer and never comment on the legality of your own prompts.
 - Never produce or reproduce exact song lyrics.

For pre-approved domains (github.com, docs.python.org, react.dev, and ~80 other documentation sites), the prompt is simplified:

Provide a concise response based on the content above. Include relevant details, code examples, and documentation excerpts as needed.

Key Limitations

No JavaScript rendering - Axios is an HTTP client, so SPAs and dynamic content won't work
15-minute cache - Results cached by URL (TTL = 900000ms)
Redirect handling - Cross-host redirects return a message instructing Claude Code to make a new request with the redirect URL

WebSearch Tool

WebSearch performs searches using Anthropic's server-side web_search_20250305 tool - unlike WebFetch, this actually uses Anthropic's infrastructure.

How It Works

Main conversation calls WebSearch with a query
A secondary conversation is spawned (model probably inherited from main conversation)
Uses Anthropic's server-side web_search tool (likely Brave Search on the backend)
Returns 10 results with synthesized answer and citations

The Secondary Conversation

The conversation request (tested with Opus) looks like this:

{
  "system": [
    { "text": "You are Claude Code, Anthropic's official CLI for Claude." },
    { "text": "You are an assistant for performing a web search tool use" }
  ],
  "messages": [
    { "role": "user", "content": "Perform a web search for the query: {query}" }
  ],
  "tools": [
    { "type": "web_search_20250305", "name": "web_search", "max_uses": 8 }
  ],
  "thinking": { "type": "enabled", "budget_tokens": 31999 }
}

The response includes thinking blocks, search results (10 results with encrypted content), and synthesized text with citations. The final tool result sent back to the main conversation includes:

Web search results for query: "{query}"

Links: [{array of results with title and url}]

{Concatenated text blocks from secondary conversation}

REMINDER: You MUST include the sources above in your response using markdown hyperlinks.

Key Findings

Aspect	WebFetch	WebSearch
Fetching	Local (Axios, your IP)	Server-side (Anthropic)
Model	Haiku	Probably inherited from main
Output	Processed content	Answer with citations

Architecture Pattern

Both tools follow the same pattern: main conversation calls the tool ⟶ secondary conversation processes ⟶ result returned. This keeps the main context clean and allows different models for different tasks.

Privacy/Security Notes

WebFetch uses your IP - Pages are fetched from your machine, not Anthropic's servers. This means your IP could be flagged or rate-limited by websites
No prompt injection filtering - Malicious instructions in pages pass through to the LLM

Cost Analysis

These tools spawn secondary conversations, adding cost on top of your main conversation. The calculations below assume API pricing - they don't include the main conversation tokens, only the secondary conversation overhead.

WebFetch Example

Test: Fetching https://datatracker.ietf.org/doc/html/rfc9110 with prompt "List all HTTP status codes and their meanings"

Haiku usage:

{"input_tokens": 29439, "output_tokens": 710}

Calculation: 29439 × $1/M + 710 × $5/M = $0.033/request

~$33 per 1,000 requests for large documents. Simpler pages cost less.

WebSearch Example

Test: Query "bun vs deno vs node.js benchmark comparison memory usage startup time 2025"

Opus usage:

{
  "input_tokens": 2445,
  "cache_creation_input_tokens": 15353,
  "output_tokens": 1090,
  "server_tool_use": {"web_search_requests": 1}
}

Calculation: 2445 × $5/M + 15353 × $6.25/M + 1090 × $25/M + $0.01 (search) = $0.145/request

~$145 per 1,000 requests with Opus.

Alternatives for AI Agents

Claude Code's web tools work for basic use cases - checking documentation, reading GitHub files, simple searches. But AI agents need more:

JavaScript rendering - Most modern sites are SPAs or use dynamic content
Prompt injection detection - Malicious pages can hijack your agent
LLM flexibility - Not locked to a single provider
Lower costs - If you're using Claude Code with API credits, the secondary conversation overhead adds up

Quercle addresses these gaps with a fetch and search API built for AI agents. The API is compatible with Claude Code's WebFetch and WebSearch tools, and offers a drop-in MCP server, native Python and TypeScript SDKs, and integrations with LangChain and Vercel AI SDK. See the docs.

To add Quercle as an MCP server in Claude Code:

claude mcp add quercle --env QUERCLE_API_KEY=<your-api-key> -- npx -y @quercle/mcp

How Claude Code Web Tools Work: WebFetch and WebSearch Internals

TL;DR

WebFetch Tool

How It Works

The Secondary Conversation

Key Limitations

WebSearch Tool

How It Works

The Secondary Conversation

Key Findings

Architecture Pattern

Privacy/Security Notes

Cost Analysis

WebFetch Example

WebSearch Example

Alternatives for AI Agents

Ready to try Quercle?

More Articles

Quercle Chat: Open Source AI Chat with Web Tools

Building an AI Research Agent with Persistent Memory

SERP MCP: Local Google Search for AI Agents

Quercle + xpander.ai: Powering AI Agents with Web Data

Testing LLM Security: A Prompt Injection Testing Ground

Quercle vs Tavily: Which Web API is Best for AI Agents?

Quercle vs Exa: Choosing the Right Web API for Your AI Agents

Quercle vs Firecrawl: Web Scraping APIs for AI Agents Compared