Agentic Web Search & URL Fetching 🌐

Open WebUI's web search has evolved from simple result injection into a fully agentic research system. By enabling Native Function Calling (Agentic Mode), you allow quality models to explore the web, verify facts, and follow links autonomously.

Quality Models Required

Agentic web search works best with frontier models like GPT-5, Claude 4.5+, Gemini 3+, or MiniMax M2.1 that can reason about search results and decide when to dig deeper. Small local models may struggle with the multi-step reasoning required.

Central Tool Documentation

For comprehensive information about all built-in agentic tools (including web search, knowledge bases, memory, and more), see the Native/Agentic Mode Tools Guide.

Native Mode vs. Traditional RAG

| Feature | Traditional RAG (Default) | Agentic Search (Native Mode) |
| --- | --- | --- |
| Search Decision | Open WebUI decides based on prompt analysis. | The model decides if and when it needs to search. |
| Data Processing | Fetches all results, chunks them, and performs RAG. | Returns snippets directly; no chunking or Vector DB. |
| Link Following | Snippets from top results are injected. | Model uses fetch_url to read a full page directly. |
| Model Context | Only gets relevant fragments (top-K chunks). | Gets the whole text (up to ~50k chars) via fetch_url. |
| Reasoning | Model processes data after system injection. | Model can search, read, check, and search again. |

How to Enable Agentic Behavior

To unlock these features, your model must support native tool calling and have strong reasoning capabilities (e.g., GPT-5, Claude 4.5 Sonnet, Gemini 3 Flash, MiniMax M2.1). Administrator-level configuration for these built-in system tools is handled via the Central Tool Calling Guide.

  1. Enable Web Search: Ensure a search engine is configured in Admin Panel > Settings > Web Search.
  2. Enable Native Mode (Agentic Mode):
    • Go to Admin Panel > Settings > Models.
    • Navigate to Model Specific Settings for your target model.
    • Under Advanced Parameters, set Function Calling to Native.
  3. Use a Quality Model: Ensure you're using a frontier model with strong reasoning capabilities for best results.
  4. Chat Features: Ensure the Web Search feature is toggled ON for your chat session.
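
Once Native Mode is on, Open WebUI advertises its built-in tools to the model as callable functions. As a rough illustration (the exact schema Open WebUI sends is an internal detail; this simply follows the common OpenAI-style function-calling format), the declarations for the two web tools might look like this:

```python
# Illustrative only: OpenAI-style declarations for the two built-in web tools.
# The exact schema Open WebUI sends to the model is an internal detail.
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Query the configured search engine; returns "
                           "titles, links, and snippets.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "fetch_url",
            "description": "Fetch a URL and return its main text "
                           "(truncated at 50,000 characters).",
            "parameters": {
                "type": "object",
                "properties": {"url": {"type": "string"}},
                "required": ["url"],
            },
        },
    },
]
```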

How Native Tools Handle Data (Agentic Mode)

🔗 It is important to understand that Native Mode (Agentic Mode) works fundamentally differently from the global "Web Search" toggle used by models in the default (traditional RAG) mode.

search_web (Snippets only)

When the model invokes search_web:

  • Action: It queries your search engine and receives a list of titles, links, and snippets.
  • No RAG: Unlike traditional search, no data is stored in a Vector DB. No chunking or embedding occurs.
  • Result: The model sees exactly what a human sees on a search results page. If the snippet contains the answer, the model responds. If not, the model must decide to "deep dive" into a link.
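
To make the snippets-only behavior concrete, here is a hypothetical shape of what search_web hands back to the model. The field names and contents are illustrative, not Open WebUI's exact payload:

```python
# Hypothetical search_web result: plain snippets passed straight to the model.
# No chunking, no embeddings, and nothing is written to a vector DB.
results = [
    {
        "title": "React Security Advisory 2026",
        "link": "https://example.com/react-security-advisory-2026",
        "snippet": "A recently disclosed XSS issue affects several packages "
                   "in the React ecosystem ...",
    },
    # ...more results, exactly what a human sees on a results page
]
```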

fetch_url (Full Page Context)

If the model determines that a search snippet is insufficient, it will call fetch_url:

  • Direct Access: The tool visits the specific URL and extracts the main text using your configured Web Loader.
  • Raw Context: The extracted text is injected directly into the model's context window (hard-coded truncation at exactly 50,000 characters to prevent context overflow).
  • Agentic Advantage: Because it doesn't use RAG, the model has the "full picture" of the page rather than isolated fragments. This allows it to follow complex instructions on specific pages (e.g., "Summarize the technical specifications table from this documentation link").
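
Conceptually, fetch_url boils down to "download, extract the main text, truncate." The sketch below is a minimal stand-in, assuming requests and BeautifulSoup in place of your configured Web Loader; it is not Open WebUI's actual implementation:

```python
import requests
from bs4 import BeautifulSoup  # stand-in for your configured Web Loader

MAX_CHARS = 50_000  # mirrors the truncation limit described above

def fetch_url(url: str) -> str:
    """Illustrative sketch: fetch a page, extract its text, truncate."""
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    text = BeautifulSoup(resp.text, "html.parser").get_text(
        separator="\n", strip=True
    )
    return text[:MAX_CHARS]  # at most 50k characters enter the context window
```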
Tip

By keeping search_web and fetch_url separate and RAG-free, the model acts as its own Information Retrieval agent, choosing exactly which sources are worth reading in full.

Deep Research & Interleaved Thinking 🧠

Because the model can call search_web multiple times and decide autonomously when to dive deeper, it can perform genuine "Deep Research" using Interleaved Thinking. This creates a powerful research loop where the model acts as its own research assistant.

How Interleaved Thinking Works

Interleaved Thinking is the ability of a model to alternate between reasoning and action in a continuous cycle. Instead of searching once and answering, the model can think, search, read, and think again, as the following walkthrough illustrates:

📖 Detailed Research Cycle Example

User asks: "What are the latest security vulnerabilities in the React ecosystem?"

Step 1: Initial Analysis 🤔

  • Model thinks: "I need current information about React security. Let me start with a broad search."

Step 2: First Search 🔍

Model calls: search_web("React security vulnerabilities 2026")
  • Receives snippets from 5-10 search results
  • Sees references to npm packages and CVE databases

Step 3: Gap Analysis 🤔

  • Model thinks: "The snippets mention CVEs but don't have enough detail. I should fetch a specific security advisory page."

Step 4: Targeted Fetch 📄

Model calls: fetch_url("https://example.com/react-security-advisory-2026")
  • Reads the full page text (up to 50,000 characters) of detailed security documentation
  • Identifies specific vulnerability types (XSS, injection, etc.)

Step 5: Refinement 🤔

  • Model thinks: "I found server-side vulnerabilities. Let me search specifically for client-side issues."

Step 6: Refined Search 🔍

Model calls: search_web("React client-side XSS CVE 2026")
  • Gets more specific results about cross-site scripting issues
  • Identifies mitigation strategies

Step 7: Synthesis 🤔

  • Model thinks: "I now have comprehensive information from multiple authoritative sources. Time to synthesize."

Step 8: Final Answer ✅

  • Provides detailed, grounded response citing multiple sources
  • Includes specific CVE numbers, affected versions, and mitigation steps
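
In OpenAI-style message terms, the first half of this cycle (steps 1-4) might produce a trace like the one below. The call IDs, arguments, and snippet text are invented for illustration:

```python
# Hypothetical message trace for steps 1-4; all contents are invented.
trace = [
    {"role": "user",
     "content": "What are the latest security vulnerabilities "
                "in the React ecosystem?"},
    {"role": "assistant", "content": None, "tool_calls": [{
        "id": "call_1", "type": "function",
        "function": {"name": "search_web",
                     "arguments": '{"query": "React security vulnerabilities 2026"}'}}]},
    {"role": "tool", "tool_call_id": "call_1",
     "content": '[{"title": "React Security Advisory 2026", '
                '"link": "https://example.com/react-security-advisory-2026", '
                '"snippet": "A recently disclosed XSS issue ..."}]'},
    {"role": "assistant", "content": None, "tool_calls": [{
        "id": "call_2", "type": "function",
        "function": {"name": "fetch_url",
                     "arguments": '{"url": "https://example.com/react-security-advisory-2026"}'}}]},
]
```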

The Agentic Research Loop

The model continuously cycles through these phases until it has sufficient information:

  1. 🤔 THINK: Analyze current knowledge gaps and determine what information is missing
  2. 🔍 ACT: Search the web or fetch specific URLs to gather relevant content
  3. 📊 EVALUATE: Assess the quality and completeness of the information retrieved
  4. ❓ DECIDE: Determine if more research is needed or if enough context has been gathered
  5. 🔄 ITERATE: If gaps remain, return to step 1 with refined focus and more specific queries
  6. ✅ SYNTHESIZE: Once sufficient information is gathered, compile and present the final answer

This cycle repeats autonomously until the model has comprehensive, verified information to answer your question with high confidence.
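
As a minimal sketch, the whole loop can be expressed in a few lines, assuming an OpenAI-style Python client and the tool declarations from earlier. Here client, execute_tool, and max_steps are placeholders, not Open WebUI internals:

```python
import json

def research_loop(client, model, messages, tools, max_steps=8):
    """Minimal THINK -> ACT -> EVALUATE loop around a tool-calling model."""
    for _ in range(max_steps):
        reply = client.chat.completions.create(
            model=model, messages=messages, tools=tools
        )
        msg = reply.choices[0].message
        if not msg.tool_calls:       # SYNTHESIZE: model chose to answer
            return msg.content
        messages.append(msg)         # keep the model's tool request in history
        for call in msg.tool_calls:  # ACT: run search_web / fetch_url
            args = json.loads(call.function.arguments)
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                # execute_tool is a placeholder for your tool dispatcher
                "content": execute_tool(call.function.name, args),
            })
    return "Stopped after max_steps without a final answer."
```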

Key Advantages

🎯 Adaptive Precision: The model doesn't just search once and accept whatever results appear. Instead, it continuously refines its search strategy based on what it discovers. If initial broad searches return surface-level information, the model automatically pivots to more specific technical terms, product names, version numbers, or specialized terminology. Each iteration becomes progressively more targeted, drilling down from general concepts to specific details, ensuring the final answer is both comprehensive and precise.

🔗 Deep Link Following & Discovery: Unlike traditional RAG systems that only use search result snippets, the model can read full pages when snippets aren't sufficient. Even more powerfully, when the model uses fetch_url to read a page, it can discover and follow new URLs mentioned within that content. For example, if a fetched page references a technical specification document, an official changelog, or a related research paper, the model can autonomously call fetch_url again on those discovered URLs to dive even deeper. This creates a natural "web browsing" behavior where the model follows citation chains, explores linked resources, and builds a comprehensive understanding by reading multiple interconnected sources, just like a human researcher would.

✅ Fact Verification & Cross-Referencing: The model can autonomously verify information by cross-referencing multiple independent sources. If one source makes a claim, the model can search for corroborating evidence from authoritative sources, compare version numbers across official documentation, or validate facts against primary sources. This multi-source verification significantly reduces hallucination and increases answer reliability, as the model builds confidence by finding consistent information across diverse, credible sources.

🧩 Intelligent Gap Filling: If initial search results miss key information or only partially address the question, the model identifies these gaps and automatically conducts follow-up searches with different terms, alternative phrasings, or more specific queries. For example, if searching for "React performance issues" doesn't yield information about a specific optimization technique, the model might refine its search to "React useMemo optimization" or "React.memo vs useMemo comparison" to fill the knowledge gap. This ensures comprehensive coverage of complex topics that might require multiple search angles.

🌐 Multi-Source Synthesis: The model doesn't just return information from a single source; it synthesizes insights from multiple web pages, documentation sites, forums, and articles into a coherent, well-rounded answer. This synthesis provides broader context, acknowledges different perspectives, and presents a more complete picture than any single source could provide.

📚 Context-Aware Source Selection: The model intelligently decides whether to rely on search snippets (when they contain sufficient information) or to fetch full pages (when deeper detail is needed). It can also determine when to stop researching, avoiding unnecessary tool calls while ensuring thoroughness. This balance between efficiency and comprehensiveness makes agentic search both fast and reliable.

This iterative loop of Thought → Action → Thought continues until the model has sufficient information to answer your request with maximum accuracy.

Learn More About Interleaved Thinking

For more details on how Interleaved Thinking works across all agentic tools (not just web search), see the Interleaved Thinking Guide.

Next Steps