Skip to main content

🔌 Extensibility

Make Open WebUI do anything with Python, HTTP, or community plugins you install in one click.

Open WebUI ships with powerful defaults, but your workflows aren't default. Extensibility is how you close the gap: give models real-time data, enforce compliance rules, add new AI providers, or connect to any external service. Write a few lines of Python, point at an OpenAPI endpoint, or browse the community library. The platform adapts to you, not the other way around.

There are two layers, and most teams end up using both:

  • In-process Python (Tools & Functions) runs inside Open WebUI itself with zero infrastructure and instant iteration.
  • External HTTP (OpenAPI & MCP servers) connects to services running anywhere, from a sidecar container to a third-party SaaS.
Pipelines are legacy

You may still see Pipelines referenced as a third layer. It is legacy and no longer recommended — the heavy-processing problem it solved no longer exists (see Run heavy or long-running work below). Use Functions, Tools, or an external tool server instead. See the Pipelines pages for the full deprecation notice.


Which Extension Do I Need?

The names don't always map obviously to what they do. Start from what you're trying to accomplish:

I want to...UseWhy this one
Let the model call an API or perform an action (and keep a secret/API key the user and model can never read)ToolThe key lives inside the tool, server-side. The model only sees the result, never the credential.
Add a new model or provider to the model selectorPipe FunctionA Pipe appears as a selectable "model" and handles the request however you like.
Modify messages going in or out (redact PII, inject system text, log, translate)Filter FunctionFilters run on every message via inlet/outlet/stream without touching model config.
Add a button on a message that runs custom codeAction FunctionActions are user-triggered, per-message operations.
Teach the model how to approach a task (methodology, steps, house style)SkillSkills are instructions, not code. The model reads them; they don't execute anything.
Give the model documents to retrieve fromKnowledgeRAG over your files, attached to a model or referenced with #.
Save a reusable prompt behind a slash commandPromptTemplated text with typed variables; expands when you type /name.
Connect an existing external service that already speaks HTTPOpenAPI / MCP serverPoint Open WebUI at the spec; endpoints become callable tools. No glue code.
"Pipe" vs "Pipeline" — not the same thing

This is the single most common naming mix-up. A Pipe is a type of Function (in-process Python, adds a provider to the model list). A Pipeline is a separate external worker container. They share a prefix and nothing else. If you want to add a model provider, you almost always want a Pipe Function, not a Pipeline.


Why Extensibility?

Give models real-world abilities

Out of the box, an LLM can only work with what's in its training data and your conversation. Tools let it reach out: check the weather, query a database, call an API, run a calculation. The model decides when to use a tool based on the conversation. You just make the capability available.

Connect any external service

Have an internal API? A third-party SaaS with an OpenAPI spec? An MCP server already running in your stack? Point Open WebUI at the spec and it discovers endpoints automatically, exposing them as tools the model can call. No glue code, no wrappers.

Control every message

Functions let you intercept and transform messages before they reach the model (input filters) or before they reach the user (output filters). Help redact PII, enforce formatting rules, log to an observability platform, inject system instructions dynamically, all without touching model configuration.

Run heavy or long-running work

Open WebUI's backend is fully async. Long-running Tools and Functions (awaiting an external API, a slow query, a multi-step agent) do not block other users, and synchronous/CPU-bound plugin code is offloaded to a worker thread pool (see THREAD_POOL_SIZE) — so it doesn't stall the event loop either. In practice you can run heavy work in-process without the latency problems that older synchronous releases had.

The historical reason to push heavy pipes/filters onto a separate Pipelines worker — keeping the single synchronous event loop unblocked — no longer applies. If you genuinely need GPU access, large or conflicting dependencies, hard isolation, or independent scaling, run that work as an external service behind an OpenAPI or MCP tool server, not a Pipeline.

Import from the community

Browse hundreds of community-built Tools and Functions from the Open WebUI Community site. Find what you need, click Import, and it's live. No pip install, no restart.


Key Features

🐍 ToolsPython scripts that give models new abilities: web search, API calls, code execution
⚙️ FunctionsPlatform extensions that add model providers (Pipes), message processing (Filters), or UI actions (Actions)
🔗 MCP supportNative Streamable HTTP for Model Context Protocol servers
🌐 OpenAPI serversAuto-discover and expose tools from any OpenAPI-compatible endpoint
📝 SkillsMarkdown instruction sets that teach models how to approach specific tasks
PromptsSlash-command templates with typed input variables and versioning
🏪 Community libraryOne-click import of community-built Tools and Functions

Architecture at a Glance

Understanding which layer to use saves time:

LayerRuns whereBest forTrade-off
Tools & FunctionsInside Open WebUI processReal-time data, filters, UI actions, new providers — including heavy/long-running work (the async backend keeps it from blocking)Shares CPU/RAM with the main server
OpenAPI / MCPAny HTTP endpointConnecting existing services, third-party APIs, and GPU / heavy-dependency / isolated workloadsRequires a running external server

Most users start with Tools & Functions. They require no extra setup, have a built-in code editor, and cover the majority of use cases. (Pipelines is a legacy third option, no longer recommended — see the note above.)


Use Cases

Real-time data enrichment

A sales team builds a Tool that queries their CRM API. When a rep asks "What's the latest on the Acme deal?", the model calls the tool, retrieves the pipeline stage, last activity, and deal value, and synthesizes a briefing with live data, not stale training knowledge.

Enterprise compliance filters

A healthcare organization deploys a Filter Function that scans outbound messages for PHI patterns (SSN, MRN, dates of birth). Matches are redacted before the response reaches the user, and the original is logged to their SIEM. No model configuration changes required. The filter runs transparently on every conversation. (This is an illustrative example. Regex-based filtering may not catch all sensitive data patterns. Organizations with compliance requirements should validate filter coverage independently.)

Multi-provider model routing

An engineering team uses Pipe Functions to add Anthropic, Google Vertex AI, and a self-hosted vLLM instance alongside their existing Ollama models. Users see all providers in a single model selector with no separate logins and no API key juggling.

GPU-bound external processing

A research group needs to re-rank retrieval results with a cross-encoder model that requires a GPU. They run it as a small service on a dedicated GPU node and expose it to Open WebUI as an OpenAPI tool server. The model calls it like any other tool while the main instance stays on commodity hardware. (The async backend means lighter custom logic can simply run in-process as a Function — only the GPU dependency pushes this particular workload to a separate service.)


Limitations

Security

Tools and Functions execute arbitrary Python code on your server. Only install extensions from trusted sources, review code before importing, and restrict Workspace access to administrators. See the Security Policy for details.

Resource sharing

In-process Tools and Functions share CPU and memory with Open WebUI. The async backend keeps long-running and blocking work from stalling other requests, but it does not create more hardware — genuinely CPU- or GPU-heavy workloads still compete for the same machine. For those, run the work as an external service behind an OpenAPI / MCP tool server so it scales independently.

MCP transport

Native MCP support is Streamable HTTP only. For stdio or SSE-based MCP servers, use mcpo as a translation proxy.


Dive Deeper

TopicWhat you'll learn
Tools & FunctionsWriting Python Tools, Functions (Pipes, Filters, Actions), and the development API
MCPConnecting Model Context Protocol servers, OAuth setup, troubleshooting
Pipelines (legacy)Reference only — the deprecated separate-worker framework, superseded by Functions and Tools
This content is for informational purposes only and does not constitute a warranty, guarantee, or contractual commitment. Open WebUI is provided "as is." See your license for applicable terms.