🔌 Extensibility
Make Open WebUI do anything with Python, HTTP, or community plugins you install in one click.
Open WebUI ships with powerful defaults, but your workflows aren't default. Extensibility is how you close the gap: give models real-time data, enforce compliance rules, add new AI providers, or connect to any external service. Write a few lines of Python, point at an OpenAPI endpoint, or browse the community library. The platform adapts to you, not the other way around.
There are two layers, and most teams end up using both:
- In-process Python (Tools & Functions) runs inside Open WebUI itself with zero infrastructure and instant iteration.
- External HTTP (OpenAPI & MCP servers) connects to services running anywhere, from a sidecar container to a third-party SaaS.
You may still see Pipelines referenced as a third layer. It is legacy and no longer recommended — the heavy-processing problem it solved no longer exists (see Run heavy or long-running work below). Use Functions, Tools, or an external tool server instead. See the Pipelines pages for the full deprecation notice.
Which Extension Do I Need?
The names don't always map obviously to what they do. Start from what you're trying to accomplish:
| I want to... | Use | Why this one |
|---|---|---|
| Let the model call an API or perform an action (and keep a secret/API key the user and model can never read) | Tool | The key lives inside the tool, server-side. The model only sees the result, never the credential. |
| Add a new model or provider to the model selector | Pipe Function | A Pipe appears as a selectable "model" and handles the request however you like. |
| Modify messages going in or out (redact PII, inject system text, log, translate) | Filter Function | Filters run on every message via inlet/outlet/stream without touching model config. |
| Add a button on a message that runs custom code | Action Function | Actions are user-triggered, per-message operations. |
| Teach the model how to approach a task (methodology, steps, house style) | Skill | Skills are instructions, not code. The model reads them; they don't execute anything. |
| Give the model documents to retrieve from | Knowledge | RAG over your files, attached to a model or referenced with #. |
| Save a reusable prompt behind a slash command | Prompt | Templated text with typed variables; expands when you type /name. |
| Connect an existing external service that already speaks HTTP | OpenAPI / MCP server | Point Open WebUI at the spec; endpoints become callable tools. No glue code. |
This is the single most common naming mix-up. A Pipe is a type of Function (in-process Python, adds a provider to the model list). A Pipeline is a separate external worker container. They share a prefix and nothing else. If you want to add a model provider, you almost always want a Pipe Function, not a Pipeline.
Why Extensibility?
Give models real-world abilities
Out of the box, an LLM can only work with what's in its training data and your conversation. Tools let it reach out: check the weather, query a database, call an API, run a calculation. The model decides when to use a tool based on the conversation. You just make the capability available.
Connect any external service
Have an internal API? A third-party SaaS with an OpenAPI spec? An MCP server already running in your stack? Point Open WebUI at the spec and it discovers endpoints automatically, exposing them as tools the model can call. No glue code, no wrappers.
Control every message
Functions let you intercept and transform messages before they reach the model (input filters) or before they reach the user (output filters). Help redact PII, enforce formatting rules, log to an observability platform, inject system instructions dynamically, all without touching model configuration.
Run heavy or long-running work
Open WebUI's backend is fully async. Long-running Tools and Functions (awaiting an external API, a slow query, a multi-step agent) do not block other users, and synchronous/CPU-bound plugin code is offloaded to a worker thread pool (see THREAD_POOL_SIZE) — so it doesn't stall the event loop either. In practice you can run heavy work in-process without the latency problems that older synchronous releases had.
The historical reason to push heavy pipes/filters onto a separate Pipelines worker — keeping the single synchronous event loop unblocked — no longer applies. If you genuinely need GPU access, large or conflicting dependencies, hard isolation, or independent scaling, run that work as an external service behind an OpenAPI or MCP tool server, not a Pipeline.
Import from the community
Browse hundreds of community-built Tools and Functions from the Open WebUI Community site. Find what you need, click Import, and it's live. No pip install, no restart.
Key Features
| 🐍 Tools | Python scripts that give models new abilities: web search, API calls, code execution |
| ⚙️ Functions | Platform extensions that add model providers (Pipes), message processing (Filters), or UI actions (Actions) |
| 🔗 MCP support | Native Streamable HTTP for Model Context Protocol servers |
| 🌐 OpenAPI servers | Auto-discover and expose tools from any OpenAPI-compatible endpoint |
| 📝 Skills | Markdown instruction sets that teach models how to approach specific tasks |
| ⚡ Prompts | Slash-command templates with typed input variables and versioning |
| 🏪 Community library | One-click import of community-built Tools and Functions |
Architecture at a Glance
Understanding which layer to use saves time:
| Layer | Runs where | Best for | Trade-off |
|---|---|---|---|
| Tools & Functions | Inside Open WebUI process | Real-time data, filters, UI actions, new providers — including heavy/long-running work (the async backend keeps it from blocking) | Shares CPU/RAM with the main server |
| OpenAPI / MCP | Any HTTP endpoint | Connecting existing services, third-party APIs, and GPU / heavy-dependency / isolated workloads | Requires a running external server |
Most users start with Tools & Functions. They require no extra setup, have a built-in code editor, and cover the majority of use cases. (Pipelines is a legacy third option, no longer recommended — see the note above.)
Use Cases
Real-time data enrichment
A sales team builds a Tool that queries their CRM API. When a rep asks "What's the latest on the Acme deal?", the model calls the tool, retrieves the pipeline stage, last activity, and deal value, and synthesizes a briefing with live data, not stale training knowledge.
Enterprise compliance filters
A healthcare organization deploys a Filter Function that scans outbound messages for PHI patterns (SSN, MRN, dates of birth). Matches are redacted before the response reaches the user, and the original is logged to their SIEM. No model configuration changes required. The filter runs transparently on every conversation. (This is an illustrative example. Regex-based filtering may not catch all sensitive data patterns. Organizations with compliance requirements should validate filter coverage independently.)
Multi-provider model routing
An engineering team uses Pipe Functions to add Anthropic, Google Vertex AI, and a self-hosted vLLM instance alongside their existing Ollama models. Users see all providers in a single model selector with no separate logins and no API key juggling.
GPU-bound external processing
A research group needs to re-rank retrieval results with a cross-encoder model that requires a GPU. They run it as a small service on a dedicated GPU node and expose it to Open WebUI as an OpenAPI tool server. The model calls it like any other tool while the main instance stays on commodity hardware. (The async backend means lighter custom logic can simply run in-process as a Function — only the GPU dependency pushes this particular workload to a separate service.)
Limitations
Security
Tools and Functions execute arbitrary Python code on your server. Only install extensions from trusted sources, review code before importing, and restrict Workspace access to administrators. See the Security Policy for details.
Resource sharing
In-process Tools and Functions share CPU and memory with Open WebUI. The async backend keeps long-running and blocking work from stalling other requests, but it does not create more hardware — genuinely CPU- or GPU-heavy workloads still compete for the same machine. For those, run the work as an external service behind an OpenAPI / MCP tool server so it scales independently.
MCP transport
Native MCP support is Streamable HTTP only. For stdio or SSE-based MCP servers, use mcpo as a translation proxy.
Dive Deeper
| Topic | What you'll learn |
|---|---|
| Tools & Functions | Writing Python Tools, Functions (Pipes, Filters, Actions), and the development API |
| MCP | Connecting Model Context Protocol servers, OAuth setup, troubleshooting |
| Pipelines (legacy) | Reference only — the deprecated separate-worker framework, superseded by Functions and Tools |