Docling Document Extraction
This tutorial is a community contribution and is not supported by the Open WebUI team. It serves only as a demonstration of how to customize Open WebUI for your specific use case. Want to contribute? Check out the contributing tutorial.
This documentation provides a step-by-step guide to integrating Docling with Open WebUI. Docling is a document processing library designed to transform a wide range of file formats (including PDFs, Word documents, spreadsheets, HTML, and images) into structured data such as JSON or Markdown. With built-in support for layout detection, table parsing, and language-aware processing, Docling streamlines document preparation for AI applications like search, summarization, and retrieval-augmented generation, all through a unified and extensible interface.
Prerequisites
- Open WebUI instance
- Docker installed on your system
- Docker network set up for Open WebUI
Integration Steps
Step 1: Run the Docker Command for Docling-Serve
docker run -p 5001:5001 -e DOCLING_SERVE_ENABLE_UI=true quay.io/docling-project/docling-serve
With GPU support:
docker run --gpus all -p 5001:5001 -e DOCLING_SERVE_ENABLE_UI=true quay.io/docling-project/docling-serve
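Both commands above run in the foreground and let Docker assign a random container name. As a minimal sketch, assuming you would rather run the service detached under a fixed name, the helper below prints the full command so it can be reviewed before it is executed. The `-d` and `--name` flags are standard `docker run` options; the container name `docling-serve` and the function name `docling_run_cmd` are illustrative choices, not part of the original tutorial.

```shell
#!/bin/sh
# Sketch: build the docker run command for docling-serve as a string.
# The container name and function name are hypothetical choices.

DOCLING_IMAGE="quay.io/docling-project/docling-serve"
DOCLING_PORT=5001

# Print the docker run command.
# Pass "gpu" as the first argument to add "--gpus all".
docling_run_cmd() {
  gpu_flag=""
  if [ "${1:-}" = "gpu" ]; then
    gpu_flag="--gpus all "
  fi
  printf 'docker run -d --name docling-serve %s-p %s:%s -e DOCLING_SERVE_ENABLE_UI=true %s\n' \
    "$gpu_flag" "$DOCLING_PORT" "$DOCLING_PORT" "$DOCLING_IMAGE"
}
```

To start the container, hand the printed command to a shell, for example `sh -c "$(docling_run_cmd)"`, or `sh -c "$(docling_run_cmd gpu)"` on a host with GPU support.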
Step 2: Configure Open WebUI to use Docling
- Log in to your Open WebUI instance.
- Navigate to the Admin Panel settings menu.
- Click on Settings.
- Click on the Documents tab.
- Change the Default content extraction engine dropdown to Docling.
- Update the content extraction engine URL to http://host.docker.internal:5001.
- Save the changes.
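The URL in the steps above assumes that Open WebUI itself runs in a Docker container and reaches Docling through the port published on the host. The sketch below prints the address to enter depending on where Open WebUI runs. The helper name `docling_url_for` and its two modes are hypothetical, and the port is assumed to be 5001 as in Step 1.

```shell
#!/bin/sh
# Sketch: pick the Docling URL to enter in the Open WebUI Documents settings.
# The function name and the "docker"/"host" modes are hypothetical.

docling_url_for() {
  case "${1:-}" in
    # Open WebUI runs in a container and reaches Docling via the Docker host.
    docker) echo "http://host.docker.internal:5001" ;;
    # Open WebUI runs directly on the same machine that publishes port 5001.
    host) echo "http://127.0.0.1:5001" ;;
    *)
      echo "usage: docling_url_for docker|host" >&2
      return 1
      ;;
  esac
}
```

On Linux with Docker Engine, `host.docker.internal` is typically not resolvable unless the Open WebUI container is started with `--add-host=host.docker.internal:host-gateway`.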
Verifying Docling in Docker
To verify that Docling is working correctly in a Docker environment, follow these steps:
1. Start the Docling Docker Container
First, ensure that the Docling Docker container is running. You can start it using the following command:
docker run -p 5001:5001 -e DOCLING_SERVE_ENABLE_UI=true quay.io/docling-project/docling-serve
This command starts the Docling container and maps port 5001 from the container to port 5001 on your local machine.
2. Verify the Server is Running
- Go to http://127.0.0.1:5001/ui/
- The page should display the Docling UI.
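The same check can be scripted. The following is a minimal sketch, assuming `curl` is installed and the container publishes port 5001 as shown earlier; the function names are hypothetical. It requests the UI page and reports whether the server answered with a 2xx or 3xx status.

```shell
#!/bin/sh
# Sketch: check that the Docling UI answers over HTTP.
# Function names are hypothetical; the default URL comes from the tutorial.

# Classify an HTTP status code: 2xx and 3xx count as reachable.
describe_status() {
  case "$1" in
    2[0-9][0-9]|3[0-9][0-9]) echo "reachable" ;;
    *) echo "unreachable" ;;
  esac
}

# Print only the HTTP status code for a URL.
# curl prints 000 when no HTTP response was received.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1" || true
}

# Report whether the Docling UI responds; returns non-zero if it does not.
check_docling_ui() {
  url="${1:-http://127.0.0.1:5001/ui/}"
  code="$(http_status "$url")"
  result="$(describe_status "$code")"
  echo "Docling UI at $url is $result (HTTP $code)"
  [ "$result" = "reachable" ]
}
```

Calling `check_docling_ui` with no argument tests the default UI address; pass a different URL as the first argument if the port mapping was changed.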
3. Verify the Integration
- Upload a few files through the UI. Docling should return the extracted output in Markdown or in another format you select.
Conclusion
Integrating Docling with Open WebUI is a simple and effective way to enhance document processing and content extraction capabilities. By following the steps in this guide, you can set up Docling as the default extraction engine and verify it's working smoothly in a Docker environment. Once configured, Docling enables powerful, format-agnostic document parsing to support more advanced AI features in Open WebUI.