🐤 Docling Document Extraction
This tutorial is a community contribution and is not supported by the Open WebUI team. It serves only as a demonstration of how to customize Open WebUI for your specific use case. Want to contribute? Check out the contributing tutorial.
This documentation provides a step-by-step guide to integrating Docling with Open WebUI. Docling is a document processing library designed to transform a wide range of file formats (including PDFs, Word documents, spreadsheets, HTML, and images) into structured data such as JSON or Markdown. With built-in support for layout detection, table parsing, and language-aware processing, Docling streamlines document preparation for AI applications like search, summarization, and retrieval-augmented generation, all through a unified and extensible interface.
Prerequisites
- Open WebUI instance
- Docker installed on your system
- Docker network set up for Open WebUI
Integration Steps
Step 1: Run the Docker Command for Docling-Serve
docker run -p 5001:5001 -e DOCLING_SERVE_ENABLE_UI=true quay.io/docling-project/docling-serve
With GPU support:
docker run --gpus all -p 5001:5001 -e DOCLING_SERVE_ENABLE_UI=true quay.io/docling-project/docling-serve-cu124
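Before moving on to Step 2, it can help to confirm that the container is reachable from the host. The following is a minimal sketch, not an official health check: it assumes docling-serve is published on port 5001 of the local machine (as in the commands above) and that the demo UI enabled by DOCLING_SERVE_ENABLE_UI=true is served under /ui. The `DOCLING_URL` variable and the `http_status` helper are hypothetical names introduced for this example.

```shell
# Assumption: docling-serve is published on port 5001 of this machine, as in the commands above.
DOCLING_URL="${DOCLING_URL:-http://localhost:5001}"

# Print the HTTP status code for a URL; curl reports 000 when no connection could be made.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1" || true
}

# Assumption: the demo UI is served under /ui when DOCLING_SERVE_ENABLE_UI=true.
echo "Docling UI status: $(http_status "$DOCLING_URL/ui")"
```

A 2xx or 3xx status suggests the server is up. A status of 000 means nothing answered on that port, which can also happen while the container is still starting.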
Step 2: Configure Open WebUI to use Docling
- Log in to your Open WebUI instance.
- Navigate to the Admin Panel settings menu.
- Click on Settings.
- Click on the Documents tab.
- Change the Default content extraction engine dropdown to Docling.
- Update the content extraction engine URL to http://host.docker.internal:5001.
- Save the changes.
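The same two settings can likely be supplied when the Open WebUI container starts instead of through the Admin Panel. The sketch below is an env file that could be passed with `docker run --env-file`. It assumes the variable names `CONTENT_EXTRACTION_ENGINE` and `DOCLING_SERVER_URL` from Open WebUI's environment variable reference; check them against the version you run. The file name `docling.env` is hypothetical.

```shell
# docling.env (hypothetical file name), passed as: docker run --env-file docling.env ...
# Assumption: variable names as listed in Open WebUI's environment variable reference.
CONTENT_EXTRACTION_ENGINE=docling
DOCLING_SERVER_URL=http://host.docker.internal:5001
```

Values already saved through the Admin Panel may take precedence over environment variables, so treat this as a way to set initial values rather than a guaranteed override.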
Verifying Docling in Docker
To verify that Docling is working correctly in a Docker environment, you can follow these steps: