🗨️ Kokoro Web - Effortless TTS for Open WebUI

Warning

This tutorial is a community contribution and is not supported by the Open WebUI team. It serves only as a demonstration of how to customize Open WebUI for your specific use case. Want to contribute? Check out the contributing tutorial.

What is `Kokoro Web`?

Kokoro Web provides a lightweight, OpenAI-compatible API for the Kokoro-82M text-to-speech model. It plugs into Open WebUI as a text-to-speech engine, adding natural-sounding voices to your AI conversations.

🚀 Two-Step Integration

1. Deploy Kokoro Web API (One Command)

services:
  kokoro-web:
    image: ghcr.io/eduardolat/kokoro-web:latest
    ports:
      - "3000:3000"
    environment:
      # Change this to any secret key to use as your OpenAI compatible API key
      - KW_SECRET_API_KEY=your-api-key
    volumes:
      - ./kokoro-cache:/kokoro/cache
    restart: unless-stopped

Save this as docker-compose.yml, then run: docker compose up -d
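
Once the container is up, it can help to confirm that something is listening on the published port before configuring Open WebUI. The following is a minimal sketch using only the Python standard library; the helper name `is_port_open` is hypothetical, and it only checks that a TCP connection succeeds, not that the API itself is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_port_open("localhost", 3000)` checks the host port published in the compose file above. If it returns False, `docker compose logs kokoro-web` is a reasonable next place to look.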

2. Connect Open WebUI (30 Seconds)

In Open WebUI, go to Admin Panel → Settings → Audio and configure:
- Text-to-Speech Engine: OpenAI
- API Base URL: http://localhost:3000/api/v1
  (If Open WebUI itself runs in Docker, use http://host.docker.internal:3000/api/v1 instead, because localhost inside the container refers to the container, not the host.)
- API Key: your-api-key (the KW_SECRET_API_KEY value from step 1)
- TTS Model: model_q8f16 (best balance of size and quality)
- TTS Voice: af_heart (the default: a warm, natural English voice). You can change this to any other voice or formula from the Kokoro Web Demo.

That's it! Your Open WebUI now has AI voice capabilities.
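
To test the same settings outside Open WebUI, you can send a speech request to the API directly. The sketch below assumes Kokoro Web follows OpenAI's convention of a `POST /audio/speech` route under the base URL, with `model`, `voice`, and `input` fields in a JSON body; that path and payload shape are assumptions drawn from the OpenAI API, not confirmed by this tutorial. The function names and the `speech.mp3` output file are hypothetical as well, so check the GitHub repository for the actual route and audio format.

```python
import json
import urllib.request

# Values from the steps above; replace the key with your own KW_SECRET_API_KEY.
BASE_URL = "http://localhost:3000/api/v1"
API_KEY = "your-api-key"


def build_speech_request(text: str, model: str = "model_q8f16",
                         voice: str = "af_heart") -> urllib.request.Request:
    """Build a POST request for the (assumed) OpenAI-style speech route."""
    payload = {"model": model, "voice": voice, "input": text}
    return urllib.request.Request(
        url=f"{BASE_URL}/audio/speech",  # assumption: OpenAI's speech path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(text: str, out_path: str = "speech.mp3") -> None:
    """Send the request and write the returned audio bytes to out_path.

    The .mp3 extension is an assumption; the server decides the real format.
    """
    with urllib.request.urlopen(build_speech_request(text), timeout=60) as resp:
        audio = resp.read()
    with open(out_path, "wb") as fh:
        fh.write(audio)
```

With the container running, `synthesize("Hello from Kokoro Web")` should write an audio file if the assumptions hold. An HTTP 401 would point to a mismatched API key, and a 404 to a different route than the one assumed here.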

🌍 Supported Languages

Kokoro Web supports 8 languages with specific voices optimized for each:

English (US) - en-us
English (UK) - en-gb
Japanese - ja
Chinese - cmn
Spanish - es-419
Hindi - hi
Italian - it
Portuguese (Brazil) - pt-br

Each language has dedicated voices for optimal pronunciation and natural flow. See the GitHub repository for the complete list of language-specific voices or use the Kokoro Web Demo to preview and create your own custom voices instantly.

💾 Optimized Models for Any Hardware

Choose the model that fits your hardware needs:

Model ID	Optimization	Size	Ideal For
model_q8f16	Mixed precision	86 MB	Recommended - Best balance
model_quantized	8-bit	92.4 MB	Good CPU performance
model_uint8f16	Mixed precision	114 MB	Better quality on mid-range CPUs
model_q4f16	4-bit & fp16 weights	154 MB	Higher quality, still efficient
model_fp16	fp16	163 MB	Premium quality
model_uint8	8-bit & mixed	177 MB	Balanced option
model_q4	4-bit matmul	305 MB	High quality option
model	fp32	326 MB	Maximum quality (slower)
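
If you are choosing a model under a disk or download limit, the table can be turned into a small lookup. This is only an illustrative sketch: the sizes are copied from the table above, the helper name `models_within` is hypothetical, and file size alone says nothing about quality or speed on your hardware.

```python
# Sizes in MB, copied from the table above.
MODEL_SIZES_MB = {
    "model_q8f16": 86.0,
    "model_quantized": 92.4,
    "model_uint8f16": 114.0,
    "model_q4f16": 154.0,
    "model_fp16": 163.0,
    "model_uint8": 177.0,
    "model_q4": 305.0,
    "model": 326.0,
}


def models_within(budget_mb: float) -> list[str]:
    """Return model IDs whose size fits the budget, smallest first."""
    fitting = [
        (size, name)
        for name, size in MODEL_SIZES_MB.items()
        if size <= budget_mb
    ]
    return [name for _size, name in sorted(fitting)]
```

For example, `models_within(100)` lists the two models under 100 MB, either of which can then be entered as the TTS Model in Open WebUI.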

✨ Try Before You Install

Visit the Kokoro Web Demo to preview all voices instantly. This demo:

Runs 100% in your browser - No server required
Free forever - No usage limits or registration needed
Zero installation - Just visit the website and start creating
All features included - Test any voice or language immediately

Need More Help?

For additional options, voice customization guides, and advanced settings, visit the GitHub repository.

Enjoy natural AI voices in your Open WebUI conversations!
