Whishper is an awesome project built on top of Whisper models, with LibreTranslate handling translation, that we can self-host for local audio transcription.
The Whishper Project
Whishper - Open-source, local-first audio transcription and subtitling with UI
SelfHosting Whishper
Whishper CPU Only
You will need the docker-compose configuration and a .env file to specify your variables.
You can always have a look at the Whishper Official Docs.
Whishper Docker-Compose
version: "3.9"
services:
  mongo:
    image: mongo
    env_file:
      - .env
    restart: unless-stopped
    volumes:
      - ./whishper_data/db_data:/data/db
      - ./whishper_data/db_data/logs/:/var/log/mongodb/
    environment:
      MONGO_INITDB_ROOT_USERNAME: ${DB_USER:-whishper}
      MONGO_INITDB_ROOT_PASSWORD: ${DB_PASS:-whishper}
    expose:
      - 27017
    command: ['--logpath', '/var/log/mongodb/mongod.log']
  translate:
    container_name: whisper-libretranslate
    image: libretranslate/libretranslate:latest
    restart: unless-stopped
    volumes:
      - ./whishper_data/libretranslate/data:/home/libretranslate/.local/share
      - ./whishper_data/libretranslate/cache:/home/libretranslate/.local/cache
    env_file:
      - .env
    tty: true
    environment:
      LT_DISABLE_WEB_UI: True
      LT_UPDATE_MODELS: True
    expose:
      - 5000
    networks:
      default:
        aliases:
          - translate
    healthcheck:
      test: ['CMD-SHELL', './venv/bin/python scripts/healthcheck.py']
      interval: 2s
      timeout: 3s
      retries: 5
  whishper:
    pull_policy: always
    image: pluja/whishper:${WHISHPER_VERSION:-latest}
    env_file:
      - .env
    volumes:
      - ./whishper_data/uploads:/app/uploads
      - ./whishper_data/logs:/var/log/whishper
    container_name: whishper
    restart: unless-stopped
    networks:
      default:
        aliases:
          - whishper
    ports:
      - 8082:80
    depends_on:
      - mongo
      - translate
    environment:
      PUBLIC_INTERNAL_API_HOST: "http://127.0.0.1:80"
      PUBLIC_TRANSLATION_API_HOST: ""
      PUBLIC_API_HOST: ${WHISHPER_HOST:-}
      PUBLIC_WHISHPER_PROFILE: cpu
      WHISPER_MODELS_DIR: /app/models
      UPLOAD_DIR: /app/uploads
      CPU_THREADS: 4
Whishper .env
# Libretranslate Configuration
## Check out https://github.com/LibreTranslate/LibreTranslate#configuration-parameters for more libretranslate configuration options
LT_LOAD_ONLY=es,en,fr
# Whisper Configuration
WHISPER_MODELS=tiny,small
WHISHPER_HOST=http://127.0.0.1:8082
# Database Configuration
DB_USER=whishper
DB_PASS=whishper
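If you want to double-check what the .env will hand over to the containers before deploying, a tiny Python sketch like the one below can help. It is not part of Whishper: `parse_env` and `SAMPLE_ENV` are hypothetical names used only for illustration, the parser handles plain KEY=VALUE lines only (no quoting or variable expansion), and the default-password warning is my own addition.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line, ignore it
        values[key.strip()] = value.strip()
    return values


# Same variables as the .env shown above (hypothetical inline copy)
SAMPLE_ENV = """\
# Libretranslate Configuration
LT_LOAD_ONLY=es,en,fr
# Whisper Configuration
WHISPER_MODELS=tiny,small
WHISHPER_HOST=http://127.0.0.1:8082
# Database Configuration
DB_USER=whishper
DB_PASS=whishper
"""

config = parse_env(SAMPLE_ENV)
models = config["WHISPER_MODELS"].split(",")
languages = config["LT_LOAD_ONLY"].split(",")
print("Whisper models:", models)
print("LibreTranslate languages:", languages)
if config["DB_PASS"] == "whishper":
    print("Warning: DB_PASS is still the default value")
```

To check your real file instead of the inline sample, you could pass `open(".env").read()` to `parse_env`.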
Just configure your desired variables and deploy with:
docker-compose up -d
Your Whishper instance will be waiting for you at http://localhost:8082
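The containers can take a while to come up (models may need to be downloaded on first start), so it can be handy to check whether the web UI answers yet. Here is a minimal sketch using only the Python standard library. The helper name `is_reachable` is hypothetical, and the URL assumes the 8082 host port from the compose file above.

```python
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if an HTTP server answers at `url`, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with an error status code
        return True
    except OSError:
        # Connection refused, DNS failure, timeout and similar network errors
        return False


if __name__ == "__main__":
    url = "http://localhost:8082"  # host port mapped in the compose file
    status = "up" if is_reachable(url) else "not reachable yet"
    print(f"Whishper at {url}: {status}")
```

This only tells you that something is answering HTTP on that port. It does not verify that transcription itself works.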
FAQ
Other F/OSS Audio and Translation Tools
- Bark (text-to-audio generation)
- AudioCraft (music and audio generation)
- LibreTranslate (machine translation)
How to Install AudioCraft?
- Get Python installed
- Get Familiar with Virtual Environments
Installing AudioCraft with Python Step by Step 👇
python3 -m venv audiocraft
source audiocraft/bin/activate
apt install ffmpeg
git clone https://github.com/facebookresearch/audiocraft.git ./audio
cd audio
python -m pip install -r requirements.txt
pip install rich
python -m demos.musicgen_app --share
#deactivate
And the Gradio UI will be available at http://localhost:7860
How to Install Bark?
Testing Bark with Python Step by Step 👇
#pip install --upgrade setuptools wheel
pip install git+https://github.com/suno-ai/bark.git
pip install git+https://github.com/huggingface/transformers.git
#pip install IPython
Try Bark Audio Generation with:
python -m bark --text "Hello, my name is Suno." --output_filename "example.wav"
or directly in Python:
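from scipy.io import wavfile
from transformers import AutoProcessor, BarkModel
# Load the Bark processor and model from the Hugging Face Hub
processor = AutoProcessor.from_pretrained("suno/bark")
model = BarkModel.from_pretrained("suno/bark")
# Pick a speaker preset and prepare the text prompt
voice_preset = "v2/en_speaker_6"
inputs = processor("Hello, my dog is cute", voice_preset=voice_preset)
# Generate the audio and convert the tensor to a 1-D NumPy array
audio_array = model.generate(**inputs)
audio_array = audio_array.cpu().numpy().squeeze()
# Save the result as a WAV file at the model's sample rate
sample_rate = model.generation_config.sample_rate
wavfile.write("bark_out.wav", rate=sample_rate, data=audio_array)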