GitXplorerGitXplorer
J

faster-whisper-livestream-translator

public
77 stars
7 forks
1 issues

Commits

List of commits on branch main.
Verified
c5249ea36b2182a1ac0bbadd9e908d6967bbb055

link faster-whisper

JJonathanFly committed 2 years ago
Verified
7be53a638c5f53deec60c3ff12fb3b6df90c80b9

Update README.md

JJonathanFly committed 2 years ago
Verified
776ec7bad61490cf75e2643d847f3b510b7f3d32

Update README.md

JJonathanFly committed 2 years ago
Verified
40afddd9631cad2c7128838a8f8d2b2e703165b7

Update README.md

JJonathanFly committed 2 years ago
Unverified
b38e22ba11f7ac91d2167221c3ef7845e66537a4

Partially working

JJonathanFly committed 2 years ago
Verified
134a6e54dd3e173a96a60153c9e8c4543c61554c

Initial commit

JJonathanFly committed 2 years ago

README

The README file for this repository.

faster-whisper-livestream-translator with OBS NVIDIA noise reduction

Translating livestreams with faster-whisper, and dual language subtitles

This is mostly just a proof of concept

This code is a mess and mostly broken but somebody asked to see a working example of this setup so I dropped it here.

Would love if somebody fixed or re-implemented these main things in any whisper project:

1. Noise Reduction.

Whisper really needs good noise reduction for some live streams or it barely works. VAD is great but doesn't cut it.

2. Dual language subtitles.

Preferably by running two full Whisper models in parallel and sychronizing the segments, not like this code.

This is super buggy and constantly disconnects, but I got very distracted by https://github.com/JonathanFly/bark and might forget to ever come back to this.

Usage

  1. Install OBS Studio
  2. Install NVIDIA Audio Effects SDK: https://www.nvidia.com/en-us/geforce/broadcasting/broadcast-sdk/resources/
  3. Open OBS setup your stream to stream your desktop audio to a custom rtmp server on localhost. File, Settings, Stream, Service: Custom..., "rtmp://127.0.0.1:12345"
  4. Right click 'Desktop Audio' Audio Mixer, select 'Filters'
  5. Click + bottom left, add Noise Suppression. Pick NVIDIA Noise removal. I set it to max level 100. ('Start Recording' instead of 'Start Streaming' to check the effect on the audio.)
  6. Start python translate_livestream.py
  7. Click Start Streaming

Dual Subs with segment coloring

stream_translate_dual

python translate_livestream.py --direct_url rtmp://127.0.0.1:12345 --model_size_or_path medium --task transcribe --dual_language_subs --language ko --interval 7 --threshold 0.5 --min_silence_duration_ms 1000 --no_segment_probabilities

Transcription with word probabilities

stream_translate_transcribe

python translate_livestream.py --direct_url rtmp://127.0.0.1:12345 --model_size_or_path small --task transcribe --interval 8 --threshold 0.5 --min_silence_duration_ms 2000 --word_timestamps True --history_word_size 0

(You can still run without OBS for noise reduction)

python translate_livestream.py https://www.twitch.tv/xqc --model_size_or_path medium --task transcribe --interval 7 --threshold 0.5 --min_silence_duration_ms 2000 --word_timestamps True --history_word_size 0

Install

git clone https://github.com/JonathanFly/faster-whisper-livestream-translator.git
cd faster-whisper-livestream-translator
pip install -r requirements.txt

How I installed faster-whisper and cuda

mamba create --name faster-whisper python=3.10
mamba activate faster-whisper
mamba install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
mamba install -c conda-forge cudatoolkit=11.8.0

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /"
sudo apt-get update
sudo apt-get install libcudnn8=8.8.1.*-1+cuda11.8

(maybe restart terminal/conda/env)

git clone https://github.com/JonathanFly/faster-whisper-livestream-translator.git
cd faster-whisper-livestream-translator
pip install -r requirements.txt
mamba install ffmpeg
mamba update ffmpeg

Based on https://github.com/fortypercnt/stream-translator