I haven't compiled it myself yet, but this looks like a C++ reimplementation of OpenAI's Whisper speech recognition paired with GPT-2, for completely local processing on many platforms. The WebAssembly port works fine, but slowly: https://whisper.ggerganov.com/talk/
https://github.com/ggerganov/whisper.cpp
Update 2022-12-10:
Now compiled. Just type make. In the models directory there is a script that can download models for you. The multilingual one is the huge ggml-large.bin.
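For example, fetching the large multilingual model looks roughly like this (assuming the script is still named download-ggml-model.sh and takes the model name as its argument):

bash ./models/download-ggml-model.sh large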
Here's an example of running it on a Swedish wav file with Swedish text output:
./main --model models/ggml-large.bin -l sv -f sv.wav
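Note that main wants 16-bit WAV audio sampled at 16 kHz. If your recording is in some other format, an ffmpeg conversion along these lines should do it (input.m4a is just a placeholder for whatever file you start from):

ffmpeg -i input.m4a -ar 16000 -ac 1 -c:a pcm_s16le sv.wav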
Using the small models is super fast; the bigger ones take time. Not sure if it is because the model takes longer to load or if the larger search space slows it down continuously.
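For a rough speed comparison, the same file can be run against one of the smaller models fetched with the same download script (ggml-small.bin is my assumption about the filename it produces):

./main --model models/ggml-small.bin -l sv -f sv.wav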