Acceptable-Cress-374 t1_iyjflu4 wrote
I've been meaning to play around with Whisper, but never found the time. Does it do any kind of voice / person segmentation as well? Can it tell speakers apart, say in a high quality input such as a podcast?
forfooinbar t1_iyjzzry wrote
Whisper doesn't do speaker diarization AFAIK. It will just be one big blob of text.
The_frozen_one t1_iykajiv wrote
You can play around with it here: https://whisper.ggerganov.com/
It's significantly slower (approx 50 times slower) than the natively compiled version (https://github.com/ggerganov/whisper.cpp) but you can at least get a sense of accuracy using the online version.
t0mkaka OP t1_iykoh4k wrote
Yes, there is no speaker diarization. Adding it would solve some problems with this model too, and would make search better.
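Rough sketch of how speaker labels could be bolted on afterwards, assuming you already have speaker turns from some separate diarization tool. The `segments` list is shaped like Whisper's timestamped segments (start, end, text), but both it and the `turns` list are made-up data, and `overlap` / `label_segments` are hypothetical helper names, not part of Whisper. Each segment just gets the speaker whose turn overlaps it the most:

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments, turns):
    """Attach to each segment the speaker whose turn overlaps it most (None if no overlap)."""
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = None, 0.0
        for turn in turns:
            shared = overlap(seg["start"], seg["end"], turn["start"], turn["end"])
            if shared > best_overlap:
                best_speaker, best_overlap = turn["speaker"], shared
        labeled.append({**seg, "speaker": best_speaker})
    return labeled


# Made-up transcript segments (start/end in seconds).
segments = [
    {"start": 0.0, "end": 4.0, "text": "Welcome to the show."},
    {"start": 4.0, "end": 9.0, "text": "Thanks for having me."},
    {"start": 9.0, "end": 12.0, "text": "Let's get started."},
]

# Made-up speaker turns, as a separate diarization step might produce them.
turns = [
    {"start": 0.0, "end": 4.2, "speaker": "SPEAKER_00"},
    {"start": 4.2, "end": 8.8, "speaker": "SPEAKER_01"},
    {"start": 8.8, "end": 12.0, "speaker": "SPEAKER_00"},
]

labeled = label_segments(segments, turns)
for seg in labeled:
    print(f'[{seg["speaker"]}] {seg["text"]}')
```

With labels like these, search could be filtered by speaker instead of scanning one undifferentiated blob of text. Segments that straddle a speaker change would still get a single label, so this is only a first approximation.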