
Anjz OP t1_jdtth3u wrote

Along similar lines to what you've just said: text to speech has always given us robotic-sounding responses, but imagine applying current machine learning foundations and training on huge amounts of audio of how people actually talk.

That will be a bit freaky I would think. I would be perplexed and amazed.

12

Embarrassed_Bat6101 t1_jdtvtfx wrote

Well, there are already companies that let you do this with voices, and they sound damn good too. I think all these services are popping up at around the same time, and they’ll converge on that sort of assistant.

7

Anjz OP t1_jdtwn7m wrote

I'd pay for a Stephen Fry voiceover to narrate my interactions with ChatGPT.

3

Embarrassed_Bat6101 t1_jdtxewz wrote

I can’t remember the one I found, but if you paid like $10 or $20 a month they would let you upload your own audio and make a text-to-speech voice from it. I think I saw a post on here the other day where someone did Steve Jobs, and it sounded so similar to him it was nuts.

2

Poorfocus t1_jdugyhm wrote

Yeah, it’s called ElevenLabs. It’s a bit cheaper than that for the first tier, and it’s really fantastic. I tested it out by recording some friends having natural conversations, speaking directly into their microphones on Discord (w/ consent!).

One thing is that you have to turn the stability parameter down very low from the default, or else the intonation is very stiff and it sounds robotic. Bring it down to 20% and generate a few times; when it gets the likeness right, it’s perfect, to the point where even the person it was emulating found it convincing.

I’m curious how it handles the reading, since it’s obviously context-aware to some extent. I think we as humans are very good at picking out “poor acting” and unnatural vocal delivery. I think when that gets improved, we’ll have completely natural language conversations with the AI.

7