geneing t1_jd6016n wrote
Reply to [R] SPDF - Sparse Pre-training and Dense Fine-tuning for Large Language Models by CS-fan-101
Is this a workaround for the unusual Cerebras chip architecture? Would mainstream users who train on GPUs benefit too?
geneing t1_j3hzfpy wrote
Reply to comment by Intelligent_Rough_21 in [D] Looking for a dataset of Text-To-Speech audiobook-style Speech Synthesis Markup Language (SSML) files by Intelligent_Rough_21
I think what you are looking for is called "expressive TTS". There have been a ton of papers on the topic in the last couple of years, and many provide code.
I've had some success with simply preserving the hidden state of the network from one sentence to the next.
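In PyTorch terms, the idea is roughly this (a toy sketch, not from any particular TTS codebase; the GRU decoder, dimensions, and dummy `paragraph` data are all placeholders):

```python
import torch
import torch.nn as nn

# Stand-in for the autoregressive decoder of a TTS model (80-dim mel frames).
decoder = nn.GRU(input_size=80, hidden_size=256, batch_first=True)

# Dummy "paragraph": three sentences of input frames, each shaped (1, T, 80).
paragraph = [torch.randn(1, 50, 80) for _ in range(3)]

hidden = None  # zero-initialized for the first sentence only
for frames in paragraph:
    out, hidden = decoder(frames, hidden)  # reuse the state across sentences
    hidden = hidden.detach()               # keep the state, drop the backprop graph
```

Carrying the state over gives the model some cross-sentence context, so the prosody doesn't reset at every sentence boundary.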
SSML may not be expressive enough for your application.
geneing t1_j3g1gwa wrote
Reply to comment by Intelligent_Rough_21 in [D] Looking for a dataset of Text-To-Speech audiobook-style Speech Synthesis Markup Language (SSML) files by Intelligent_Rough_21
Most likely you are using the original Polly voices, which are built by concatenative synthesis: gluing together recorded snippets of different phonemes. That tends to produce monotone speech.
Try Google WaveNet. It's available through the Google Cloud API, much like Polly.
There's also a neural version of Polly, but I've never tried it.
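For reference, here's a minimal sketch of calling a WaveNet voice through the Google Cloud Text-to-Speech Python client (assumes the `google-cloud-texttospeech` package and configured credentials; the voice name is just one of the available WaveNet voices):

```python
from google.cloud import texttospeech

client = texttospeech.TextToSpeechClient()

response = client.synthesize_speech(
    input=texttospeech.SynthesisInput(text="Hello, this is a WaveNet voice."),
    voice=texttospeech.VoiceSelectionParams(
        language_code="en-US",
        name="en-US-Wavenet-D",  # WaveNet voices carry "Wavenet" in the name
    ),
    audio_config=texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.LINEAR16
    ),
)

with open("output.wav", "wb") as f:
    f.write(response.audio_content)
```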
geneing t1_j3exq14 wrote
Reply to comment by Intelligent_Rough_21 in [D] Looking for a dataset of Text-To-Speech audiobook-style Speech Synthesis Markup Language (SSML) files by Intelligent_Rough_21
Having trained multiple TTS models, I disagree. It's actually quite impressive how accurate the prosody is. Moreover, even homographs are usually rendered correctly (e.g. the word "read" is pronounced in the correct tense when it can be deduced from the sentence).
geneing t1_j3e1573 wrote
Reply to [D] Looking for a dataset of Text-To-Speech audiobook-style Speech Synthesis Markup Language (SSML) files by Intelligent_Rough_21
I looked for one once, years ago, but couldn't find any. I don't think it's needed anymore: current neural-network-based TTS systems are really good at producing speech with the right intonation from the text alone.
geneing t1_ivplj76 wrote
Reply to comment by Nameless1995 in [D] What does it mean for an AI to understand? (Chinese Room Argument) - MLST Video by timscarfe
The new machine-learning-based translators don't really have a set of rules. They essentially learn probabilities of different word combinations (e.g. https://ai.googleblog.com/2019/10/exploring-massively-multilingual.html), which one could argue counts as "understanding" (since Searle never defined the term clearly).
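To see this concretely, here's a minimal sketch using an open pretrained model (assuming Hugging Face `transformers` and the `Helsinki-NLP/opus-mt-en-zh` checkpoint, not Google's system from the link): the English-to-Chinese mapping comes entirely from learned weights, with no hand-written rule table anywhere in the stack.

```python
from transformers import pipeline

# The model is just learned weights: no explicit grammar or phrase rules.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-zh")

print(translator("He read the book yesterday.")[0]["translation_text"])
```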
geneing t1_ivnp4o0 wrote
Reply to [D] What does it mean for an AI to understand? (Chinese Room Argument) - MLST Video by timscarfe
Why are we wasting time on this? Searle made a few subtle mistakes and played a few tricks.
- He never defines what "understand" means. Without a clear definition, he can play rhetorical tricks to support his argument.
- Is it really possible to translate from English to Chinese just by following a book of rules? Have you seen the "old" machine translations that basically followed rules? It was trivial to tell the machine translation from a human one.
geneing t1_jefrddt wrote
Reply to comment by BitterStatus9 in Just started In Search of Lost Time by Marcel Proust by NotBorris
Interesting. I actually came away with the exact opposite impression after finishing all 7 volumes. I felt it was a poorly written book with mostly cartoonishly shallow characters, redeemed by occasional sparks of brilliance and a few interesting, novel ideas and observations. On balance, though, it was a disappointment.
Even reading Harold Bloom's essays didn't convince me that it's a great work. He, too, focused mostly on the few great ideas in the book while glossing over the ridiculous plot twists and the plain bad writing and editing.