blueSGL t1_j0e5p87 wrote

Called it 7 months ago.

I bet if you do a log plot it just destroys the bass.

Edit: Thinking about it more, the spectrogram is one-dimensional (frequency) with time as the second dimension, so you could slice the audio into three frequency bands and encode them as the R, G and B channels to get 3x the frequency-range fidelity without having to change the context window size.
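
A rough sketch of the three-band RGB idea, not anything tested against an actual model. It assumes the spectrogram is a NumPy magnitude array of shape (frequency bins, time frames) with a bin count divisible by 3, and that 8-bit pixels are good enough. The function names `pack_bands_rgb` and `unpack_bands_rgb` are made up for illustration.

```python
import numpy as np

def pack_bands_rgb(spec):
    """Fold a (3*H, T) magnitude spectrogram into an (H, T, 3) uint8 image.

    Hypothetical helper: low, mid and high frequency bands become the
    R, G and B channels, so the image height stays at H.
    """
    freq_bins, _ = spec.shape
    if freq_bins % 3 != 0:
        raise ValueError("number of frequency bins must be divisible by 3")
    # Assumption: simple linear scaling of magnitudes to 0..255.
    peak = spec.max()
    scaled = spec / peak if peak > 0 else spec
    pixels = np.round(scaled * 255).astype(np.uint8)
    # Split along the frequency axis into three equal bands.
    low, mid, high = np.split(pixels, 3, axis=0)
    return np.stack([low, mid, high], axis=-1)

def unpack_bands_rgb(image):
    """Inverse of pack_bands_rgb: stack the channels back along frequency."""
    return np.concatenate(
        [image[..., 0], image[..., 1], image[..., 2]], axis=0
    )
```

For example, a 1536-bin spectrogram would pack into a 512-pixel-tall RGB image, the same height a 512-bin greyscale one would need. Whether a diffusion model learns anything sensible from channels that mean "frequency band" rather than colour is an open question.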

17

[deleted] t1_j0ej7y9 wrote

[deleted]

8

blueSGL t1_j0elvgk wrote

I've not got the hardware needed for fine-tuning Stable Diffusion (or even DreamBooth), so I can't test it.

I've only got 10 GB of VRAM, not the 16 GB minimum needed.

3