LetterRip
LetterRip t1_izksf4k wrote
Reply to comment by Teotz in [P] Using LoRA to efficiently fine-tune diffusion models. Output model less than 4MB, two times faster to train, with better performance. (Again, with Stable Diffusion) by cloneofsimo
It is working, but I need to use prior preservation loss; otherwise the concept bleeds into all of the other words in the phrase. So I'm generating photos for the preservation loss now.
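For anyone unfamiliar, prior preservation trains on generic "class" images alongside the instance images so the base concept doesn't drift. A minimal sketch of the loss (hypothetical variable names, not the exact training script):

```python
import torch
import torch.nn.functional as F

def dreambooth_loss(noise_pred, noise_target, prior_loss_weight=1.0):
    # First half of the batch = instance images of the new concept,
    # second half = generic class images used for prior preservation.
    pred_instance, pred_class = noise_pred.chunk(2, dim=0)
    tgt_instance, tgt_class = noise_target.chunk(2, dim=0)

    instance_loss = F.mse_loss(pred_instance, tgt_instance)
    prior_loss = F.mse_loss(pred_class, tgt_class)  # keeps the base concept from drifting
    return instance_loss + prior_loss_weight * prior_loss

# toy example with dummy noise predictions/targets
pred = torch.randn(8, 4, 64, 64)
target = torch.randn(8, 4, 64, 64)
print(dreambooth_loss(pred, target))
```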
LetterRip t1_izdm55i wrote
Reply to comment by cloneofsimo in [P] Using LoRA to efficiently fine-tune diffusion models. Output model less than 4MB, two times faster to train, with better performance. (Again, with Stable Diffusion) by cloneofsimo
> Glad it worked for you with such small memory constraints!
Currently training at image size 768 with accumulation steps = 2.
If steps is set to 2000, will it run to 4000? It didn't stop at 2000 as expected and is currently over 3500; I figured I'd wait until it passes 4000 before killing it, in case the accumulation steps act as a multiplier. (It went to 3718 and quit, right after I wrote the above.)
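For what it's worth, the ambiguity seems to be whether the counter tracks micro-batches or optimizer steps. A toy sketch of the two interpretations (not the actual script):

```python
import torch

# Toy stand-in for the real training loop; this just illustrates how the
# step counting interacts with gradient accumulation.
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

accumulation_steps = 2
max_train_steps = 2000
micro_batches_seen = 0
optimizer_steps = 0

while optimizer_steps < max_train_steps:
    x, y = torch.randn(8, 4), torch.randn(8, 1)
    loss = torch.nn.functional.mse_loss(model(x), y) / accumulation_steps
    loss.backward()
    micro_batches_seen += 1
    if micro_batches_seen % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
        optimizer_steps += 1

# If the progress counter tracks micro_batches_seen, 2000 optimizer steps show
# up as ~4000 "steps"; if it tracks optimizer_steps, it stops at 2000.
```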
LetterRip t1_izdam40 wrote
Reply to [P] Using LoRA to efficiently fine-tune diffusion models. Output model less than 4MB, two times faster to train, with better performance. (Again, with Stable Diffusion) by cloneofsimo
Just tried this and it ran great on a 6GB VRAM card on a laptop with only 16GB of RAM (it barely fit into VRAM - using bitsandbytes and xformers, I think). I've only tried the corgi example, but it seemed to work fine. Trying it with a person now.
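If it helps anyone trying to squeeze into similar VRAM, the memory savers I believe were in play look roughly like this (a sketch assuming the diffusers and bitsandbytes APIs and an example SD 1.5 checkpoint; check the actual training script for the exact flags):

```python
import bitsandbytes as bnb
from diffusers import UNet2DConditionModel

# Example checkpoint; substitute whatever base model you are fine-tuning.
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)
unet.enable_xformers_memory_efficient_attention()  # cuts attention memory (needs xformers installed)
unet.enable_gradient_checkpointing()               # trades compute for VRAM

# 8-bit Adam keeps optimizer state in int8 instead of fp32
optimizer = bnb.optim.AdamW8bit(unet.parameters(), lr=1e-4)
```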
LetterRip t1_izb2gbt wrote
Reply to [D] Can AI Music Tools Compete with Artists? by Oblipher
For classical music, twosetviolin couldn't tell the difference between AI and human composers.
LetterRip t1_iz20yak wrote
Reply to [D] Is there an affordable way to host a diffusers Stable Diffusion model publicly on the Internet for "real-time"-inference? (CPU or Serverless GPU?) by OkOkPlayer
You might consider uploading it to Civitai rather than self-hosting; then people can download it and/or make it available via a number of free and paid services.
LetterRip t1_ix0zyfv wrote
Reply to [D] BERT related questions by Devinco001
What length of texts? Sentence? Paragraph? Page? Multiple pages? Books?
A sentence might average 10 tokens, a page 750 tokens, a book 225,000 tokens. So 25 million to 562.5 billion tokens.
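The arithmetic behind that range, assuming a corpus on the order of 2.5 million texts (which is what those totals imply):

```python
# Back-of-envelope token counts; adjust num_texts to your actual corpus size.
num_texts = 2_500_000
tokens_per = {"sentence": 10, "page": 750, "book": 225_000}

for kind, toks in tokens_per.items():
    print(f"{kind}: ~{num_texts * toks:,} tokens")
# sentence: ~25,000,000 tokens
# page:     ~1,875,000,000 tokens
# book:     ~562,500,000,000 tokens
```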
LetterRip t1_iwrvtt5 wrote
Reply to [D] If I bought a copy of tv series on Youtube (or other platforms), can I use them for training a model? by DarrenTitor
It may or may not be fair use. Academic usage is a fair use defense, but it will depend on the specific nature of the usage. What will the trained model be used for? Also, is the result transformative? Short version: talk to a lawyer.
Also, different countries have different copyright laws, so it could be much different if you are not in the US.
LetterRip t1_iw3rucf wrote
Reply to [D]We just release a complete open-source solution for accelerating Stable Diffusion pretraining and fine-tuning! by HPCAI-Tech
Could you provide details on the comparison with DeepSpeed? What parameters were used, etc.?
Also, does it provide any benefit for single-GPU inference?
LetterRip t1_ivk4gfy wrote
Reply to comment by abstractcontrol in [Project] Rebel Poker AI by Character_Bluejay601
> The academic SOTA is to just stick a tabular algorithm on top of some deep net, which is hardly elegant. All these algorithms are just hacks and I wouldn't use them for real money play.
They absolutely crush the best players in the game, and beat weaker players by absurd amounts.
While there is a huge action space, it turns out that very few bet sizes are needed on the early streets (4 is generally adequate), and the final street can be solved on the fly.
LetterRip t1_iuxy54g wrote
Reply to comment by meldiwin in [P] Implementation of MagicMix from ByteDance researchers, - New way to interpolate concepts with much more natural, geometric coherency (implemented with Stable Diffusion!) by cloneofsimo
Pretty sure 'novel object' means an image that is the combination of multiple objects. So, for instance: dog + coffee_pot = a dog with some characteristics of a coffee pot (in the image examples the head was sort of coffee-pot-like); rabbit + tiger = a rabbit with tiger characteristics; rabbit + sheep = a rabbit with sheep characteristics (the example showed a rabbit with a wool-like texture as opposed to a rabbit-fur texture).
LetterRip t1_iuxas7h wrote
Reply to comment by starstruckmon in [P] Implementation of MagicMix from ByteDance researchers, - New way to interpolate concepts with much more natural, geometric coherency (implemented with Stable Diffusion!) by cloneofsimo
It is prompt editing + prompt interpolation. So N steps of A, M steps of A transitioning to B, and then the remaining steps at B.
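A rough sketch of that schedule (hypothetical names, not the MagicMix code itself):

```python
import torch

def mixed_prompt_embedding(step, emb_a, emb_b, n_hold=10, m_blend=20):
    # steps [0, n_hold): pure prompt A
    if step < n_hold:
        return emb_a
    # steps [n_hold, n_hold + m_blend): linearly interpolate A -> B
    if step < n_hold + m_blend:
        t = (step - n_hold) / m_blend
        return (1 - t) * emb_a + t * emb_b
    # remaining steps: pure prompt B
    return emb_b

# dummy embeddings shaped like CLIP text features (batch, tokens, dim)
emb_a, emb_b = torch.randn(1, 77, 768), torch.randn(1, 77, 768)
conds = [mixed_prompt_embedding(s, emb_a, emb_b) for s in range(50)]
```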
LetterRip t1_iusmrac wrote
Reply to [R] Is there any work being done on reduction of training weight vector size but not reducing computational overhead (eg pruning)? by Moose_a_Lini
With bitsandbytes LLM.int8() you can quantize most weights in large models, keep a small subset in full precision, and get equivalent output. You could then also use a lookup table to further compress the weights.
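As a toy illustration of the lookup-table idea (this isn't bitsandbytes itself, just the general codebook approach): map each weight to the nearest of 256 representative values and store only the uint8 indices plus the table.

```python
import torch

def lut_compress(weights: torch.Tensor, n_levels: int = 256):
    # Use evenly spaced quantiles of the weight distribution as the table.
    qs = torch.linspace(0, 1, n_levels)
    table = torch.quantile(weights.flatten().float(), qs)
    # Each weight becomes a uint8 index into the 256-entry table.
    idx = torch.bucketize(weights.flatten(), table).clamp_(max=n_levels - 1).to(torch.uint8)
    return idx.reshape(weights.shape), table

def lut_decompress(idx: torch.Tensor, table: torch.Tensor):
    return table[idx.long()]

w = torch.randn(1024, 1024)
idx, table = lut_compress(w)
w_hat = lut_decompress(idx, table)
print((w - w_hat).abs().mean())  # small reconstruction error at 4x less storage than fp32
```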
LetterRip t1_iusm58q wrote
Reply to comment by polandtown in [R] Is there any work being done on reduction of training weight vector size but not reducing computational overhead (eg pruning)? by Moose_a_Lini
Pseudo-RNGs produce deterministic results from a given seed, so they aren't truly random, but they have a statistical distribution matching true randomness.
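Quick illustration:

```python
import random

# The same seed reproduces the same "random" sequence,
# yet the values still look statistically random.
random.seed(42)
a = [random.random() for _ in range(3)]
random.seed(42)
b = [random.random() for _ in range(3)]
print(a == b)  # True: deterministic given the seed
```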
LetterRip t1_iu9be41 wrote
Have you tried it with diffusers/stable diffusion?
LetterRip t1_itiqjqp wrote
Reply to comment by sharky6000 in [D] Building the Future of TensorFlow by eparlan
Everyone downloads PyTorch directly from the PyTorch site, so it is somewhat misleading.
LetterRip t1_itchnjl wrote
Reply to comment by cygn in [R] Scaling Instruction-Finetuned Language Models - Flan-PaLM- Google 2022 - 75.2% on five-shot MMLU / Forecasters expected this SOTA would need until 2024! - Public checkpoints! by Singularian2501
I assume you mean 24GB of VRAM? DeepSpeed, with enough CPU RAM and mapping to the hard drive as needed, might let you run it. Note that 540B parameters is more than 2 TB in float32. Even going 8-bit, you are looking at roughly 540 GB. Consumer hardware RAM typically maxes out at 128 GB, so the vast majority of it is going to have to be mapped to the hard drive. The size can probably be reduced a lot using both quantization and compression, but you will either have to do the work yourself or wait until someone else does.
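Back-of-envelope arithmetic for those memory figures:

```python
# Memory needed just to hold 540B parameters at different precisions.
params = 540e9
for name, bytes_per in [("float32", 4), ("float16", 2), ("int8", 1)]:
    print(f"{name}: {params * bytes_per / 1e12:.2f} TB")
# float32: 2.16 TB, float16: 1.08 TB, int8: 0.54 TB
```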
LetterRip t1_is5y8z0 wrote
Reply to comment by ZY0M4 in [P] Pure C/C++ port of OpenAI's Whisper by ggerganov
There is a CPU implementation of HIP that runs unmodified HIP code.
LetterRip t1_iryz0a5 wrote
Reply to comment by MohamedRashad in [D] Reversing Image-to-text models to get the prompt by MohamedRashad
Unless an image was generated by a specific seed and denoiser, you likely can't actually find a prompt that will generate it, since there isn't a 1-to-1 mapping. You can only find 'close' images.
LetterRip t1_irt4luw wrote
Reply to [P] Pure C/C++ port of OpenAI's Whisper by ggerganov
You might check DeepSpeed MII, Facebook AITemplate, and Google XNNPACK and see how their CPU conversions compare.
https://github.com/facebookincubator/AITemplate
https://github.com/microsoft/DeepSpeed-MII
https://github.com/google/XNNPACK
LetterRip t1_izlburh wrote
Reply to comment by hentieDesu in [P] Using LoRA to efficiently fine-tune diffusion models. Output model less than 4MB, two times faster to train, with better performance. (Again, with Stable Diffusion) by cloneofsimo
Yes, you can. I haven't gotten great results yet, but I hadn't done a custom model before this.