LetterRip
LetterRip t1_izksf4k wrote
Reply to comment by Teotz in [P] Using LoRA to efficiently fine-tune diffusion models. Output model less than 4MB, two times faster to train, with better performance. (Again, with Stable Diffusion) by cloneofsimo
It is working, but I need to use prior preservation loss; otherwise the concept bleeds into all of the other words in the phrase. So I'm generating photos for the preservation loss now.
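For anyone unfamiliar, prior preservation trains on generic "class" images alongside the instance images so the base concept doesn't drift. A minimal sketch of the loss (hypothetical variable names, not the exact training script):

```python
import torch
import torch.nn.functional as F

def dreambooth_loss(noise_pred, noise_target, prior_loss_weight=1.0):
    # First half of the batch = instance images of the new concept,
    # second half = generic class images used for prior preservation.
    pred_instance, pred_class = noise_pred.chunk(2, dim=0)
    tgt_instance, tgt_class = noise_target.chunk(2, dim=0)

    instance_loss = F.mse_loss(pred_instance, tgt_instance)
    prior_loss = F.mse_loss(pred_class, tgt_class)  # keeps the base concept from drifting
    return instance_loss + prior_loss_weight * prior_loss

# toy example with dummy noise predictions/targets
pred = torch.randn(8, 4, 64, 64)
target = torch.randn(8, 4, 64, 64)
print(dreambooth_loss(pred, target))
```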
LetterRip t1_izdm55i wrote
Reply to comment by cloneofsimo in [P] Using LoRA to efficiently fine-tune diffusion models. Output model less than 4MB, two times faster to train, with better performance. (Again, with Stable Diffusion) by cloneofsimo
> Glad it worked for you with such small memory constraints!
Currently training at image size 768 with accumulation steps = 2.
If steps is set to 2000, will it run to 4000? It didn't stop at 2000 as expected and is currently over 3500; I figured I'd wait until it passes 4000 before killing it, in case the accumulation steps act as a multiplier. (It went to 3718 and quit, right after I wrote the above.)
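For what it's worth, the ambiguity seems to be whether the counter tracks micro-batches or optimizer steps. A toy sketch of the two interpretations (not the actual script):

```python
import torch

# Toy stand-in for the real training loop; this just illustrates how the
# step counting interacts with gradient accumulation.
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

accumulation_steps = 2
max_train_steps = 2000
micro_batches_seen = 0
optimizer_steps = 0

while optimizer_steps < max_train_steps:
    x, y = torch.randn(8, 4), torch.randn(8, 1)
    loss = torch.nn.functional.mse_loss(model(x), y) / accumulation_steps
    loss.backward()
    micro_batches_seen += 1
    if micro_batches_seen % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
        optimizer_steps += 1

# If the progress counter tracks micro_batches_seen, 2000 optimizer steps show
# up as ~4000 "steps"; if it tracks optimizer_steps, it stops at 2000.
```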
LetterRip t1_izdam40 wrote
Reply to [P] Using LoRA to efficiently fine-tune diffusion models. Output model less than 4MB, two times faster to train, with better performance. (Again, with Stable Diffusion) by cloneofsimo
Just tried this and it ran great on a 6GB VRAM card on a laptop with only 16GB of RAM (it barely fit into VRAM - using bitsandbytes and xformers, I think). I've only tried the corgi example, but it seemed to work fine. Trying it with a person now.
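If it helps anyone trying to squeeze into similar VRAM, the memory savers I believe were in play look roughly like this (a sketch assuming the diffusers and bitsandbytes APIs and an example SD 1.5 checkpoint; check the actual training script for the exact flags):

```python
import bitsandbytes as bnb
from diffusers import UNet2DConditionModel

# Example checkpoint; substitute whatever base model you are fine-tuning.
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)
unet.enable_xformers_memory_efficient_attention()  # cuts attention memory (needs xformers installed)
unet.enable_gradient_checkpointing()               # trades compute for VRAM

# 8-bit Adam keeps optimizer state in int8 instead of fp32
optimizer = bnb.optim.AdamW8bit(unet.parameters(), lr=1e-4)
```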
LetterRip t1_izb2gbt wrote
Reply to [D] Can AI Music Tools Compete with Artists? by Oblipher
For classical music, twosetviolin couldn't tell the difference between AI and human composers.
LetterRip t1_iz20yak wrote
Reply to [D] Is there an affordable way to host a diffusers Stable Diffusion model publicly on the Internet for "real-time"-inference? (CPU or Serverless GPU?) by OkOkPlayer
You might consider uploading it to Civitai rather than self-hosting; then people can download it and/or make it available via a number of free and paid services.
LetterRip t1_ix0zyfv wrote
Reply to [D] BERT related questions by Devinco001
What length of texts? Sentence? Paragraph? Page? Multiple pages? Books?
A sentence might average 10 tokens, a page 750 tokens, a book 225,000 tokens. So 25 million to 562.5 billion tokens.
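The arithmetic behind that range, assuming a corpus on the order of 2.5 million texts (which is what those totals imply):

```python
# Back-of-envelope token counts; adjust num_texts to your actual corpus size.
num_texts = 2_500_000
tokens_per = {"sentence": 10, "page": 750, "book": 225_000}

for kind, toks in tokens_per.items():
    print(f"{kind}: ~{num_texts * toks:,} tokens")
# sentence: ~25,000,000 tokens
# page:     ~1,875,000,000 tokens
# book:     ~562,500,000,000 tokens
```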
LetterRip t1_iwrvtt5 wrote
Reply to [D] If I bought a copy of tv series on Youtube (or other platforms), can I use them for training a model? by DarrenTitor
It may or may not be fair use. Academic usage is a fair use defense, but it will depend on the specific nature of the usage. What will the trained model be used for? Also, is the result transformative? Short version: talk to a lawyer.
Also, different countries have different copyright laws, so it could be much different if you are not in the US.
LetterRip t1_iw3rucf wrote
Reply to [D]We just release a complete open-source solution for accelerating Stable Diffusion pretraining and fine-tuning! by HPCAI-Tech
Could you provide details on the comparison with DeepSpeed? What parameters were used, etc.?
Also, does it provide any benefit for single-GPU inference?
LetterRip t1_ivk4gfy wrote
Reply to comment by abstractcontrol in [Project] Rebel Poker AI by Character_Bluejay601
> The academic SOTA is to just stick a tabular algorithm on top of some deep net, which is hardly elegant. All these algorithms are just hacks and I wouldn't use them for real money play.
They absolutely crush the best players in the game, and beat weaker players by absurd amounts.
While there is a huge action space, it turns out that very few bet sizes are needed on the early streets (4 is generally adequate), and the final street can be solved on the fly.
LetterRip t1_iuxy54g wrote
Reply to comment by meldiwin in [P] Implementation of MagicMix from ByteDance researchers, - New way to interpolate concepts with much more natural, geometric coherency (implemented with Stable Diffusion!) by cloneofsimo
Pretty sure 'novel object' means an image that is the combination of multiple objects. So, for instance: dog + coffee_pot = a dog with some characteristics of a coffee pot (in the image examples the head was sort of coffee-pot-like); rabbit + tiger = a rabbit with tiger characteristics; rabbit + sheep = a rabbit with sheep characteristics (the example showed a rabbit with a wool-like texture as opposed to a rabbit-fur texture).
LetterRip t1_iuxas7h wrote
Reply to comment by starstruckmon in [P] Implementation of MagicMix from ByteDance researchers, - New way to interpolate concepts with much more natural, geometric coherency (implemented with Stable Diffusion!) by cloneofsimo
It is prompt editing + prompt interpolation. So N steps of A, M steps of A transitioning to B, and then the remaining steps at B.
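A rough sketch of that schedule (hypothetical names, not the MagicMix code itself):

```python
import torch

def mixed_prompt_embedding(step, emb_a, emb_b, n_hold=10, m_blend=20):
    # steps [0, n_hold): pure prompt A
    if step < n_hold:
        return emb_a
    # steps [n_hold, n_hold + m_blend): linearly interpolate A -> B
    if step < n_hold + m_blend:
        t = (step - n_hold) / m_blend
        return (1 - t) * emb_a + t * emb_b
    # remaining steps: pure prompt B
    return emb_b

# dummy embeddings shaped like CLIP text features (batch, tokens, dim)
emb_a, emb_b = torch.randn(1, 77, 768), torch.randn(1, 77, 768)
conds = [mixed_prompt_embedding(s, emb_a, emb_b) for s in range(50)]
```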
LetterRip t1_iusmrac wrote
Reply to [R] Is there any work being done on reduction of training weight vector size but not reducing computational overhead (eg pruning)? by Moose_a_Lini
With bitsandbytes LLM.int8() you can quantize most weights in large models, keep a small subset in full precision, and get equivalent output. You could then also use a lookup table to further compress the weights.
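As a toy illustration of the lookup-table idea (this isn't bitsandbytes itself, just the general codebook approach): map each weight to the nearest of 256 representative values and store only the uint8 indices plus the table.

```python
import torch

def lut_compress(weights: torch.Tensor, n_levels: int = 256):
    # Use evenly spaced quantiles of the weight distribution as the table.
    qs = torch.linspace(0, 1, n_levels)
    table = torch.quantile(weights.flatten().float(), qs)
    # Each weight becomes a uint8 index into the 256-entry table.
    idx = torch.bucketize(weights.flatten(), table).clamp_(max=n_levels - 1).to(torch.uint8)
    return idx.reshape(weights.shape), table

def lut_decompress(idx: torch.Tensor, table: torch.Tensor):
    return table[idx.long()]

w = torch.randn(1024, 1024)
idx, table = lut_compress(w)
w_hat = lut_decompress(idx, table)
print((w - w_hat).abs().mean())  # small reconstruction error at 4x less storage than fp32
```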
LetterRip t1_iusm58q wrote
Reply to comment by polandtown in [R] Is there any work being done on reduction of training weight vector size but not reducing computational overhead (eg pruning)? by Moose_a_Lini
Pseudo-RNGs produce deterministic results from a given seed, so they aren't truly random, but they have a statistical distribution matching true randomness.
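Quick illustration:

```python
import random

# The same seed reproduces the same "random" sequence,
# yet the values still look statistically random.
random.seed(42)
a = [random.random() for _ in range(3)]
random.seed(42)
b = [random.random() for _ in range(3)]
print(a == b)  # True: deterministic given the seed
```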
LetterRip t1_iu9be41 wrote
Have you tried it with diffusers/stable diffusion?
LetterRip t1_itiqjqp wrote
Reply to comment by sharky6000 in [D] Building the Future of TensorFlow by eparlan
Everyone downloads PyTorch directly from the PyTorch site, so it is somewhat misleading.
LetterRip t1_itchnjl wrote
Reply to comment by cygn in [R] Scaling Instruction-Finetuned Language Models - Flan-PaLM- Google 2022 - 75.2% on five-shot MMLU / Forecasters expected this SOTA would need until 2024! - Public checkpoints! by Singularian2501
I assume you mean 24GB of VRAM? DeepSpeed, with enough CPU RAM and mapping to the hard drive as needed, might let you run it. Note that 540B parameters is more than 2 TB in float32. Even going 8-bit, you are looking at roughly 540 GB. Consumer hardware RAM typically maxes out at 128 GB, so the vast majority of it is going to have to be mapped to the hard drive. The size can probably be reduced a lot using both quantization and compression, but you will either have to do the work yourself or wait until someone else does.
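Back-of-envelope arithmetic for those memory figures:

```python
# Memory needed just to hold 540B parameters at different precisions.
params = 540e9
for name, bytes_per in [("float32", 4), ("float16", 2), ("int8", 1)]:
    print(f"{name}: {params * bytes_per / 1e12:.2f} TB")
# float32: 2.16 TB, float16: 1.08 TB, int8: 0.54 TB
```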
LetterRip t1_is5y8z0 wrote
Reply to comment by ZY0M4 in [P] Pure C/C++ port of OpenAI's Whisper by ggerganov
There is a CPU implementation of HIP that runs unmodified HIP code.
LetterRip t1_iryz0a5 wrote
Reply to comment by MohamedRashad in [D] Reversing Image-to-text models to get the prompt by MohamedRashad
Unless an image was generated by a specific seed and denoiser, you likely can't actually find a prompt that will generate it, since there isn't a 1-to-1 mapping. You can only find 'close' images.
LetterRip t1_irt4luw wrote
Reply to [P] Pure C/C++ port of OpenAI's Whisper by ggerganov
You might check DeepSpeed MII, Facebook AITemplate, and Google XNNPACK and see how their CPU conversions compare.
https://github.com/facebookincubator/AITemplate
https://github.com/microsoft/DeepSpeed-MII
https://github.com/google/XNNPACK
LetterRip t1_izlburh wrote
Reply to comment by hentieDesu in [P] Using LoRA to efficiently fine-tune diffusion models. Output model less than 4MB, two times faster to train, with better performance. (Again, with Stable Diffusion) by cloneofsimo
Yes, you can. I haven't gotten great results yet, but I hadn't done a custom model before this.