LetterRip

LetterRip t1_izdm55i wrote

> Glad it worked for you with such small memory constraints!

Currently training at image size 768, with accumulation steps = 2.

If steps is set to 2000, will it actually run to 4000? It didn't stop at 2000 as expected and is currently over 3500; I figured I'd wait until it passed 4000 to kill it, in case the accumulation steps act as a multiplier on the step count. (It went to 3718 and quit, right after I wrote the above.)
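For reference, a minimal sketch of how gradient accumulation usually interacts with the step count in a training loop. The names (accumulation_steps, max_train_steps) and the counting convention are illustrative assumptions, not taken from the actual script; some trainers count optimizer steps, others count micro-batches, which is exactly the ambiguity above.

```python
import torch

# Toy sketch of gradient accumulation; names and counting convention are
# illustrative assumptions, not the actual script's.
accumulation_steps = 2
max_train_steps = 5          # small number just for the demo

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

optimizer_steps = 0
micro_batches = 0

while optimizer_steps < max_train_steps:
    x = torch.randn(8, 4)                          # fake batch
    loss = model(x).pow(2).mean() / accumulation_steps
    loss.backward()                                # gradients accumulate
    micro_batches += 1

    if micro_batches % accumulation_steps == 0:
        optimizer.step()                           # one "real" update
        optimizer.zero_grad()
        optimizer_steps += 1

# Under this convention max_train_steps counts optimizer steps, so the loop
# processes max_train_steps * accumulation_steps micro-batches -- a progress
# counter that tracks micro-batches would read roughly double the setting.
print(micro_batches, optimizer_steps)
```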

2

LetterRip t1_izdam40 wrote

Just tried this and it ran great on a 6GB VRAM card on a laptop with only 16GB of RAM (it barely fit into VRAM - using bitsandbytes and xformers, I think). I've only tried the corgi example but it seemed to work fine. Trying it with a person now.
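For anyone trying to reproduce the low-VRAM setup, here is a minimal sketch of the two memory savers mentioned above, assuming a diffusers-style script; the model ID and learning rate are placeholder assumptions and the surrounding training loop is omitted.

```python
# Sketch of the two memory savers: 8-bit optimizer (bitsandbytes) +
# xformers attention. Assumes a diffusers-style script; model ID and
# hyperparameters here are placeholders.
import bitsandbytes as bnb
from diffusers import UNet2DConditionModel

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)

# xformers memory-efficient attention cuts activation memory substantially.
unet.enable_xformers_memory_efficient_attention()

# 8-bit AdamW keeps optimizer state in int8, shrinking it by roughly 4x.
optimizer = bnb.optim.AdamW8bit(unet.parameters(), lr=5e-6)
```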

9

LetterRip t1_ix0zyfv wrote

what length of texts? sentence? paragraph? page? multiple pages? books?

A sentence might average 10 tokens, a page 750 tokens, a book 225,000 tokens. So for 2.5 million texts, anywhere from 25 million to 562.5 billion tokens.
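A quick back-of-the-envelope sketch of that range; the 2.5 million text count is inferred from the 25 million / 562.5 billion figures, and the per-unit token counts are the rough averages quoted above.

```python
# Back-of-the-envelope token estimates; text count inferred from the
# figures above, per-unit token counts are rough averages.
num_texts = 2_500_000

tokens_per = {
    "sentence": 10,
    "page": 750,
    "book": 225_000,
}

for unit, tokens in tokens_per.items():
    total = num_texts * tokens
    print(f"{num_texts:,} {unit}s -> {total:,} tokens")
# 2,500,000 sentences -> 25,000,000 tokens
# 2,500,000 pages     -> 1,875,000,000 tokens
# 2,500,000 books     -> 562,500,000,000 tokens
```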

2

LetterRip t1_iwrvtt5 wrote

It may or may not be fair use. Academic usage is a fair use defense, but it will depend on the specific nature of the usage. What will the trained model be used for? Also, is the result transformative? Short version: talk to a lawyer.

Also different countries have different copyright laws, so it could be much different if you are not in the US.

1

LetterRip t1_ivk4gfy wrote

> The academic SOTA is to just stick a tabular algorithm on top of some deep net, which is hardly elegant. All these algorithms are just hacks and I wouldn't use them for real money play.

They absolutely crush the best players in the game, and beat anyone below the very best by absurd margins.

While the action space is huge, it turns out that very few bet sizes are needed on the early streets (4 is generally adequate), and the final street can be solved on the fly.
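As a rough illustration of that kind of action abstraction, here is a sketch that collapses the continuous bet space into a handful of pot-fraction sizes; the specific fractions are hypothetical, not taken from any particular solver.

```python
# Illustrative action abstraction for early streets: the solver only
# considers a few pot-fraction bet sizes instead of every legal amount.
# The particular fractions are hypothetical, chosen only to show the idea.
EARLY_STREET_BET_FRACTIONS = [0.33, 0.66, 1.0, 2.0]  # fractions of the pot

def abstract_actions(pot: float, stack: float) -> list[float]:
    """Return the small set of bet sizes the solver actually considers."""
    sizes = [round(pot * f, 2) for f in EARLY_STREET_BET_FRACTIONS]
    # All-in is usually kept as an extra action; cap every bet at the stack.
    return sorted({min(s, stack) for s in sizes} | {stack})

print(abstract_actions(pot=100.0, stack=1000.0))
# e.g. [33.0, 66.0, 100.0, 200.0, 1000.0]
```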

5

LetterRip t1_iuxy54g wrote

Pretty sure 'novel object' means an image that is a combination of multiple objects, so for instance - dog + coffee_pot = dog with some characteristics of a coffee pot (in the image examples the head was sort of coffee-pot-like). rabbit + tiger = rabbit with tiger characteristics. rabbit + sheep = rabbit with sheep characteristics (the example showed a rabbit with a wool-like texture as opposed to a rabbit fur texture).

1

LetterRip t1_itchnjl wrote

I assume you mean 24GB of VRAM? DeepSpeed with enough CPU RAM, mapping to the hard drive as needed, might let you run it. Note that 540B parameters is more than 2 TB in float32. Even going 8-bit, you are looking at over 500 GB. Consumer hardware RAM typically maxes out at 128 GB, so the vast majority of the model is going to have to be mapped to the hard drive. The size can probably be reduced a lot using both quantization and compression, but you will either have to do the work yourself or wait until someone else does.
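A quick sketch of that arithmetic, counting parameter storage only; activations, optimizer state, and framework overhead would add more on top.

```python
# Rough memory footprint of a 540B-parameter model, parameters only.
# Activations, optimizer state, and framework overhead are not counted.
params = 540e9

bytes_per_param = {
    "float32": 4,
    "float16": 2,
    "int8":    1,
}

for dtype, nbytes in bytes_per_param.items():
    gb = params * nbytes / 1e9
    print(f"{dtype}: {gb:,.0f} GB")
# float32: 2,160 GB  (more than 2 TB)
# float16: 1,080 GB
# int8:      540 GB
```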

5