YOLOBOT666 t1_j7wrm1z wrote on February 9, 2023 at 11:14 PM

Reply to [D] RTX 3090 with i7 7700k, training bottleneck by Available_Lion_652

What about saving the dataset to disk as pre-built batches, one file per batch, and then using the data loader to load those files as batches for transformers? Keep the batch size reasonable for the GPU memory.

Any preprocessing/scaling could be done on the CPU side and would not consume much memory.
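
A minimal sketch of that idea, using only the Python standard library so it runs anywhere. The helper names (`save_batches`, `scale`, `iter_batches`), the pickle file format, and the min-max scaling are all assumptions made for illustration, not something from the original thread. With PyTorch, the same generator could be wrapped in a dataset and passed to a `DataLoader` with `batch_size=None`, which disables automatic batching so each file is used as one ready-made batch.

```python
import pickle
import tempfile
from pathlib import Path


def save_batches(samples, batch_size, out_dir):
    """Write `samples` to disk as one pickle file per batch; return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for start in range(0, len(samples), batch_size):
        path = out_dir / f"batch_{start // batch_size:05d}.pkl"
        with open(path, "wb") as f:
            pickle.dump(samples[start:start + batch_size], f)
        paths.append(path)
    return paths


def scale(batch, lo, hi):
    """CPU-side min-max scaling of one batch (hypothetical preprocessing step)."""
    span = (hi - lo) or 1.0
    return [[(x - lo) / span for x in row] for row in batch]


def iter_batches(paths, lo=0.0, hi=1.0):
    """Lazily load one batch file at a time, so only one batch sits in RAM."""
    for path in paths:
        with open(path, "rb") as f:
            batch = pickle.load(f)
        yield scale(batch, lo, hi)


# Toy data: 10 samples with 3 features each.
samples = [[float(i), float(i + 1), float(i + 2)] for i in range(10)]
paths = save_batches(samples, batch_size=4, out_dir=tempfile.mkdtemp())

# Materialised here only to inspect the result; a training loop would
# iterate over iter_batches(...) and move each batch to the GPU in turn.
batches = list(iter_batches(paths, lo=0.0, hi=11.0))
```

Because the generator reads one file per step, host memory use is bounded by a single batch, and the batch size chosen at save time is what has to fit in GPU memory.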

YOLOBOT666 t1_j7iov1k wrote on February 7, 2023 at 1:54 AM

Reply to comment by mostlyhydrogen in [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen

Nice! I guess the heuristic part is how you use the queries at every iteration and make them “usable” in your iterative approach. What’s the size and dimension of your dataset? These graph-based ANNs are memory intensive, so I’m wondering what you can do at your dimensions.

If it’s a public repo, or you’re planning to release it on GitHub, I’d be happy to join!

YOLOBOT666 t1_j75i86x wrote on February 4, 2023 at 5:34 AM

Reply to comment by mostlyhydrogen in [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen

Out of curiosity, what are you trying to achieve? As in, when is the iterative process going to stop, and what would the heuristics be? I would appreciate it if you could share some papers on this!

YOLOBOT666 t1_j72cncj wrote on February 3, 2023 at 4:01 PM

Reply to [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen

Iterative as in continuing until there are no more neighbours left, as you continuously add neighbours to your index and query?
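
One possible reading of that question, as a rough sketch: start from some seed queries, add every newly found neighbour to the query set, and stop once an iteration finds nothing new. The original poster's actual procedure is not given here, so the function name `iterative_neighbours`, the brute-force search, and the `k` and `radius` cut-offs are all assumptions. The `radius` threshold is an assumed stopping heuristic; without some cut-off like it, the expansion tends to keep spreading through the dataset.

```python
import math


def iterative_neighbours(points, seeds, k=2, radius=1.5):
    """Expand a query set until an iteration finds no new neighbours.

    points: list of coordinate tuples (the "index", searched by brute force)
    seeds:  indices of the initial query vectors
    """
    found = set(seeds)
    frontier = set(seeds)
    while frontier:
        new = set()
        for q in frontier:
            # Distances from query q to every other point, nearest first.
            dists = sorted(
                (math.dist(points[q], points[j]), j)
                for j in range(len(points)) if j != q
            )
            # Keep the k nearest that are close enough and not yet seen.
            for d, j in dists[:k]:
                if d <= radius and j not in found:
                    new.add(j)
        found |= new
        frontier = new  # an empty frontier ends the loop
    return found


# Two well-separated clusters; expansion from point 0 stays in its cluster.
points = [(0, 0), (1, 0), (2, 0), (10, 10), (11, 10)]
result = iterative_neighbours(points, seeds=[0])
```

In a real setup the brute-force inner loop would be replaced by a call into an ANN index, but the stopping condition (no new neighbours in the last round) would stay the same.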

YOLOBOT666 t1_j6ziz4m wrote on February 3, 2023 at 12:06 AM

Reply to comment by fuscarili in [D] I'm at a crossroads: Bayesian methods VS Reinforcement Learning, which to choose? by fuscarili

Yeah, this would be a course in RL, most likely using the RL bible as the main reference textbook. I agree with the other comment: these lectures are all available online.

What I found most valuable about attending a course in person was the prof: the insights and intuitions explained in class and in office hours. While I was taking the RL course in person, I also referenced online lectures and notes.

In terms of data science interviews and jobs, Bayesian methods would be more useful than RL, unless you find yourself in robotics or some very niche industry.

YOLOBOT666 t1_j6z5gjn wrote on February 2, 2023 at 10:32 PM

Reply to [D] I'm at a crossroads: Bayesian methods VS Reinforcement Learning, which to choose? by fuscarili

Depends on the RL course content. If it’s just following along with the RL bible, then you could do it yourself. Check out the syllabus/slides from previous years to get an idea. The assignments/projects are where you learn the most IMO, especially for RL.