w00t_loves_you t1_j03qps0 wrote on December 13, 2022 at 9:23 PM

Reply to comment by ReginaldIII in [Project] Run and fine-tune BLOOM-176B at home using a peer-to-peer network by hx-zero

Would it be possible to repeat the same training tasks on multiple workers and verify the workers against each other?

OTOH it's more work to create a malicious worker than creating a malicious free LM, no?

ReginaldIII t1_j03sbkj wrote on December 13, 2022 at 9:33 PM

> Would it be possible to repeat the same training tasks on multiple workers and verify the workers against each other?

That's what I meant here.

>> A nice benefit of building on Kafka is that multiple consumers looking at a queue can consume the same messages, so you can get voting by consensus on which results should be passed on.
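
A rough sketch of what that consensus vote could look like, in plain Python. This is only an illustration under assumptions: `digest` and `vote` are hypothetical names, no Kafka client is involved, and the `results` dict just stands in for whatever several redundant workers returned for the same task. It also assumes honest workers produce byte-identical output, which floating-point work on mixed hardware may not give you; a real system would probably compare within a tolerance instead of by hash.

```python
import hashlib
from collections import Counter


def digest(result: bytes) -> str:
    """Fingerprint a worker's result so results can be compared cheaply."""
    return hashlib.sha256(result).hexdigest()


def vote(results: dict[str, bytes], quorum: float = 0.5) -> tuple[bytes, list[str]]:
    """Pick the result that a strict majority of workers agree on.

    `results` maps a (hypothetical) worker id to the bytes it returned for
    the same task. Returns the accepted result and the sorted ids of the
    workers that disagreed with it. Raises if no result exceeds `quorum`.
    """
    if not results:
        raise ValueError("no results to vote on")

    digests = {worker: digest(payload) for worker, payload in results.items()}
    winner, count = Counter(digests.values()).most_common(1)[0]

    # Require strictly more than `quorum` of the workers to agree.
    if count / len(results) <= quorum:
        raise RuntimeError("no consensus among workers")

    accepted = next(p for w, p in results.items() if digests[w] == winner)
    dissenters = sorted(w for w, d in digests.items() if d != winner)
    return accepted, dissenters


if __name__ == "__main__":
    submitted = {"w1": b"grad-1", "w2": b"grad-1", "w3": b"tampered"}
    accepted, dissenters = vote(submitted)
    print(accepted, dissenters)
```

Only the agreed-upon result gets passed on, and the dissenting worker ids could be fed into whatever reputation or ban logic the network uses. The cost is that every task is computed several times.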

> OTOH it's more work to create a malicious worker than creating a malicious free LM, no?

Different types of malicious. A malicious worker could leak the data it's been passed to someone else, or it could work to destabilize the training, limiting final accuracy or causing overfitting.

If you are a company brokering access to privately trained LLMs and you have the opportunity to prevent a crowdsourced LLM from reaching the same quality as your own, there could be an incentive to harm that effort. Corporate espionage is a thing.

There are plenty of ways in which a crowd-computing effort could be misused or attacked.