Submitted by hx-zero t3_zl03b0 in MachineLearning
bacteriarealite t1_j04xqbi wrote
Reply to comment by ReginaldIII in [Project] Run and fine-tune BLOOM-176B at home using a peer-to-peer network by hx-zero
Blockchain technology would absolutely solve the issue of trusting your workers. Why else do people invest millions in mining rigs? Because of a system of decentralized trust built on the blockchain, where they won’t gain any benefit from trying to create fake/malicious blocks. It would both incentivize people to donate their resources and create a cryptographically secured system so you can trust the results you are getting. You don’t need everyone on chain to rerun the analysis, that’s just one form of validation. All you need is a system of nodes that trust other local nodes through periodic validation. It may require more resources, but you’ve solved both the issue of trust and the issue of incentivizing work, which will negate any increased burden that the periodic repeat validation requires.
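To put a rough number on "periodic validation", here is a minimal sketch of spot-checking. It assumes each chunk is independently re-run by a trusted checker with probability `audit_rate`, and that a re-run always exposes a manipulated result. Both are assumptions for illustration, and the function names are hypothetical, not part of any existing system.

```python
import random

def detection_probability(audit_rate: float, bad_chunks: int) -> float:
    """Chance that at least one of `bad_chunks` manipulated results gets re-checked,
    assuming each chunk is audited independently with probability `audit_rate`."""
    return 1.0 - (1.0 - audit_rate) ** bad_chunks

def simulate(audit_rate: float, bad_chunks: int, trials: int = 10_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity, as a sanity check on the formula."""
    rng = random.Random(seed)
    caught = 0
    for _ in range(trials):
        # The worker is caught if any one of its bad chunks happens to be audited.
        if any(rng.random() < audit_rate for _ in range(bad_chunks)):
            caught += 1
    return caught / trials
```

Under these assumptions, a worker that cheats on many chunks is very likely to be caught even at a low audit rate, while a worker that cheats once is caught only with probability `audit_rate`. The sketch says nothing about who is trusted to do the re-run.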
ReginaldIII t1_j050qmd wrote
Explain to me the mechanism by which you would encode the "correctness" of a result as a transaction or even smart contract on an idealized blockchain.
> Blockchain technology would absolutely accomplish the issue of trusting your workers. Why else do people invest millions in mining rigs? Because of a system of decentralized trust built on the blockchain, where they won’t gain any benefit from trying to create fake/malicious blocks.
We are talking about "trusting" fundamentally different things. A blockchain would be able to encode that at a specific point in time a worker going by some name returned something. It would be immutably stored in the blockchain, such that in the future we can look back and say "Yes, at that specific point in time a worker going by that name returned something".
And that tells us nothing about whether that worker returned the "correct" result, or a manipulated one.
I am talking about where the worker has returned the value that it proposes is the result and we care about having a mechanism to trust that the value itself is "correct" and therefore the worker has, at least this time, acted in a trustworthy fashion.
So if I am missing something, please, explain to me the mechanism by which you would encode the "correctness" of a set of activations and gradients for a chunk of work on a blockchain?
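As a toy illustration of the distinction above, here is a minimal sketch of a hash-chained, append-only log (not a real blockchain: no consensus, no proof of work, and timestamps are omitted to keep it deterministic). All names are hypothetical. It shows that the log can prove an entry was recorded and not altered afterwards, while accepting a manipulated result just as readily as an honest one.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def digest(values):
    """Hash of a worker's returned values (stand-in for activations/gradients)."""
    return hashlib.sha256(json.dumps(values).encode()).hexdigest()

def _entry_hash(worker, result_digest, prev):
    body = {"worker": worker, "result_digest": result_digest, "prev": prev}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

def append_entry(chain, worker, result_digest):
    """Record that `worker` returned something with this digest."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "worker": worker,
        "result_digest": result_digest,
        "prev": prev,
        "hash": _entry_hash(worker, result_digest, prev),
    })

def chain_is_intact(chain):
    """Check that no recorded entry was altered after the fact."""
    prev = GENESIS
    for entry in chain:
        expected = _entry_hash(entry["worker"], entry["result_digest"], entry["prev"])
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True

honest = [0.25, -1.5, 3.0]
manipulated = [0.25, -1.5, 300.0]  # same chunk of work, tampered value

chain = []
append_entry(chain, "worker-a", digest(honest))
append_entry(chain, "worker-b", digest(manipulated))
print(chain_is_intact(chain))  # the log is valid even though one result is wrong
```

Nothing in `chain_is_intact` looks at whether the values behind `result_digest` are the right answer for the chunk. It only verifies the record itself.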
bacteriarealite t1_j052zlu wrote
> And that tells us nothing about whether that worker returned the "correct" result, or a manipulated one.
It actually does. A worker that creates a fake block would need a consensus of the nodes on the chain to verify that block to get it added. That’s precisely how blockchain technology creates trust - it has nothing to do with the ledger being public, but is about having a consensus of nodes verifying the cryptographic signature on a block before adding it to the chain and then growing along that consensus chain so that eventually it’s computationally insurmountable to reverse the direction of the chain back to the fake block you are trying to create.
The most obvious solution with respect to our discussion here is requiring that every node validates the finding. It's easy to understand how that could create a trusted system, but obviously it’s useless. So alternatively you could make validators just validate that you're in a local minimum while the original validation tries to evaluate a more global feature space. Or an alternative option, like I said before, is to have more local validation sectors where you trust people in your local network because of confirmed results.
And I’m not trying to say I have the solution here, but I think it’s pretty obvious that blockchain technology solves these problems with just some tinkering around with the mechanisms of consensus and chain building.
ReginaldIII t1_j053c4i wrote
I earnestly believe it solves problems that contain similar words. But it just does not present a practical solution to this problem.
We can't put the returned values on the blockchain. It just isn't possible to store them: they are too big and too many. There is also no reason to store them; we only want to pass them on to the next worker or workers that immediately need them. We do care about fault tolerance to make sure they get to their destination.
So there's no way for this pool of blockchain nodes to form a consensus over the returned values being "correct" like this. We can't put the relevant information on the blockchain to allow it to be compared.
What you end up with is just a classic non-blockchain vote by agreement system between workers of unknown trustworthiness. No blockchain needed.
You are correct that voting by consensus is needed, you just don't need all the rest of the things that turn that into a blockchain.
[deleted] t1_j054kbq wrote
[deleted]
ReginaldIII t1_j054n38 wrote
Please read my updated comment.
> I think the use case here is pretty obvious
With the greatest of respect, I don't.
> and I tried to just give some basic examples but I’m certainly not an expert and have not been involved in the types of trouble shooting required to get something like this working.
Also with the greatest of respect, I am an expert in this area, and have also worked with blockchains extensively.
I do not think blockchain is a "stream of buzzwords". I think it is the wrong tool to solve "this" problem.
kaibee t1_j05omff wrote
> I earnestly believe it solves problems that contain similar words. But it just does not present a practical solution to this problem.
>
> We can't put the returned values on the blockchain. It just isn't possible to store them: they are too big and too many. There is also no reason to store them; we only want to pass them on to the next worker or workers that immediately need them. We do care about fault tolerance to make sure they get to their destination.
>
> So there's no way for this pool of blockchain nodes to form a consensus over the returned values being "correct" like this. We can't put the relevant information on the blockchain to allow it to be compared.
>
> What you end up with is just a classic non-blockchain vote by agreement system between workers of unknown trustworthiness. No blockchain needed.
>
> You are correct that voting by consensus is needed, you just don't need all the rest of the things that turn that into a blockchain.
This is basically solving a very similar problem. https://rendertoken.com/#intro
ReginaldIII t1_j06mqeo wrote
In rendertoken's scenario we don't have a requirement on high throughput of one job feeding into another.
The individual units of work are expensive and long lived. Rendering a frame of a film takes roughly the same amount of time it did a few years ago, we just get higher fidelity output for that same render budget. All the frames can be processed lazily by the compute farm, and the results just go into a pool for later collection.
Because the collation of the results happens in a more offline fashion from the actual computation, you have time and resources to encode the results on a blockchain. Auditing that your requested work was processed is a desirable quality, and so a blockchain does provide a benefit.
In the case of distributed model training the scenario is different. We have high throughput of comparatively small chunks of work. Other than passing the results to the next immediate worker to do the next part of the computation, we have no desire (or storage capacity) to keep any of the intermediate results. Because we have high throughput of many small chunks, a blockchain encoding these chunks would need a small proof of work, and so would not be a reliable source of truth anyway.
Then consider that we don't even care about having an audit trail to prove historical chunks really were processed when we think they were. We only care about checking results are valid on the fly as we are doing the compute.
We just need a vote by agreement on the immediate results so they can be handed off to the next workers. Yes blockchains often have a vote by agreement part to how they decide what the actual state of the blockchain is, but we just need that part. We don't actually need the blockchain itself.
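A minimal sketch of what such a vote by agreement could look like, with no chain involved. It assumes the same chunk is sent to several workers, that honest results match up to a small floating-point tolerance, and that worker identities are distinct; the function names, `quorum`, and `tol` are hypothetical choices for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

def close(a: Sequence[float], b: Sequence[float], tol: float) -> bool:
    """Element-wise comparison; honest workers may differ by rounding error."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))

def vote_by_agreement(
    results: Dict[str, Sequence[float]],
    quorum: int,
    tol: float = 1e-6,
) -> Tuple[Optional[Sequence[float]], List[str]]:
    """Accept the first result that at least `quorum` workers agree on.

    Returns (accepted_result, agreeing_workers), or (None, []) if no result
    reaches the quorum, in which case the chunk would need to be re-issued.
    """
    for candidate in results.values():
        supporters = [w for w, r in results.items() if close(candidate, r, tol)]
        if len(supporters) >= quorum:
            return candidate, supporters
    return None, []

# Three workers process the same chunk; one returns a manipulated value.
results = {
    "w1": [1.0, 2.0],
    "w2": [1.0, 2.0 + 1e-9],
    "w3": [1.0, 99.0],
}
accepted, supporters = vote_by_agreement(results, quorum=2)
```

The accepted value is handed to the next worker and nothing is stored. The cost is the redundant compute, and the sketch does nothing against one party controlling a majority of the workers assigned to a chunk.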
bacteriarealite t1_j0561xo wrote
You don’t need to put any model information on the blockchain. The goal of the blockchain is that it’s creating a network of trust tied to computational work. All we want from the blockchain is to be able to say we trust this node and it’s providing work. There are many ways you can then go about setting that up and debate the nuanced details of what will work best. But the utility of blockchain is pretty simple - we want decentralized work, we want decentralized trust. Blockchain is the only technology that does that. In fact blockchain is really just a synonym for those two things. So on your first post when you questioned how we would be able to trust this decentralized work, the answer to that is simple - blockchain. The details past that have million-dollar answers but the underlying principle is pretty straightforward.
ReginaldIII t1_j056gyk wrote
Okay. We are going in circles now, and I've responded to these points at length.
The burden on you is to now flesh this idea out and show it can work in practice for this problem.
I will not be pursuing this avenue.
Good evening.
bacteriarealite t1_j057m7a wrote
The burden on me was just to point out that when someone is looking for a way to create decentralized trust in a system of decentralized work, the solution is blockchain. My point in the first comment wasn’t to hash out all the nuanced details of how that would work in practice, I was just pointing out that if you're looking to create trust in a decentralized system then the best (and honestly only) way to do that is blockchain.