ReginaldIII t1_ir9tsc8 wrote
Reply to comment by master3243 in [R] Discovering Faster Matrix Multiplication Algorithms With Reinforcement Learning by EducationalCicada
You're correct, I haven't pointed out anything wrong with the paper conceptually. It appears to work. Their matmul results are legitimate and verifiable. Their JAX benchmarks do produce the expected results.
In exactly the same way AlphaZero and AlphaFold do demonstrably work well. But it's all a bit moot and useless when no one can take this seemingly powerful method and actually apply it.
If they had released the matmul code yesterday, people today would already be applying it to other problems and discussing it like we have done with StableDiffusion in recent weeks, but with a massively simplified pipeline for getting results, because there's no dataset dependency, only compute, which can be remedied with longer training times.
master3243 t1_iraj7rp wrote
But the paper was released literally yesterday?!
How did you already conclude that "no one can [...] actually apply it"
Nowhere else in science do we apply such scrutiny, and it's ridiculous to judge how useful a paper is without at least waiting 1-2 years to see what comes out of it.
ML is currently suffering from the fact that people expect each paper to be a huge leap on its own; that's not how science works or has ever worked. Science is a step-by-step process, and each paper is expected to be just a single step forward, not the entire mile.
ginger_beer_m t1_irb2xdn wrote
The paper was released yesterday, but they had months from the manuscript submission until reviewer acceptance to put up a usable GitHub repo. I guess they didn't bother because... DeepMind.
ReginaldIII t1_irakn7b wrote
> How did you already conclude that "no one can [...] actually apply it"
Because I read the paper and their supplementary docs and realized there's no way anyone could actually implement this given its current description.
> ML is currently suffering from the fact that people expect each paper to be a huge leap on its own,
I don't expect every paper to be a huge leap. I expect that when a peer-reviewed publication is publicly released in NATURE, it is replicable!
master3243 t1_iraqxox wrote
I will repeat the same sentiment, it was released yesterday.
> publicly released in NATURE that it is replicable
It is replicable, they literally have the code.
ReginaldIII t1_irav3ov wrote
So if the paper is ready to be made public, why not release the code publicly at the same time?
> It is replicable, they literally have the code.
Replicable by the people who have access to the code.
If you are ready to publish the method in Nature you can damn well release the code with it! Good grief, what the fuck are you even advocating for?
master3243 t1_irbx3a6 wrote
What???
I have no idea what you're talking about, their code and contribution is right here https://github.com/deepmind/alphatensor/blob/main/recombination/sota.py
Their contributions are lines 35, 80 88
[deleted] t1_irbz43l wrote
[removed]
master3243 t1_irc4wwf wrote
What are you talking about? They definitely don't need to release that (it would be nice but not required). By that metric almost ALL papers in ML fail to meet that standard. Even the papers that go above and beyond and RELEASE THE FULL MODEL don't meet your arbitrary standard.
Sure the full code would be nice, but ALL THEY NEED to show us is a PROVABLY CORRECT SOTA matrix multiplication which proves their claim.
Even the most advanced breakthrough in DL (in my opinion), which is AlphaFold, where we have the full model, doesn't meet your standard since (as far as I know) we don't have the code for training the model.
There are 4 levels of code release
Level 0: No code released
Level 1: Code for the output obtained (only applies to outputs that no human/machine can obtain such as protein folding on previously uncalculated patterns or matrix factorization or solutions to large NP problems that can't be solved using classical techniques)
Level 2: Full final model release
Level 3: Full training code / hyperparameters / everything
In the above scale, as long as a paper achieves Level 1 then it proves that the results are real and we don't need to take their word for it, thus it should be published.
If you want to talk about openness, then sure I would like Level 3 (or even 2).
But the claim that the results aren't replicable is rubbish, this is akin to a mathematician showing you the FULL, provably correct, matrix multiplication algorithm he came up with that beats the SOTA and you claim it's "not reproducible" because you want all the steps he took to reach that algorithm.
The steps taken to reach an algorithm are NOT required to show that an algorithm is provably correct and SOTA.
EDIT: I think you're failing to see the difference between this paper (and similarly AlphaFold) and papers that claim that they developed a new architecture or a new model that achieves SOTA on a dataset. Because in that case, I'd agree with you, showing us the results is NOT ENOUGH for me to believe that your algorithm/architecture/model actually does what you claim it does. But in this case, literally the result in itself (i.e. the matrix factorization) is enough for them to prove that claim since that kind of result is impossible to cheat. Imagine I release a groundbreaking paper that says I used deep learning to prove P≠NP and attached a PDF document that has a FULL PROOF that P≠NP (or any other unsolved problem) and it's 100% correct. Would I need to also release my model? Would I need to release the code I used to train the model? No! All I need to release for my publication would be the PDF that contains the theorem.
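To make the "result proves itself" point concrete, here is a minimal sketch of how a claimed matrix multiplication factorization can be checked without any access to the method that found it. It uses Strassen's classical rank-7 algorithm for 2x2 matrices, not one of the AlphaTensor factorizations, and the index convention (entry (i, j) of an n x n matrix maps to index i*n + j) and the helper names are assumptions made for this illustration, not the layout used in the DeepMind repo.

```python
from itertools import product


def matmul_tensor(n):
    """Build the n x n matrix multiplication tensor T.

    T[a][b][c] == 1 exactly when A-entry a times B-entry b contributes
    to C-entry c in C = A @ B (entries flattened as i * n + j).
    """
    size = n * n
    T = [[[0] * size for _ in range(size)] for _ in range(size)]
    for i, j, k in product(range(n), repeat=3):
        T[i * n + j][j * n + k][i * n + k] = 1
    return T


def reconstruct(U, V, W):
    """Sum the rank-one terms u_r (x) v_r (x) w_r into a full tensor."""
    size = len(U[0])
    return [
        [
            [sum(u[a] * v[b] * w[c] for u, v, w in zip(U, V, W))
             for c in range(size)]
            for b in range(size)
        ]
        for a in range(size)
    ]


def is_valid_factorization(U, V, W, n):
    """True if the factors reproduce the matmul tensor exactly."""
    return reconstruct(U, V, W) == matmul_tensor(n)


# Strassen's seven products M1..M7; columns are [X11, X12, X21, X22].
STRASSEN_U = [  # coefficients on the entries of A
    [1, 0, 0, 1], [0, 0, 1, 1], [1, 0, 0, 0], [0, 0, 0, 1],
    [1, 1, 0, 0], [-1, 0, 1, 0], [0, 1, 0, -1],
]
STRASSEN_V = [  # coefficients on the entries of B
    [1, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, -1], [-1, 0, 1, 0],
    [0, 0, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1],
]
STRASSEN_W = [  # how each product is added into the entries of C
    [1, 0, 0, 1], [0, 0, 1, -1], [0, 1, 0, 1], [1, 0, 1, 0],
    [-1, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0],
]

if __name__ == "__main__":
    print(is_valid_factorization(STRASSEN_U, STRASSEN_V, STRASSEN_W, 2))
```

The check is exact integer arithmetic, so a passing result shows that the seven products really do compute a 2x2 matrix product. It says nothing about how the factorization was discovered, and it does not by itself measure wall-clock speed on real hardware.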
[deleted] t1_irc5eys wrote
[removed]
master3243 t1_irc6to3 wrote
I literally cannot tell if you're joking or not!
If I release an algorithm that beats SOTA along with a full and complete proof would I also need to attach all my notes and different intuitions that made me take the decisions I took???????
I can 100% tell you've never worked on publishing improvements to algorithms or math proofs because NO ONE DOES THAT. All they need is 1-the theorem/algorithm and 2-Proof that it's correct/beats SOTA
ReginaldIII t1_irc7nt0 wrote
I'm done.
You only care about the contribution to matmul. Fine.
There's a much bigger contribution to RL being used to solve these types of problems (wider than just matmul). But fine.
Goodbye.
master3243 t1_irdoyzz wrote
> You only care about the contribution to matmul
False, which is why I said it would have been better if they released everything. I definitely personally care more about the model/code/training process than the matmul result.
However, people are not one-dimensional thinkers. I can simultaneously say that DeepMind should release all their resources AND at the same time say that this work is worthy of a Nature publication and isn't missing any critical requirements.
[deleted] t1_ire7cmq wrote
[removed]