avialex
avialex t1_janjx6r wrote
Reply to comment by mmmniple in [D] Are Genetic Algorithms Dead? by TobusFire
Appears to be here: https://openreview.net/forum?id=ibNr25jJrf
edit: actually after reading it, I don't think this is the referenced publication, but it's still interesting
avialex t1_j14p22o wrote
Reply to comment by sayoonarachu in [D] Running large language models on a home PC? by Zondartul
There's a VRAM leak in that code, btw. I haven't tracked it down yet, but it's easy to work around with a torch cache clear in the forward method.
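Roughly what I mean, as a minimal sketch (the wrapper class here is illustrative, not the actual code from that repo): clear the CUDA allocator cache on every forward pass so stale allocations don't pile up between generation steps.

```python
import torch

class CacheClearingWrapper(torch.nn.Module):
    """Illustrative wrapper: clears the CUDA allocator cache each forward pass."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, *args, **kwargs):
        # Release cached blocks before running the wrapped model so leaked
        # or cached allocations don't accumulate across repeated calls.
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
        return self.model(*args, **kwargs)
```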
avialex t1_iu6ajry wrote
Reply to [D] DL Practitioners, Do You Use Layer Visualization Tools s.a GradCam in Your Process? by DisWastingMyTime
I use fullgrad religiously, although I've removed the multiplication by the original image so that I'm just seeing the model gradients. I don't really use it to debug; it's more useful as an after-the-fact indication of what the important features in the data were. Every once in a while I'll notice a model is overly focused on corners or something else obviously wrong, which can be a sign of instability, but aside from that it's more of an explanatory tool than a debugging tool.
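Roughly what I mean by dropping the input multiplication, as a bare input-gradient saliency sketch in plain PyTorch (this is not the actual FullGrad implementation, which also aggregates per-layer bias gradients):

```python
import torch

def gradient_saliency(model, image, target_class):
    """Saliency from raw input gradients, skipping the usual input * gradient product."""
    model.eval()
    image = image.clone().requires_grad_(True)   # assumes shape (1, C, H, W)
    score = model(image)[0, target_class]
    score.backward()
    # Absolute gradient summed over channels: shows where the model is
    # sensitive, rather than which bright pixels it multiplies through.
    return image.grad.abs().sum(dim=1)
```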
avialex OP t1_irnpnry wrote
Reply to comment by anomalousraccoon in [D] Quantum ML promises massive capabilities, while also demanding enormous training compute. Will it ever be feasible to train fully quantum models? by avialex
Quantum NNs are quantum algorithms, are they not? Are you thinking of hybrid nets where only a few neurons are quantum?
edit: OK, I see: you're saying gradient descent is the problem, and we need a quantum algorithm to train QNNs. I would definitely agree, but as it stands I don't think one exists?
avialex OP t1_irnnm6a wrote
Reply to comment by pmirallesr in [D] Quantum ML promises massive capabilities, while also demanding enormous training compute. Will it ever be feasible to train fully quantum models? by avialex
They certainly are looking, but at the same time gradient calculation is fundamental to how quantum neural networks are implemented right now, and QNNs are a relatively active area of study. I don't think we can dismiss the work in the field as it stands, because it's all built on the foundation of gradient descent. AFAIK no one has yet found a better way to train a QNN, even on quantum data. I could be wrong.
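For anyone unfamiliar, "gradient calculation" here usually means something like the parameter-shift rule: evaluate the circuit at shifted parameter values and take half the difference of the measured expectations. A toy numpy sketch of the idea, with a cosine product standing in for a circuit's expectation value (a product of cosines obeys the same shift identity as single-qubit rotation gates):

```python
import numpy as np

def expectation(theta):
    # Toy stand-in for a quantum circuit's measured expectation value.
    return np.cos(theta).prod()

def parameter_shift_grad(theta, shift=np.pi / 2):
    """Gradient of expectation(theta) from two shifted evaluations per parameter."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        plus, minus = theta.copy(), theta.copy()
        plus[i] += shift
        minus[i] -= shift
        grad[i] = (expectation(plus) - expectation(minus)) / 2.0
    return grad

theta = np.array([0.3, 1.1])
print(parameter_shift_grad(theta))  # matches -sin(theta_i) * product of the other cosines
```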
avialex t1_ir41bgt wrote
Reply to [R] Self-Programming Artificial Intelligence Using Code-Generating Language Models by Ash3nBlue
"The model is queried to generate modifications of an initial source code snippet. In our experiments, this is a network with a single hidden layer of 16 neurons. The possible modifications include adding convolutional layers, changing the size of convolutional or hidden layers, and increasing the number of hidden layers."
Lmao...
And how about those smooth fitted curves on graphs with fewer than 10 sample points? That really inspires confidence.
avialex t1_jap04wq wrote
Reply to comment by filipposML in [D] Are Genetic Algorithms Dead? by TobusFire
I was kinda excited; I had hoped to find an evolutionary algorithm for searching a latent space. I've been having a hell of a time trying to optimize text encodings for diffusion models.
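What I had in mind is roughly a (1+lambda)-style evolution strategy over the embedding vector, with the fitness function left as a placeholder (e.g. score the decoded diffusion image with CLIP against a target). A minimal numpy sketch; the dimension and hyperparameters are just guesses:

```python
import numpy as np

def evolve_embedding(fitness, dim=768, population=16, sigma=0.05, steps=200, seed=0):
    """(1+lambda) evolution strategy over a flat embedding vector.

    `fitness` is a placeholder callable: score an embedding however you like,
    e.g. by decoding it through the diffusion model and rating the result.
    """
    rng = np.random.default_rng(seed)
    best = rng.normal(size=dim)
    best_score = fitness(best)
    for _ in range(steps):
        # Mutate the current best with Gaussian noise and keep the top scorer.
        candidates = best + sigma * rng.normal(size=(population, dim))
        scores = np.array([fitness(c) for c in candidates])
        if scores.max() > best_score:
            best_score = scores.max()
            best = candidates[scores.argmax()]
    return best, best_score
```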