Submitted by TobusFire t3_11fil25 in MachineLearning
M_Alani t1_jakapj2 wrote
Oh, this brings back a lot of memories. I remember using them in the early 2000s to optimize neural networks, back when Matlab was the only option, we couldn't afford it, and we had to build NNs from scratch... using Visual Basic 😢
Back to your question: I don't think they're dead, but their use in NNs probably is. Edit: spelling
filipposML t1_jamongz wrote
We recently published an evolutionary method to sample from the latent space of a variational autoencoder. It is still alive and well. Just a bit niche.
mmmniple t1_jan7i1a wrote
It sounds very interesting. Is it available to read? Thanks
filipposML t1_jaopq43 wrote
The latest version is here: https://2022.ecmlpkdd.org/wp-content/uploads/2022/09/sub_1229.pdf
mmmniple t1_jaopv6r wrote
Thanks
filipposML t1_jaq6whb wrote
Cheers
avialex t1_janjx6r wrote
Appears to be here: https://openreview.net/forum?id=ibNr25jJrf
edit: actually after reading it, I don't think this is the referenced publication, but it's still interesting
mmmniple t1_jao9iuk wrote
Thanks
filipposML t1_jaooo3f wrote
Hey, this is it, actually! We are optimizing a discrete variational autoencoder with no Gumbel-softmax trick.
filipposML t1_jaop5vw wrote
Of course we require no encoding model, so the notion of a latent space only holds up until closer inspection.
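For context, the Gumbel-softmax trick being avoided here is the standard way to backpropagate through samples of a discrete latent. A minimal NumPy sketch of that relaxation, for background only (this is not code from the paper):

```python
import numpy as np

def gumbel_softmax(logits, temperature=1.0, rng=None):
    """Relax a categorical sample into a soft one-hot vector so
    gradients can flow through it (Jang et al., 2017)."""
    if rng is None:
        rng = np.random.default_rng()
    # Adding Gumbel(0, 1) noise makes the argmax an exact categorical sample.
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / temperature  # low temperature -> near one-hot
    y = y - y.max()                      # for numerical stability
    return np.exp(y) / np.exp(y).sum()

# Example: a soft sample over a 5-way discrete latent code.
probs = gumbel_softmax(np.log(np.array([0.1, 0.2, 0.4, 0.2, 0.1])), temperature=0.5)
```

An evolutionary outer loop sidesteps this entirely: discrete codes are mutated and selected by fitness, so no differentiable surrogate is needed.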
avialex t1_jap04wq wrote
I was kinda excited; I had hoped to find an evolutionary algorithm for finding things in a latent space. I've been having a hell of a time trying to optimize text encodings for diffusion models.
filipposML t1_jaq6tpq wrote
You just need a notion of a fitness function and then you can apply permutations to the tokens.
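A minimal sketch of that loop, assuming a user-supplied `fitness` function that maps a token-id sequence to a score to maximize (for diffusion prompts, that might be a CLIP-style image score); all names and settings here are illustrative:

```python
import random

def evolve_tokens(fitness, vocab_size, seq_len=16, pop_size=32,
                  generations=100, mutation_rate=0.1, seed=0):
    """Toy evolutionary search over discrete token ids:
    truncation selection plus random point mutations."""
    rng = random.Random(seed)
    # Random initial population of token sequences.
    pop = [[rng.randrange(vocab_size) for _ in range(seq_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(pop, key=fitness, reverse=True)[:pop_size // 4]
        children = []
        while len(parents) + len(children) < pop_size:
            child = list(rng.choice(parents))      # copy a parent
            for i in range(seq_len):               # point mutations
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(vocab_size)
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Toy usage: a stand-in fitness that just favors high token ids.
best = evolve_tokens(fitness=sum, vocab_size=50257)
```

The expensive part in practice is the fitness call, which for diffusion models would mean running the text encoder and scoring a generated image for every candidate.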
[deleted] t1_jalq5f9 wrote
[deleted]
M_Alani t1_jam3i7i wrote
It wasn't as bad as it sounds. The fun part was that you had to understand how every little piece of the algorithm worked, and the nightmare was implementing all of it with 512 MB of RAM. We didn't have the luxury of trying different solutions.
Downtown_Finance_661 t1_janm2nt wrote
Fun story! How did you choose hyper-parameters for your models? Did you just iterate over them in for-loops?
M_Alani t1_janmj7j wrote
Mostly. Other times I would interrupt the code when it wasn't converging and start over after changing a parameter or two. I feel so spoiled with TensorFlow now!
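For anyone who never did this by hand, the for-loop approach was an exhaustive grid search, roughly like the sketch below; the parameter grid and the scoring stub are made up for illustration:

```python
from itertools import product
import random

def train_and_evaluate(lr, units, momentum):
    """Stand-in for training a small NN and returning validation
    accuracy; a real version would fit and score a model here."""
    random.seed(hash((lr, units, momentum)) % 2**32)
    return random.random()  # placeholder score

learning_rates = [0.001, 0.01, 0.1]
hidden_units = [8, 16, 32]
momenta = [0.0, 0.5, 0.9]

best_score, best_params = float("-inf"), None
# The for-loop hyper-parameter search: try every combination.
for lr, units, mom in product(learning_rates, hidden_units, momenta):
    score = train_and_evaluate(lr, units, mom)
    if score > best_score:
        best_score, best_params = score, (lr, units, mom)
print("best:", best_params, "score:", best_score)
```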
proton-man t1_janca53 wrote
It was. Dumb, too. Because of the memory and computing-power limitations of the time, you had to constantly tweak parameters to optimize learning speed, avoid overfitting, avoid local optima, etc., only to find that the best-performing model was the one generated by your 2 AM code with the fundamental flaw and the random parameters you chose while high.
M_Alani t1_janluu8 wrote
I can't disagree.