Submitted by Blutorangensaft t3_10ltyki in MachineLearning

I would like to measure the smoothness of an NLP-autoencoder's latent space. The idea is to sample two Gaussian vectors v1 and v2 in the latent space of the AE, and generate N-1 points between them like so:

vi = v1 + (v2 - v1) / (N * i)

My idea is to then decode these vectors and measure the BLEU score between d(vi) and d(vi+1) for all N-2 comparisons.

Is this idea reasonable, do you have a better one? Is there a technique from AEs with images that can be useful here?

1

Comments

You must log in or register to comment.

jackilion t1_j5zk1sb wrote

I'm not working on NLP but I have seen your idea in papers on diffusion models. You are basically linearly interpolating your latent space. There are other interpolation techniques you could try, but your idea will definitely give you some insight into your latent space.

Another possibiltiy would be some kind of grid search through the latent space, tho depending on your dimensions it could be too hard.

Lastly, you could visualize the latent space by projecting it into 2 or 3 dimensions via t-SNE or something similar.

3

Blutorangensaft OP t1_j5zu9ti wrote

Thank you for your answer. If a paper on diffusion models pops into your mind that uses this method, feel free to post it.

How would you derive a quantitative evaluation from t-SNE? I thought it's mostly used for visualisation. I'm looking to compute some kind of score from the interpolation.

1

crt09 t1_j6317t4 wrote

Just speaking from gut here but you could go the other way around and get sentences with varying BLEU differences, encode them all and see how distance their latent representations are, this way you wouldnt have to worry about the effect of the validity of the generated sentences which might be a problem with the other way around (I think)

1

Blutorangensaft OP t1_j632b2s wrote

Using slightly different sentences to be decoded to the same sentence exists as an idea in the form of denoising autoencoders, yes. I plan to use this down the road, but for now I am interested in thinking about measuring performance.

1

crt09 t1_j633u7c wrote

I think there's miscommunication, it sounds like you think I'm proposing a training method but I'm suggesting how to measure smoothness.

If you have the BLEU distances between input sentences and the distances between their latents, you can see measure how the distances change between the two which I *think* would indicate smoothness. Or you could do some other measurements on the latents to see how smoothly(?) they are distributed? tbh I'm not entirely sure what you mean by smooth, sorry.

If you're looking to measure performance wouldn't that loss for the training method you be mentioned be useful?

Or are you looking for measuring performance on decoding side?

1

Blutorangensaft OP t1_j6344jf wrote

Ahh, I get you now, my apologies. I'm more interested in the performance on the decoding side indeed, because I want to later generate sentences in that latent space with another neural net and have them decoded to normal tokens.

1

Blutorangensaft OP t1_j6356ho wrote

Compare different autoencoders in their ability to create valid language in a continuous space. Later, I want to generate sentences in its latent space by using another neural network, and have them decoded to real sentences by the autoencoder. I want the space to be smooth because the second neural net will naturally be using gradient descent, which involves infinitesimal changes. I believe this network will perform better if the changes that happen actually represent meaningful distances between real sentences.

1

Blutorangensaft OP t1_j654qyd wrote

Thank you for the reference, it looks very promising. I've heard of ways to smooth the latent space through Lipschitz regularisation, but then got disappointed again when I read "ah well it's just layer normalisation". So many things in ML come in a different appearance and actually mean the same thing once you implement them.

1