Submitted by zanzagaes2 t3_10xt36j in MachineLearning
Hi all: I have trained a CNN (EfficientNet-B3) to classify the degree of a disease in medical images. I would like to create an embedding both to visualize relationships between images (after projecting to 2D or 3D space) and to find images similar to a given one.
I have tried using the output of the last convolutional layer, both before and after pooling, for all training images (~30,000), but the result is mediocre: dissimilar images end up quite close in the embedding, and plotting it in 2D or 3D just shows a point cloud with no obvious pattern.
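For reference, this is roughly how I pull the pooled features out, a minimal sketch assuming a torchvision EfficientNet-B3 (the random batch is a stand-in for a real DataLoader over the training set):

```python
import torch
from torchvision.models import efficientnet_b3

model = efficientnet_b3(num_classes=5)  # placeholder for the trained model
model.eval()

pooled_feats = []

def hook(module, inputs, output):
    # avgpool output is (N, 1536, 1, 1) for B3; flatten to (N, 1536).
    # Hooking model.features instead would give the pre-pool maps.
    pooled_feats.append(torch.flatten(output, 1).cpu())

handle = model.avgpool.register_forward_hook(hook)

with torch.no_grad():
    images = torch.randn(4, 3, 300, 300)  # B3's native resolution is 300px
    model(images)

handle.remove()
embeddings = torch.cat(pooled_feats)  # (num_images, 1536)
```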
I have also tried using the class activation map (the output of the convolutional layer after pooling, multiplied by the classifier weights of the predicted class). This is quite a bit better, but the classes are still not separated very clearly in the scatter plot.
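In case it helps to compare notes, the weighting I describe is just a few lines; this sketch assumes `model` and the pooled `embeddings` from the snippet above, and that the classifier head is torchvision's Dropout + Linear:

```python
import torch

# CAM-style weighting: pooled features scaled elementwise by the
# classifier weights of each image's predicted class.
linear = model.classifier[1]           # torchvision B3 head: [Dropout, Linear]
with torch.no_grad():
    logits = linear(embeddings)        # (N, num_classes)
pred = logits.argmax(dim=1)            # predicted class per image
class_w = linear.weight[pred]          # (N, 1536): weight row per prediction
cam_embedding = embeddings * class_w   # elementwise, still (N, 1536)
```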
Is there any other sensible way to generate the embeddings? I have tried using the hidden representations of earlier convolutional layers, but some of them are so huge (~650,000 features per sample) that creating a reasonably sized embedding would require very aggressive PCA.
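(For scale, that reduction is at least feasible out-of-core; a sketch of what it could look like with scikit-learn's IncrementalPCA, which fits in batches so the full (30000, ~650000) matrix never has to sit in memory. The data here is a synthetic stand-in; real batches would be far wider.)

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

# Synthetic stand-in batches; in practice each would be (B, ~650000)
# activations streamed from the network.
rng = np.random.default_rng(0)
feature_batches = [rng.standard_normal((512, 4096)).astype(np.float32)
                   for _ in range(4)]

ipca = IncrementalPCA(n_components=256)  # n_components must be <= batch size
for batch in feature_batches:
    ipca.partial_fit(batch)

reduced = np.vstack([ipca.transform(batch) for batch in feature_batches])
print(reduced.shape)  # (2048, 256)
```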
Example of the scatter plot of the heatmap embedding: while it is okay-ish (classes are more or less spatially localized), it would be great to find an embedding that creates more visible clusters for each class.
Tober447 t1_j7u90qp wrote
You could try an autoencoder with CNN layers and a bottleneck of 2 or 3 neurons so that you can visualize the embeddings directly (see the sketch below). Such an autoencoder can be interpreted as a non-linear PCA.
Also, similarity in this embedding space should correlate with similarity of the real images, or of whatever your CNN extracts from them.
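A minimal sketch of such a convolutional autoencoder with a 2-neuron bottleneck; the input size and layer widths are placeholder assumptions, not tuned for your data:

```python
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    def __init__(self, bottleneck=2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1),    # 128 -> 64
            nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1),   # 64 -> 32
            nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),  # 32 -> 16
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(128 * 16 * 16, bottleneck),        # the 2-d embedding
        )
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck, 128 * 16 * 16),
            nn.Unflatten(1, (128, 16, 16)),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),  # 16 -> 32
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),   # 32 -> 64
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),    # 64 -> 128
            nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)           # (N, 2): plot this directly
        return self.decoder(z), z

model = ConvAutoencoder()
x = torch.randn(4, 3, 128, 128)       # stand-in batch of images
recon, z = model(x)
loss = nn.MSELoss()(recon, x)         # reconstruction objective for training
```

After training, the encoder output `z` can be scattered directly, with no separate projection step, which is the whole point of making the bottleneck 2 or 3 neurons wide.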