SlayahhEUW t1_iya6l1f wrote
Reply to comment by literum in [r] The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable - LessWrong by visarga
Roughly, it's the base mammalian feature extractors. You can also find them by running principal component analysis on the data: after training, the first layer of a CNN tends to end up with a representation very similar to the PCA of the data (similar, not literally identical).
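For anyone who wants to try the PCA side of that comparison, here is a rough sketch, not a definitive recipe. It uses only NumPy and takes the PCA of small image patches via an SVD of the centered patch matrix. The helper names `extract_patches` and `patch_pca` are made up for illustration, and the "images" are just spatially correlated noise standing in for a real dataset, so swap in real images before reading anything into the resulting filters.

```python
import numpy as np


def extract_patches(images, patch_size, n_patches, rng):
    """Sample random square patches and flatten each one into a row vector."""
    n_images, height, width = images.shape
    idx = rng.integers(0, n_images, size=n_patches)
    rows = rng.integers(0, height - patch_size + 1, size=n_patches)
    cols = rng.integers(0, width - patch_size + 1, size=n_patches)
    return np.stack([
        images[i, r:r + patch_size, c:c + patch_size].ravel()
        for i, r, c in zip(idx, rows, cols)
    ])


def patch_pca(patches, n_components):
    """Return the leading principal directions and their variances."""
    centered = patches - patches.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained_variance = s ** 2 / (len(patches) - 1)
    return vt[:n_components], explained_variance[:n_components]


rng = np.random.default_rng(0)

# Assumption: stand-in data. Cumulative sums of white noise give images
# with spatial correlation; replace with real grayscale images.
images = rng.standard_normal((20, 32, 32)).cumsum(axis=1).cumsum(axis=2)

patches = extract_patches(images, patch_size=8, n_patches=2000, rng=rng)
components, variances = patch_pca(patches, n_components=16)

# Each principal direction reshaped to 8x8 can be viewed like a conv filter.
filters = components.reshape(-1, 8, 8)
```

Plotting `filters` as small images next to the first-layer kernels of a trained CNN is the usual way to eyeball how close the two sets are.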