SlayahhEUW t1_iya6l1f wrote on November 29, 2022 at 10:02 PM

Reply to comment by literum in [r] The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable - LessWrong by visarga

Roughly, it's the base mammalian feature extractors. These can also be found by performing principal component analysis on the data: after training, the first layer of a CNN tends to learn filters that closely resemble the principal components of the data.
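To make the PCA side of that comparison concrete, here is a rough sketch of computing PCA "filters" from image patches with NumPy. The helper names (`extract_patches`, `pca_filters`), the 8x8 patch size, and the synthetic smooth image are all assumptions for illustration, not anything from the linked post; real natural images would be needed to see the edge-like components people usually compare against first-layer CNN filters.

```python
import numpy as np

def extract_patches(image, size, n, rng):
    """Sample n random size x size patches and flatten each to a row vector."""
    h, w = image.shape
    ys = rng.integers(0, h - size + 1, n)
    xs = rng.integers(0, w - size + 1, n)
    return np.stack([image[y:y + size, x:x + size].ravel() for y, x in zip(ys, xs)])

def pca_filters(patches, k):
    """Return the top-k principal directions and their variances via SVD."""
    centered = patches - patches.mean(axis=0)
    # Rows of vt are orthonormal principal directions, sorted by singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance = s ** 2 / (len(patches) - 1)
    return vt[:k], variance[:k]

rng = np.random.default_rng(0)
# Stand-in for a natural image (assumption): a spatially correlated random field.
image = np.cumsum(np.cumsum(rng.standard_normal((128, 128)), axis=0), axis=1)

patches = extract_patches(image, size=8, n=2000, rng=rng)
filters, variance = pca_filters(patches, k=16)

# Each principal direction reshapes to an 8x8 "filter", the same shape as a
# first-layer conv kernel on single-channel input.
filter_bank = filters.reshape(16, 8, 8)
```

Plotting `filter_bank` next to the first-layer kernels of a trained CNN is the usual way to eyeball how similar the two are; this sketch only produces the PCA half.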