GitGudOrGetGot t1_ix3s761 wrote
Reply to comment by skelly0311 in [D] BERT related questions by Devinco001
>First the BERT model generates word embeddings by tokenizing strings into a pre-trained word vector, then you run those embeddings through a transformer for some type of inference
Could you describe this a bit further in terms of inputs and outputs?
I think I get that you go from a string to a list of individual tokens, but when you say you then feed that into a pre-trained word vector, does that mean the output is a list of floating-point values representing the document as a single point in high-dimensional space?
I thought that was specifically what the transformer does, so I'm not sure what separate role the embedding step plays here...
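To make the question concrete, here's a minimal sketch of the pipeline as I currently understand it (using the Hugging Face `transformers` library with `bert-base-uncased` as an assumed example; the shapes in the comments are what I'd expect, not something I've verified):

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

text = "BERT turns strings into vectors."

# Step 1: string -> token IDs (integers indexing the WordPiece vocabulary)
inputs = tokenizer(text, return_tensors="pt")
print(inputs["input_ids"].shape)        # (1, seq_len), includes [CLS] and [SEP]

# Step 2: token IDs -> static embeddings via a lookup table
# (one vector per token, no context from neighbouring tokens yet)
static = model.embeddings.word_embeddings(inputs["input_ids"])
print(static.shape)                     # (1, seq_len, 768)

# Step 3: full forward pass -> contextual embeddings from the transformer layers
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, seq_len, 768)
```

If that's right, step 2 already gives one static vector per token before the transformer runs, and step 3 gives one *contextual* vector per token, so neither step on its own collapses the document into a single point. Is that where I'm going wrong?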