juliensalinas OP wrote, replying to Necessary_Ad_9800 in [D] An Instruct Version Of GPT-J Using Stanford Alpaca's Dataset:
You're welcome.
A token is a unit of text drawn from the model's vocabulary: it can be a short word, part of a longer word, or a punctuation mark.
On average, 1 token is made up of 4 characters, and 100 tokens are roughly equivalent to 75 words.
Natural Language Processing models have to convert your text into tokens before they can process it.
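For instance, here's a minimal sketch of how you could inspect the tokens yourself in Python. It assumes the Hugging Face transformers library and GPT-J's tokenizer (the model discussed in this thread); neither the library nor the exact calls come from the comment above, they're just one common way to do it:

```python
from transformers import AutoTokenizer

# Load GPT-J's tokenizer from the Hugging Face hub
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

text = "Tokenization splits text into sub-word units."
token_ids = tokenizer.encode(text)                    # the integer IDs the model sees
tokens = tokenizer.convert_ids_to_tokens(token_ids)   # their string representations

print(tokens)                       # e.g. ['Token', 'ization', 'Ġsplits', ...]
                                    # ('Ġ' marks a leading space in GPT-style BPE)
print(len(token_ids))               # how many tokens the text costs
print(len(text) / len(token_ids))   # rough characters-per-token ratio (~4 on average)
```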