Submitted by Devinco001 t3_yzh6v1 in MachineLearning
LetterRip t1_ix0zyfv wrote
What length of texts? Sentence? Paragraph? Page? Multiple pages? Books?
A sentence might average 10 tokens, a page 750 tokens, a book 225,000 tokens. So for 2.5 million texts, that's anywhere from 25 million to 562.5 billion tokens.
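For anyone who wants to check the range, here is a rough sketch of the arithmetic. It assumes a corpus of 2.5 million texts (the count mentioned in this thread) and uses the ballpark per-text averages above. The names `NUM_TEXTS`, `TOKENS_PER_TEXT`, and `total_tokens` are made up for illustration, and real counts will depend on the tokenizer.

```python
# Assumption: 2.5 million texts, the corpus size mentioned in this thread.
NUM_TEXTS = 2_500_000

# Ballpark average tokens per text unit (rough estimates, not measured values).
TOKENS_PER_TEXT = {"sentence": 10, "page": 750, "book": 225_000}


def total_tokens(num_texts: int, tokens_per_text: int) -> int:
    """Estimate corpus size in tokens as text count times average length."""
    return num_texts * tokens_per_text


totals = {unit: total_tokens(NUM_TEXTS, n) for unit, n in TOKENS_PER_TEXT.items()}

for unit, total in totals.items():
    print(f"{unit}: {total:,} tokens")
```

The sentence case gives the low end of the range (25 million) and the book case gives the high end (562.5 billion).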
Devinco001 OP t1_ix2ewbe wrote
Yes, they are short, conversational sentences with business intent. Average length is around 10 tokens, with approximately 2.5 million sentences in total.