pommedeterresautee t1_j7uwa71 wrote

At the start, the weights are moved to the GPU. Then, during training, the tokenizer converts your strings to int64 tensors. These are quite light, and they are moved to the GPU as training proceeds. What you need is not the fastest CPU but one that can feed your GPU faster than the GPU consumes the data. In GPT-2's case, a CPU like the 7700 won't be an issue. Image or audio tasks (TTS, ASR) may have more demanding preprocessing during training.
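To put rough numbers on "quite light", here is a back-of-the-envelope sketch comparing one batch of token IDs with the model weights. The batch size of 8, float32 weights, and the roughly 124M-parameter count (the smallest GPT-2 checkpoint) are my assumptions for illustration, and the helper names are made up.

```python
# Rough size of one batch of token IDs versus the model weights.
# All numbers below are illustrative assumptions, not measurements.

BYTES_PER_INT64 = 8      # token IDs are stored as 64-bit integers
BYTES_PER_FLOAT32 = 4    # assumption: weights are kept in float32


def token_batch_bytes(batch_size: int, seq_len: int) -> int:
    """Bytes needed for a (batch_size, seq_len) tensor of int64 token IDs."""
    return batch_size * seq_len * BYTES_PER_INT64


def weight_bytes(num_params: int) -> int:
    """Bytes needed for num_params float32 weights."""
    return num_params * BYTES_PER_FLOAT32


# Assumed setup: 8 sequences at GPT-2's 1024-token context length,
# and roughly 124 million parameters (smallest GPT-2 checkpoint).
batch = token_batch_bytes(batch_size=8, seq_len=1024)
weights = weight_bytes(124_000_000)

print(f"token batch: {batch / 1024:.0f} KiB per step")
print(f"weights:     {weights / 1e6:.0f} MB, moved once at start")
```

Under these assumptions the per-step transfer is several thousand times smaller than the one-time weight copy, which is why text tokenization rarely makes the CPU the bottleneck.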
