Submitted by Available_Lion_652 t3_10xu09v in MachineLearning
IntelArtiGen t1_j7uce7z wrote
The CPU bottleneck depends on the model and the training process. If you remove all /most of the preprocessing done on CPU it could be fine. I think transformers don't usually bottleneck on CPU but i7 7700k is quite old.
Available_Lion_652 OP t1_j7ue7pj wrote
My motherboard is quite old and the best CPU that I can attach yo it is a i7 7700k. From what I have read, if I will process the dataset before training, than it should not bottleneck. But what I was think was that the preprocessed dataset is held in 32 GB of RAM. The CPU transfers data from RAM to GPU memory. It has only 8 threads. Let s say I want to train from scratch a GPT2. I do not know exactly how much the CPU/RAM frequency will bottleneck the training process. I fon t want to change my whole hardware. If 3090 RTX is to performant and the bottleneck is to high, I was wondering if I can buy a 3060/3080
pommedeterresautee t1_j7uwa71 wrote
At start the weights will be moved on the GPU. Then during training, the tokenizer will convert your strings to a int64 tensors. They are quite light, and those are moved to GPU during training. What you need is not the fastest CPU but one which can feed your GPU faster that the data it will consume. In GPT2 case, CPU like 7700 won't be an issue. Image or sounds (TTS, ASR) may have more demanding preprocessing during training.
Viewing a single comment thread. View all comments