
chimp73 t1_j2vw251 wrote

I made a summary of the related work section with some help from ChatGPT:

> Pruning has been applied to smaller models, but had not been studied at the scale of GPT-family models with over 10 billion parameters. Previous pruning methods require retraining the model after pruning, which is time-consuming and resource-intensive at that scale. SparseGPT prunes large GPT models without retraining. There has also been significant work on post-training quantization of GPT-scale models, i.e. reducing the precision of the weights and activations to cut memory and compute requirements. SparseGPT can be combined with these quantization methods to compress the model further.
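
For intuition, here's a minimal sketch of what "pruning plus quantization" means for a single weight matrix. It uses plain magnitude pruning and uniform 4-bit quantization as stand-ins; SparseGPT's actual algorithm is a layer-wise reconstruction method, not magnitude pruning, so treat this only as a toy illustration of the compression idea:

```python
import numpy as np

# Toy example (NOT SparseGPT's algorithm): one-shot magnitude pruning of a
# weight matrix to 50% sparsity, then naive 4-bit uniform quantization of
# the surviving weights.

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8)).astype(np.float32)  # pretend layer weights

# --- pruning: zero out the 50% of weights with the smallest magnitude ---
k = W.size // 2
threshold = np.partition(np.abs(W).ravel(), k)[k]
mask = np.abs(W) >= threshold
W_pruned = W * mask

# --- quantization: map surviving weights to 16 uniform levels (4 bits) ---
scale = np.abs(W_pruned).max() / 7          # symmetric signed 4-bit range
W_q = np.clip(np.round(W_pruned / scale), -8, 7)
W_deq = W_q * scale                          # dequantized weights used at inference

print("sparsity:", 1 - mask.mean())
print("max quantization error:", np.abs(W_pruned - W_deq).max())
```

The point of the paper is that doing this naively at 100B+ parameter scale destroys accuracy unless you retrain, whereas SparseGPT's reconstruction step recovers most of it without any retraining.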
