GasZealousideal8691 OP t1_j4gpu8j wrote
Reply to comment by WigglyHypersurface in [D] Is there any reason hugging face GPT2 would behave (fundamentally) differently from GPT-Neo? by GasZealousideal8691
For GPT-Neo I'm using GPTNeoForCausalLM, and for GPT2 I'm using GPT2LMHeadModel. Like I said, I'm not 100% familiar with these, but the Hugging Face docs listed both as “GPT-neo/GPT2 with an LM head”, so I figured they were analogous.
WigglyHypersurface t1_j4grftr wrote
I think those are the same, but load both as the causal LM version and see.
GasZealousideal8691 OP t1_j4gst0f wrote
I don't think there is a causal version for GPT2.
WigglyHypersurface t1_j4gzweu wrote
The GPT2 LM is causal. If you do AutoModelForCausalLM with gpt2 it works fine.
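A quick way to check which class AutoModelForCausalLM resolves to, as a rough sketch rather than anything definitive: the snippet below builds tiny, randomly initialised models from configs so nothing gets downloaded. The config sizes are made-up toy values (an assumption for illustration only), and it assumes `transformers` and `torch` are installed. With real weights you would call `AutoModelForCausalLM.from_pretrained("gpt2")` instead.

```python
from transformers import AutoModelForCausalLM, GPT2Config, GPTNeoConfig

# Toy sizes chosen only to keep the models tiny; these are NOT the real
# GPT2 / GPT-Neo hyperparameters.
gpt2_cfg = GPT2Config(
    vocab_size=100, n_positions=32, n_embd=16, n_layer=1, n_head=2
)
neo_cfg = GPTNeoConfig(
    vocab_size=100,
    max_position_embeddings=32,
    hidden_size=16,
    num_layers=2,
    # One (global, local) pair expands to 2 layers, matching num_layers.
    attention_types=[[["global", "local"], 1]],
    num_heads=2,
    window_size=8,
)

# from_config picks the causal-LM class registered for each config type
# and initialises it with random weights (no download).
gpt2_model = AutoModelForCausalLM.from_config(gpt2_cfg)
neo_model = AutoModelForCausalLM.from_config(neo_cfg)

gpt2_class = type(gpt2_model).__name__
neo_class = type(neo_model).__name__
print(gpt2_class, neo_class)
```

If the two printed class names are GPT2LMHeadModel and GPTNeoForCausalLM, then both are the causal LM variants and the naming difference is cosmetic.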