GasZealousideal8691 OP t1_j4g8djf wrote
Reply to comment by WigglyHypersurface in [D] Is there any reason hugging face GPT2 would behave (fundamentally) differently from GPT-Neo? by GasZealousideal8691
No, both use the GPT2 tokenizer. GPT-Neo uses GPT2Tokenizer.from_pretrained('EleutherAI/gpt-neo-1.3B'), and GPT2 uses GPT2Tokenizer.from_pretrained('gpt2-xl').
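A minimal sketch of that check, assuming the standard transformers API and the checkpoint names above; since both load the same GPT-2 byte-level BPE vocabulary, the token IDs should come out identical:

```python
from transformers import GPT2Tokenizer

# Both checkpoints ship GPT-2 style vocab/merges files, so the same
# tokenizer class loads cleanly for either one.
neo_tok = GPT2Tokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")
gpt2_tok = GPT2Tokenizer.from_pretrained("gpt2-xl")

text = "Is there any reason these would behave differently?"
print(neo_tok(text)["input_ids"])
print(gpt2_tok(text)["input_ids"])  # expected: same IDs as the line above
```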
WigglyHypersurface t1_j4gpm5i wrote
What kind of head is on the models for the task?
GasZealousideal8691 OP t1_j4gpu8j wrote
GPT Neo is GPTNeoForCausalLM, and GPT2 is GPT2LMHeadModel. Like I said, I am not 100% familiar with these, but the huggingface docs listed both as “GPT-neo/GPT2 with an LM head”, so I figured they were analogous.
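For reference, a minimal sketch of the two loads being compared (assuming the same checkpoints as above); both classes are the base transformer with a causal language-modeling head on top, so they should be analogous:

```python
from transformers import GPTNeoForCausalLM, GPT2LMHeadModel

# "GPT-Neo/GPT2 with an LM head" per the HuggingFace docs: both add a
# tied output projection over the vocabulary for next-token prediction.
neo_model = GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")
gpt2_model = GPT2LMHeadModel.from_pretrained("gpt2-xl")
```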
WigglyHypersurface t1_j4grftr wrote
I think those are the same, but try making both the causal version and see.
GasZealousideal8691 OP t1_j4gst0f wrote
I don't think there is a causal version for GPT2.
WigglyHypersurface t1_j4gzweu wrote
The GPT2 LM is causal. If you do AutoModelForCausalLM with gpt2 it works fine.
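A minimal sketch of what that looks like (assuming the stock transformers AutoModel machinery): AutoModelForCausalLM resolves the gpt2 checkpoint to GPT2LMHeadModel and the Neo checkpoint to GPTNeoForCausalLM, so both can be loaded through the same causal-LM interface.

```python
from transformers import AutoModelForCausalLM

# AutoModelForCausalLM picks the right LM-head class from each config.
gpt2_model = AutoModelForCausalLM.from_pretrained("gpt2")
neo_model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")

print(type(gpt2_model).__name__)  # GPT2LMHeadModel
print(type(neo_model).__name__)   # GPTNeoForCausalLM
```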