XGBoost Question around One-Hot Encoding & Get_Dummies in Python
I am working on building a model for NHL (hockey) games and have a spreadsheet with a ton of advanced stats from teams, dates they played and so on.
All of the data in this spreadsheet is stored as floats. I am trying to add a few columns of categorical data, as I feel it could help the model.
The categorical columns indicate whether the home team or the away team is playing on back-to-back days.
I am trying to determine whether one-hot encoding is the best approach here, or whether I'm misunderstanding how it works as a whole.
jfacowns t1_j550f70 wrote
Reply to [D] Simple Questions Thread by AutoModerator
Here is some code
Does this make sense? Am I on the right track here?
If I run NHLData.head() I can see the one-hot encoded columns, but when I check NHLData.dtypes I see this:
Should these not be objects?
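
For illustration, here is a minimal sketch of the situation described above. It is not the original code: the toy data and the column names (`home_goals_per_game`, `home_b2b`, `away_b2b`) are hypothetical stand-ins for the real spreadsheet. It shows that `pd.get_dummies` replaces each categorical column with numeric indicator columns, so their dtype should not be `object`. By default the indicators are `uint8` or `bool` depending on the pandas version; passing `dtype=float` is one way to keep them consistent with the other float columns.

```python
import pandas as pd

# Toy stand-in for the real spreadsheet; all column names are hypothetical.
df = pd.DataFrame({
    "home_goals_per_game": [3.1, 2.7, 3.4],
    "home_b2b": ["yes", "no", "no"],   # home team on a back-to-back?
    "away_b2b": ["no", "no", "yes"],   # away team on a back-to-back?
})

# One-hot encode only the categorical columns.
# dtype=float makes the indicator columns float64 instead of the default.
encoded = pd.get_dummies(df, columns=["home_b2b", "away_b2b"], dtype=float)

# The original object columns are gone; only numeric columns remain.
print(encoded.dtypes)
```

Since a back-to-back flag only has two values, a single 0/1 column per team would carry the same information as the pair of one-hot columns.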