Question: Consider a stock that was worth 10.00 USD in 2010, 75.00 USD in 2015, and 150.00 USD in 2020, and that continues to grow to this day.
Given that decision-tree-based algorithms like xgboost build the tree by splitting feature values into ranges, I don’t understand how a tree built on past data (e.g. years 2000 - 2015) could be applicable in any form to future price predictions (e.g. years 2015 - 2080).
Could somebody confirm that feature normalization is truly not required for data that grows beyond the original (fit/train) range over time?
Do I need to run the raw stock price through a log or sigmoid function before training, or is xgboost actually smart enough to deal with this kind of data automatically?
edit: to clarify, I have read everywhere, including the official forums, that feature normalization is not required when training a decision tree model. In my case I am using the xgboost library, which trains the model with the gradient boosting decision tree algorithm, but I think this question applies to any other tool that uses a DT-based algorithm.
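To make the concern concrete, here is a minimal sketch using a hand-rolled one-feature regression tree. It is not xgboost and does not use its API. The helper names (`fit_tree`, `predict`) and the yearly prices, linearly interpolated between the 2010 and 2015 values above, are assumptions for illustration only. It checks two things: whether predictions for years past the training range stay flat, and whether log-transforming the feature changes any prediction.

```python
import math

def fit_tree(xs, ys, depth=3):
    """Toy regression tree on one feature: split where squared error is lowest."""
    mean = sum(ys) / len(ys)
    if depth == 0 or len(set(xs)) < 2:
        return mean  # leaf: a constant prediction
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lmean) ** 2 for y in left) + sum((y - rmean) ** 2 for y in right)
        if best is None or sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, threshold, i)
    _, threshold, i = best
    lx, ly = [x for x, _ in pairs[:i]], [y for _, y in pairs[:i]]
    rx, ry = [x for x, _ in pairs[i:]], [y for _, y in pairs[i:]]
    return (threshold, fit_tree(lx, ly, depth - 1), fit_tree(rx, ry, depth - 1))

def predict(tree, x):
    """Walk down the tree until a leaf (a plain number) is reached."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x < threshold else right
    return tree

# Assumed training data: one price per year, 2010..2015, rising from 10 to 75 USD.
years = [2010, 2011, 2012, 2013, 2014, 2015]
prices = [10.0, 23.0, 36.0, 49.0, 62.0, 75.0]

tree = fit_tree(years, prices)
pred_2020 = predict(tree, 2020)
pred_2080 = predict(tree, 2080)
print(pred_2020, pred_2080)  # both fall into the same right-most leaf

# Same tree fitted on log(year): a monotonic transform keeps the ordering,
# so the splits partition the data identically.
log_tree = fit_tree([math.log(x) for x in years], prices)
same_predictions = all(
    predict(tree, x) == predict(log_tree, math.log(x)) for x in years + [2020, 2080]
)
print(same_predictions)
```

In this toy setup every year after 2015 lands in the same leaf, so the prediction can never exceed the largest training price, and the log transform of the feature changes nothing. That suggests the real issue is extrapolation rather than feature scaling. Whether xgboost behaves the same way on real data is worth checking directly.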
dearnot t1_it6iueu wrote
Reply to [D] Simple Questions Thread by AutoModerator