teenaxta

teenaxta t1_j8qvnx0 wrote

I think this has more to do with probability: the normalized sum of many independent random variables approaches a Gaussian distribution, which is what the central limit theorem tells us. What that really means is that the noise can map all sorts of information. Also, when you keep adding noise, at some point you reach the normal distribution; however, the specific noise pattern you end up with is unique. Think of it this way: 0, 0 has a mean of 0, and -1, 1 also has a mean of 0. The unique noise pattern actually contains useful information, whereas if you started from a blank canvas, your generator would have no idea what to generate from it, because that is a many-to-one mapping. The additive noise process is a unique mapping.
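To make the "same mean, different pattern" point concrete, here is a rough sketch using only the Python standard library. The `noise_pattern` helper, the seeds, and the sample size are made up for illustration and are not taken from any diffusion codebase.

```python
import random
import statistics

# The toy case from the comment: both pairs average to 0,
# yet they are clearly different patterns.
pair_a = [0, 0]
pair_b = [-1, 1]
same_mean = statistics.fmean(pair_a) == statistics.fmean(pair_b)
same_pattern = pair_a == pair_b

def noise_pattern(seed, n=1000):
    """Hypothetical helper: draw n samples of standard Gaussian noise."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

# Two noise draws share (roughly) the same summary statistics,
# but the individual values differ, so each draw is its own pattern.
noise_a = noise_pattern(seed=0)
noise_b = noise_pattern(seed=1)

mean_a, mean_b = statistics.fmean(noise_a), statistics.fmean(noise_b)
std_a, std_b = statistics.pstdev(noise_a), statistics.pstdev(noise_b)
identical_draws = noise_a == noise_b

print(f"pairs: same mean={same_mean}, same pattern={same_pattern}")
print(f"noise A: mean={mean_a:.3f}, std={std_a:.3f}")
print(f"noise B: mean={mean_b:.3f}, std={std_b:.3f}")
print(f"draws identical: {identical_draws}")
```

Both draws look like "the same" normal distribution if you only check mean and standard deviation, but the values themselves are different, which is the information a blank canvas would not carry.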

1

teenaxta t1_j62mz4o wrote

Customer ID is useless, so obviously it should be dropped. The actions the customer performed are a bit trickier.

If the actions are discrete classes, then I think you should break the column up into subclasses and then one-hot encode the actions.
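Something like the following is what I have in mind. It is only a sketch: the column name `actions`, the comma-separated format, and the action labels are all assumptions, since I don't know what the real data looks like.

```python
# Hypothetical rows; the real column layout and labels are unknown.
rows = [
    {"customer_id": 101, "actions": "view,add_to_cart"},
    {"customer_id": 102, "actions": "view"},
    {"customer_id": 103, "actions": "add_to_cart,purchase"},
]

def encode_actions(rows):
    """Split the action column and build one 0/1 indicator per action.

    The customer ID is ignored on purpose, since it carries no signal.
    """
    split = [row["actions"].split(",") for row in rows]
    # Sorted vocabulary so the column order is deterministic.
    vocab = sorted({action for actions in split for action in actions})
    encoded = [[1 if action in actions else 0 for action in vocab]
               for actions in split]
    return vocab, encoded

vocab, encoded = encode_actions(rows)
print(vocab)
for vector in encoded:
    print(vector)
```

If each row only ever holds a single action, this reduces to a plain one-hot vector; with several actions per row you get a multi-hot vector instead.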

I can't really understand why you need an LSTM here. Do you have sequence data or any sort of temporal component? If you have to use an LSTM, you can just set your sequence length to 1 and essentially use it as a plain NN. But that makes no sense, honestly. It would be much better to use something like XGBoost.
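To show why a sequence length of 1 turns the LSTM into an ordinary feedforward function, here is a toy scalar LSTM cell in plain Python. The weight names and values are invented for the example. With zero initial hidden and cell state, the forget gate and all recurrent weights drop out of the result.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, w):
    """One step of a standard scalar LSTM cell (toy weights in dict w)."""
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c

def feedforward_equivalent(x, w):
    """Same output when h_prev = c_prev = 0: no recurrence, no forget gate."""
    i = sigmoid(w["wi"] * x + w["bi"])
    g = math.tanh(w["wg"] * x + w["bg"])
    o = sigmoid(w["wo"] * x + w["bo"])
    return o * math.tanh(i * g)

# Arbitrary illustrative weights, not trained values.
weights = {
    "wi": 0.5, "ui": -0.3, "bi": 0.1,
    "wf": 0.8, "uf": 0.4, "bf": -0.2,
    "wg": -0.6, "ug": 0.7, "bg": 0.05,
    "wo": 0.9, "uo": -0.1, "bo": 0.3,
}

inputs = [-2.0, -0.5, 0.0, 1.0, 3.0]
lstm_outputs = [lstm_step(x, 0.0, 0.0, weights)[0] for x in inputs]
ff_outputs = [feedforward_equivalent(x, weights) for x in inputs]
max_gap = max(abs(a - b) for a, b in zip(lstm_outputs, ff_outputs))
print(f"largest difference: {max_gap:.2e}")
```

So with one time step you pay for the recurrent and forget-gate parameters without ever using them, which is why a tree model or a plain network is the more sensible choice for non-sequential tabular data.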

3