Submitted by Sadness24_7 t3_y72mzl in deeplearning

Hello there,

I'm working on a neural network and have a lot of input variables, which I would like to reduce. Based on a couple of experiments, I found that most of the data is correlated. Training a model for every combination of inputs at every input size makes no sense, since that would take ages. What techniques are there for selecting the variables that have the most impact on neural network performance?

3

Comments


Prestigious_Boat_386 t1_issl12g wrote

PCA for moderate dimension reduction. For a very high number of dimensions, straight up throw away half of the highly correlated dimensions.

You'd reject the worst dimensions until the size is low enough to use PCA, then use PCA to reduce to a size your network can handle.
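
A minimal sketch of the "throw away highly correlated dimensions" step, assuming the data fits in a NumPy array with one column per feature. The function name `drop_correlated`, the 0.9 threshold, and the synthetic data are illustrative assumptions, not something from this thread; the greedy keep-first rule is just one simple way to do it.

```python
import numpy as np

def drop_correlated(X, threshold=0.9):
    """Return indices of columns kept after greedy correlation pruning.

    Hypothetical helper: a column is kept only if its absolute Pearson
    correlation with every previously kept column is below `threshold`.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(corr.shape[0]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return keep

# Synthetic data, purely illustrative: column 3 is a near-copy of column 0.
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 3))
X = np.column_stack([base, base[:, 0] + 0.01 * rng.normal(size=500)])

kept = drop_correlated(X)
X_reduced = X[:, kept]  # this is what you would then feed into PCA
```

Which column of a correlated pair survives depends on column order here, so if some features are cheaper to acquire than others, it may be worth ordering the columns by preference first.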

2

Sadness24_7 OP t1_issrioz wrote

I don't think PCA will help me. I need to reduce the number of features in order to simplify the system I'm working with. The removed features will no longer be acquired, so I can't retrain the model on them in the future. I need to somehow pick 2-10 features out of 38, for which I can fine-tune the model and deploy it. Only those selected features will be logged in the future.

2

Sadness24_7 OP t1_isszoev wrote

But what am I looking for, though? I've been looking at the loadings matrix for a couple of minutes but can't really figure out the connections. Let's say I want to select 7 features out of 38, so I perform PCA with 7 components and look at the loadings matrix (the correlations between the 38 features and the 7 principal components). Do I just take the component that correlates best with the input features and pick the 7 features with the highest correlation to that component?

1

thePedrix t1_ist0fv6 wrote

I can’t be sure that it would work, but I would try this:

- PCA for N components.

- Plot the first 2 or 3 principal components (depending on the cumulative explained variance; if 2 are enough, a 2D plot).

- Plot the magnitudes of the variables and see which are the most impactful. Pick the X features you want.

- Train the network with those X features.
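
The steps above could be sketched roughly like this, using plain NumPy and skipping the plots. The scoring rule (squared loadings weighted by explained variance ratio), the function name `rank_features_by_loading`, and the synthetic data are my own assumptions for illustration, not an established recipe from this thread.

```python
import numpy as np

def rank_features_by_loading(X, n_components=2):
    """Rank original features by their weight in the leading principal components.

    Hypothetical helper: each feature's score is the sum of its squared
    loadings on the first `n_components`, weighted by how much variance
    each of those components explains.
    """
    # Standardize so every feature has zero mean and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal axes (loadings) of the standardized data.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    score = (Vt[:n_components] ** 2 * explained[:n_components, None]).sum(axis=0)
    order = np.argsort(score)[::-1]  # best feature first
    return order, score, explained

# Synthetic data, purely illustrative: features 0-2 share one latent factor,
# features 3-4 are independent noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))
X = np.hstack([latent + 0.1 * rng.normal(size=(500, 3)),
               rng.normal(size=(500, 2))])

order, score, explained = rank_features_by_loading(X, n_components=1)
top_features = order[:3]  # the "X features" you would train the network on
```

One caveat: features that load heavily on the same component tend to be correlated with each other, so the top of this ranking can be redundant. It only tells you which variables drive the variance in the inputs, not which ones matter for the network's target.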

1

Stor_bjorn t1_isum684 wrote

Maybe you could try some feature selection method, for example tree-based feature selection?
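
A minimal sketch of what tree-based selection could look like, assuming scikit-learn is available and the task is regression. The random forest, its settings, and the synthetic data are illustrative choices on my part; with real data you would pass in the 38 logged features and the actual target.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data, purely illustrative: only features 0 and 3 drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=400)

forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X, y)

# Impurity-based importances: higher means the feature was more useful for splits.
ranking = np.argsort(forest.feature_importances_)[::-1]
top_k = ranking[:2]  # keep as many as you can afford to log, e.g. 2-10
```

Unlike PCA, this ranks the original features against the target directly, so the selected columns can be logged as-is. Impurity-based importances can split credit between strongly correlated features, so it is worth checking the result by retraining the network on the chosen subset.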

1