Submitted by hopedallas t3_zmaobm in MachineLearning
Far-Butterscotch-436 t1_j0a8ny2 wrote
Reply to comment by biophysninja in [D] Dealing with extremely imbalanced dataset by hopedallas
Regarding 2: there are only 500 features, so dimension reduction isn't needed.
1 and 3 are last resorts.
shaner92 t1_j0amnbc wrote
- Has anyone ever seen SMOTE give good results on real-world data?
- It depends on what the 500 features are. You could very well benefit from dimension reduction, or at least from pruning some features, if they are not all equally useful. That is a separate topic, though.
- It's a lot of work to create fake data when he already has that amount of real data.
Playing with the loss functions/metrics is probably the best way to go, as you (u/Far-Butterscotch-436) pointed out.
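For anyone wondering what "playing with the loss functions/metrics" could look like in practice, here is a minimal sketch in plain Python. It is not anything the posters above wrote: the function names `weighted_log_loss` and `average_precision`, the toy labels and scores, and the choice of negatives/positives as the positive-class weight are all assumptions made for illustration.

```python
import math


def weighted_log_loss(y_true, y_prob, pos_weight=1.0, eps=1e-12):
    """Mean binary cross-entropy with positive-class terms scaled by pos_weight.

    Hypothetical helper for illustration; pos_weight > 1 makes a missed
    positive cost more than a false alarm on a negative.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        if y == 1:
            total += -pos_weight * math.log(p)
        else:
            total += -math.log(1.0 - p)
    return total / len(y_true)


def average_precision(y_true, y_score):
    """Average precision: mean of precision@k taken at each positive's rank.

    Unlike accuracy, this ignores the large pool of easy true negatives.
    """
    order = sorted(range(len(y_true)), key=lambda i: y_score[i], reverse=True)
    n_pos = sum(y_true)
    hits = 0
    ap = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            ap += hits / rank
    return ap / n_pos


# Toy data (made up): 8 negatives, 2 positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_prob = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4, 0.6, 0.7, 0.35]

# One common heuristic (an assumption here): weight = negatives / positives.
pos_weight = (len(y_true) - sum(y_true)) / sum(y_true)

print("unweighted loss:", weighted_log_loss(y_true, y_prob))
print("weighted loss:  ", weighted_log_loss(y_true, y_prob, pos_weight))
print("average precision:", average_precision(y_true, y_prob))
```

The idea is that the weighted loss pushes training to care about the rare class without generating synthetic rows, and average precision gives a score that isn't inflated by the majority class. Most gradient-boosting and deep-learning libraries expose some equivalent of a positive-class weight, so in a real project you would probably use that rather than a hand-rolled loss.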
daavidreddit69 t1_j0b5292 wrote
- I believe not. To me it's just a concept, not a general method for solving the problem.