biophysninja t1_j0a4nc2 wrote on December 15, 2022 at 3:52 AM

Reply to [D] Dealing with extremely imbalanced dataset by hopedallas

There are a few ways to approach this, depending on the nature and complexity of the data and the compute available.

1- Use SMOTE: https://towardsdatascience.com/stop-using-smote-to-handle-all-your-imbalanced-data-34403399d3be

2- If your data is sparse, you can use PCA or autoencoders to reduce the dimensionality, then follow up with SMOTE.

3- Alternatively, use GANs to generate negative samples.
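
To make option 1 concrete, here is a rough sketch of the core idea behind SMOTE: pick a minority-class point, pick one of its nearest minority-class neighbours, and create a synthetic point somewhere on the line between them. This is a simplified pure-Python illustration, not the reference algorithm or any library's implementation; the function name `smote_sketch`, its parameters, and the toy data are all made up for the example. In practice you would more likely reach for the `SMOTE` class in `imblearn.over_sampling` (imbalanced-learn).

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between
    minority-class points and their nearest minority neighbours.

    Hypothetical helper for illustration only; needs at least 2 points.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base (excluding base itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # interpolation factor in [0, 1)
        # New point lies on the segment between base and the chosen neighbour
        synthetic.append([b + t * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy minority class with 4 two-dimensional samples (made-up data)
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_sketch(minority, n_new=6)
```

Because every synthetic point is an interpolation between two real minority points, the new samples stay inside the region the minority class already occupies. For option 2, the same step would be applied to the reduced representation (for example the PCA components) rather than the raw sparse features.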

biophysninja t1_ixfx9yv wrote on November 23, 2022 at 3:27 AM

Reply to [D] What advanced models would you like to see implemented from scratch? by itsstylepoint

Andrej Karpathy has been creating amazing videos on his channel implementing language models from scratch. If you can create videos at that level of accessibility while presenting fundamental concepts, you will make a difference.