
JustOneAvailableName t1_j107lf3 wrote

Perhaps something like keeping track of the harder data points and sampling half of each batch from those? What happened, exactly, when you trained on the hard examples only?
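
A minimal sketch of that sampling scheme, assuming per-sample losses are available after each forward pass (e.g. via `reduction='none'`); the names `hard_pool`, `update_hard_pool`, and `sample_batch_indices` are illustrative, not from any library:

```python
import numpy as np

rng = np.random.default_rng(0)

def update_hard_pool(per_sample_loss, indices, hard_pool, pool_size=10_000):
    """Track the dataset indices with the highest losses seen so far."""
    for idx, loss in zip(indices, per_sample_loss):
        hard_pool[idx] = max(float(loss), hard_pool.get(idx, 0.0))
    # Trim to the pool_size hardest examples.
    if len(hard_pool) > pool_size:
        keep = sorted(hard_pool, key=hard_pool.get, reverse=True)[:pool_size]
        return {i: hard_pool[i] for i in keep}
    return hard_pool

def sample_batch_indices(n_total, batch_size, hard_pool):
    """Draw half the batch from the hard pool, half uniformly at random."""
    n_hard = min(batch_size // 2, len(hard_pool))
    hard = (rng.choice(np.fromiter(hard_pool, dtype=np.int64),
                       size=n_hard, replace=False)
            if n_hard > 0 else np.empty(0, dtype=np.int64))
    rest = rng.integers(0, n_total, size=batch_size - n_hard)
    return np.concatenate([hard, rest])
```

In the training loop you would sample indices each step, compute unreduced per-example losses, and feed them back into `update_hard_pool` so the pool stays current as the model improves.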

2

Dartagnjan OP t1_j108k4y wrote

That is what I have already done. So far the loss just oscillates but remains high, which leads me to believe that either I am not training in the right way, i.e. the gap between the easy and hard training examples may be too drastic to bridge, or my model is simply not capable of handling the harder examples.

1

JustOneAvailableName t1_j1096lz wrote

Sounds like you need a larger batch size. What happens when you take a model that has plateaued and train it on the hard examples with a huge batch size?
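
If memory won't allow a genuinely huge batch, gradient accumulation is one way to approximate it; a sketch under the assumption that `model`, `loss_fn`, `optimizer`, and a `hard_loader` over the hard examples already exist:

```python
import torch

def accumulated_step(model, loss_fn, optimizer, hard_loader, accum_steps=32):
    """One optimizer step whose gradient averages over accum_steps mini-batches,
    giving an effective batch of accum_steps * loader batch size."""
    model.train()
    optimizer.zero_grad()
    for step, (x, y) in enumerate(hard_loader):
        loss = loss_fn(model(x), y) / accum_steps  # scale so gradients average
        loss.backward()                            # grads accumulate in .grad
        if step + 1 == accum_steps:
            break
    optimizer.step()
```

The averaging over many hard mini-batches smooths out the gradient noise, which is the usual suspect when the loss oscillates without decreasing.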

2