
K3tchM t1_j63l7xu wrote

I don't know which numerical optimization problem OP is trying to solve, but one major weakness of this paper is that their method requires two solver calls per instance per epoch, so training time might quickly become intractable.

OP should have a look at other methods that aim to solve this problem more efficiently, such as https://arxiv.org/abs/2112.03609 or, more recently, https://arxiv.org/abs/2203.16067
