Submitted by 4bedoe t3_yas9k0 in MachineLearning
seba07 t1_itd3x25 wrote
A few random things I learned in practice: "Because it worked" is a valid answer to why you chose certain parameters or algorithms. Older architectures like ResNet are still state of the art for certain scenarios. Execution time is crucial, so we often take the smallest models available.
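A minimal sketch of the kind of latency check this comes down to, assuming PyTorch and torchvision are available (the model choices, input size, and run count are illustrative, not from the thread):

```python
# Rough CPU latency comparison: smaller model vs. larger model.
# Assumes torchvision >= 0.13 (weights=None API); numbers are illustrative.
import time
import torch
from torchvision import models

def time_model(model, runs=50):
    model.eval()
    x = torch.randn(1, 3, 224, 224)
    with torch.no_grad():
        for _ in range(5):              # warm-up iterations
            model(x)
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
    return (time.perf_counter() - start) / runs

small = models.resnet18(weights=None)
large = models.resnet50(weights=None)
print(f"resnet18: {time_model(small) * 1e3:.1f} ms/inference")
print(f"resnet50: {time_model(large) * 1e3:.1f} ms/inference")
```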
Apprehensive-Grade81 t1_itgoofb wrote
This is a great answer. The simplest approach is often the best in practice.
Furrealpony t1_itdbrky wrote
I am still shocked that DenseNets are not the standard in the industry. Also, with a cross stage partial design you split the gradient flow and allow for a much easier training process. Is it the complexity of implementation that's holding them back, I wonder?
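For reference, a minimal sketch of what a cross stage partial (CSP) style block looks like, assuming PyTorch; layer counts and channel sizes are illustrative, not a faithful CSPNet reproduction:

```python
# CSP-style block: split channels, send one half through dense layers,
# keep the other half as a shortcut, then merge. Sizes are illustrative.
import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    def __init__(self, in_ch, growth):
        super().__init__()
        self.conv = nn.Sequential(
            nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, growth, kernel_size=3, padding=1, bias=False),
        )

    def forward(self, x):
        # Dense connectivity: concatenate new features onto the input.
        return torch.cat([x, self.conv(x)], dim=1)

class CSPDenseBlock(nn.Module):
    def __init__(self, channels, growth=32, num_layers=4):
        super().__init__()
        half = channels // 2
        layers, ch = [], half
        for _ in range(num_layers):
            layers.append(DenseLayer(ch, growth))
            ch += growth
        self.dense = nn.Sequential(*layers)
        self.transition = nn.Conv2d(ch + half, channels, kernel_size=1, bias=False)

    def forward(self, x):
        # Split the channels: one part bypasses the dense layers entirely,
        # so the gradient has both a short and a dense path.
        shortcut, dense_in = torch.chunk(x, 2, dim=1)
        out = self.dense(dense_in)
        return self.transition(torch.cat([shortcut, out], dim=1))

# Example: a 64-channel feature map keeps its shape through the block.
block = CSPDenseBlock(64)
print(block(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])
```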
Red-Portal t1_itliz8y wrote
Anybody who has tried to run DenseNet knows that it requires an absurd amount of memory in comparison to ResNets.
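A rough way to see this for yourself, assuming a CUDA GPU and torchvision (batch size and resolution are illustrative); for clean numbers, run each model in a fresh process so earlier allocations don't inflate the peak:

```python
# Rough peak training-memory comparison: the DenseNet concatenations keep
# many activations alive through the backward pass.
import torch
from torchvision import models

def peak_training_memory(model, batch=16):
    model = model.cuda()
    torch.cuda.reset_peak_memory_stats()
    x = torch.randn(batch, 3, 224, 224, device="cuda")
    model(x).sum().backward()                      # forward + backward
    return torch.cuda.max_memory_allocated() / 2**20  # MiB

print(f"resnet50:    {peak_training_memory(models.resnet50(weights=None)):.0f} MiB")
print(f"densenet121: {peak_training_memory(models.densenet121(weights=None)):.0f} MiB")
```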