jimmymvp t1_j4fcjly wrote on January 15, 2023 at 8:37 AM

Reply to comment by chaosmosis in Why is Super Learning / Stacking used rather rarely in practice? [D] by Worth-Advance-1232

Hm, I'm not sure about that. There's the mixture-of-experts idea, which is not exactly stacking: rather than combining the outputs of all the models, it specializes multiple models on parts of the data, so each data point gets assigned to a specific shallow model. What you need then is an assignment rule, mostly done by a classifier (the gate, or router). It's been shown that this is cheaper in terms of compute at evaluation time, since only the assigned model has to run for a given point. I'm not sure whether the idea has been abandoned by now, but Google Brain published a paper on this and there were subsequent works.
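
To make the routing idea concrete, here's a rough sketch of hard assignment with shallow experts. It is not the method from the Google Brain papers: the class name `HardMoE`, the nearest-centroid gate (standing in for a learned classifier), and the linear experts are all assumptions picked to keep the example small. The point it illustrates is that at prediction time each data point is only evaluated by the one expert it is routed to.

```python
import numpy as np


def fit_linear(X, y):
    # Ordinary least squares with a bias column; returns the weight vector.
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w


def predict_linear(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return Xb @ w


class HardMoE:
    """Hypothetical hard-routed mixture of experts (illustration only)."""

    def __init__(self, n_experts=2, n_iters=10, seed=0):
        self.n_experts = n_experts
        self.n_iters = n_iters
        self.seed = seed

    def _route(self, X):
        # Assignment rule: index of the nearest centroid for each point.
        d = ((X[:, None, :] - self.centroids[None, :, :]) ** 2).sum(axis=-1)
        return d.argmin(axis=1)

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        idx = rng.choice(len(X), self.n_experts, replace=False)
        self.centroids = X[idx].copy()
        # A few k-means style updates to partition the input space.
        for _ in range(self.n_iters):
            assign = self._route(X)
            for k in range(self.n_experts):
                if np.any(assign == k):
                    self.centroids[k] = X[assign == k].mean(axis=0)
        assign = self._route(X)
        # One shallow (linear) expert per region; fall back to a global
        # fit if a region ended up empty.
        self.experts = []
        for k in range(self.n_experts):
            mask = assign == k
            if mask.any():
                self.experts.append(fit_linear(X[mask], y[mask]))
            else:
                self.experts.append(fit_linear(X, y))
        return self

    def predict(self, X):
        assign = self._route(X)
        out = np.empty(len(X))
        # Each point is evaluated by exactly one expert.
        for k, w in enumerate(self.experts):
            mask = assign == k
            if mask.any():
                out[mask] = predict_linear(w, X[mask])
        return out


# Toy data: y = |x| is poorly fit by one line, well fit by two.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(500, 1))
y = np.abs(X[:, 0])

single_mse = np.mean((predict_linear(fit_linear(X, y), X) - y) ** 2)
moe_mse = np.mean((HardMoE(n_experts=2).fit(X, y).predict(X) - y) ** 2)
```

On this toy problem, comparing `single_mse` with `moe_mse` shows what the specialization buys you. In stacking, by contrast, every base model would be evaluated on every point and a meta-learner would combine their outputs.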

chaosmosis t1_j4fpg6g wrote on January 15, 2023 at 11:30 AM

I'd love the reference if you can find it.

jimmymvp t1_j4hy6xm wrote on January 15, 2023 at 9:19 PM

https://ai.googleblog.com/2022/11/mixture-of-experts-with-expert-choice.html?m=1

https://ai.googleblog.com/2022/01/scaling-vision-with-sparse-mixture-of.html?m=1

chaosmosis t1_j4lnsfe wrote on January 16, 2023 at 4:29 PM

Thanks!