fasttosmile t1_izgxj4n wrote on December 9, 2022 at 12:50 AM

Reply to comment by SeucheAchat9115 in [D] Workflows for quickly iterating over ideas without free access to super computers by [deleted]

Careful. There are literally dozens of language modeling papers that show an improvement on PTB (Penn Treebank) but do not scale to larger datasets.

farmingvillein t1_izi021q wrote on December 9, 2022 at 6:21 AM

True, but no one has really come up with a better methodology.

The best you can do is train on smaller data + make sure that you can tell yourself a story about how the new technique will still help when data is scaled up (and then hope that you are right).

(The latter is certainly an argument for staying at least semi-current with the literature, as it will help you build an intuition for what might scale up and what probably won't.)
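One cheap sanity check on the "will it still help at scale" story is to run both the baseline and the new technique at a few dataset sizes you can afford, then look at whether the relative gain is holding or shrinking as data grows. A minimal sketch, assuming you already have one validation loss per size for each variant; the function name `gain_trend` and its arguments are hypothetical, not from any library:

```python
import math


def gain_trend(sizes, baseline_loss, new_loss):
    """Return per-size relative gains and their slope against log10(size).

    sizes:         dataset sizes (e.g. token counts), one per run
    baseline_loss: validation loss of the baseline at each size
    new_loss:      validation loss of the new technique at each size

    A clearly negative slope means the gain shrinks as data grows,
    which is a warning sign that the technique may not scale.
    """
    if not (len(sizes) == len(baseline_loss) == len(new_loss)):
        raise ValueError("all inputs must have the same length")
    if len(sizes) < 2:
        raise ValueError("need at least two dataset sizes")

    xs = [math.log10(n) for n in sizes]
    # Relative improvement over the baseline (positive = new technique is better).
    gains = [(b - t) / b for b, t in zip(baseline_loss, new_loss)]

    # Ordinary least-squares slope of gain vs. log10(size).
    mean_x = sum(xs) / len(xs)
    mean_g = sum(gains) / len(gains)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("dataset sizes must not all be equal")
    slope = sum((x - mean_x) * (g - mean_g) for x, g in zip(xs, gains)) / sxx
    return gains, slope
```

This is only a rough signal: with three or four small-scale points the slope is noisy, and a flat trend at small sizes is no guarantee about behavior at sizes you never ran. It mainly helps you catch the case where the gain is already fading before you invest more compute.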