
fasttosmile t1_izgxj4n wrote

Careful. There are literally dozens of language modeling papers that report an improvement on PTB (Penn Treebank) but whose gains do not carry over to larger datasets.

3

farmingvillein t1_izi021q wrote

True, but no one has really come up with a better methodology.

The best you can do is train on a smaller dataset + make sure that you can tell yourself a story about why the new technique will still help when the data is scaled up (and then hope that you are right).

(The latter is certainly an argument for staying at least semi-current with the literature, as it will help you build an intuition for what might scale up and what probably won't.)

2