TheBrain85 t1_iymw874 wrote

Pretty biased selection method: taking the best ensemble from the M4 competition and evaluating it on the M3 competition. Although I'm not familiar with these datasets, they're from the same author, so presumably they have significant overlap and similarity. The real question is how hard it is to find such an ensemble without overfitting to the dataset.

3

SherbertTiny2366 t1_iynjxon wrote

How is it biased to try good-performing ensembles on another dataset?

And how is that overfitting?

Furthermore, just because the datasets' names begin with "M" does not mean that they "have significant overlap and similarity."

0

TheBrain85 t1_iyp1qrz wrote

Because if there's overlap between the datasets, or they contain similar data, the exact ensemble you use is essentially a hyperparameter optimized for that specific dataset. This is exactly why hyperparameter optimization uses cross-validation on a set separate from the test set. Selecting the ensemble based on its results on the M4 dataset is akin to optimizing hyperparameters on the test set, which is a form of overfitting.
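
To make the distinction concrete, here's a minimal sketch (synthetic data, scikit-learn, illustrative model choices; not the actual M-competition setup) contrasting ensemble selection on a held-out validation split with the leaky version that selects on the test set:

```python
import numpy as np
from itertools import combinations
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500)

# Three splits: fit models on train, pick the ensemble on val,
# and touch the test set only once, at the very end.
X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_tmp, y_tmp, test_size=0.25, random_state=0)

models = [Ridge(), DecisionTreeRegressor(max_depth=5), KNeighborsRegressor()]
for m in models:
    m.fit(X_train, y_train)

def ensemble_mse(members, X_eval, y_eval):
    # Simple averaging ensemble over the chosen member models.
    preds = np.mean([m.predict(X_eval) for m in members], axis=0)
    return mean_squared_error(y_eval, preds)

# Every non-empty subset of models is a candidate ensemble.
candidates = [c for r in range(1, len(models) + 1) for c in combinations(models, r)]

# Correct: the ensemble is chosen on the validation split,
# so the test score is an honest estimate.
best = min(candidates, key=lambda c: ensemble_mse(c, X_val, y_val))
print("test MSE (chosen on val): ", ensemble_mse(best, X_test, y_test))

# Leaky: choosing on the test set makes the subset itself a fitted
# hyperparameter, so the reported test score is optimistically biased.
leaky = min(candidates, key=lambda c: ensemble_mse(c, X_test, y_test))
print("test MSE (chosen on test):", ensemble_mse(leaky, X_test, y_test))
```

With only three models the gap is small, but the search space of ensembles in the M competitions is far larger, which is what makes selecting on the evaluation data problematic.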

The datasets are from the same author, same series of competitions: https://en.wikipedia.org/wiki/Makridakis_Competitions#Fourth_competition,_started_on_January_1,_2018,_ended_on_May_31,_2018

"The M4 extended and replicated the results of the previous three competitions"

3

WikiSummarizerBot t1_iyp1s69 wrote

Makridakis Competitions

Fourth competition, started on January 1, 2018, ended on May 31, 2018

>The fourth competition, M4, was announced in November 2017. The competition started on January 1, 2018 and ended on May 31, 2018. Initial results were published in the International Journal of Forecasting on June 21, 2018. The M4 extended and replicated the results of the previous three competitions, using an extended and diverse set of time series to identify the most accurate forecasting method(s) for different types of predictions.

2

SherbertTiny2366 t1_iyq6prw wrote

There is no overlap at all; it's a completely new dataset. There might be similarities in the sense that both contain time series at certain frequencies, but in no way could one speak of "training on the test set."

1