Submitted by popcornn1 t3_y7x8vp in MachineLearning

Hello,

I am just starting a new project on time series forecasting and am considering which library would be best to use.

In a previous project I used sktime, but recently I also found modeltime and darts.

So my question is: have you used any of these libraries (maybe more than one), and could you tell me what you like and dislike about them?

Thanks in advance for all answers.

25

Comments

davidrodord92 t1_isyz1tw wrote

Haven't used all of them, but I suggest that instead of focusing on the models you focus on the data: review which transformations or cleaning steps could give better results. Decomposition, decorrelation, log transforms, wavelets, etc.
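As a rough illustration of the kind of transformation meant here, this is a minimal, dependency-free sketch of a log transform followed by first differencing, plus its inverse. The function names and the sample numbers are made up for the example, and it assumes a strictly positive series.

```python
import math

def log_diff(series):
    """First differences of the natural log of a strictly positive series."""
    logs = [math.log(v) for v in series]
    return [b - a for a, b in zip(logs, logs[1:])]

def invert_log_diff(first_value, diffs):
    """Rebuild the original series from its first value and the log differences."""
    out = [first_value]
    for d in diffs:
        out.append(out[-1] * math.exp(d))
    return out

# Made-up data growing 10% per step: the log differences are (nearly) constant,
# which is usually easier for a model to fit than the raw exponential curve.
sales = [100.0, 110.0, 121.0, 133.1]
diffs = log_diff(sales)
restored = invert_log_diff(sales[0], diffs)
```

A model would be fit on `diffs`, and its forecasts mapped back to the original scale with `invert_log_diff`.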

If two libraries implement the same algorithm, the results probably won't change at all. For example, the Wavelet Toolbox in MATLAB and PyWavelets give the same results even though they come from different teams of developers, because both use the same DWT algorithm.
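To make that concrete, here is a minimal sketch of a single-level Haar DWT written from the textbook definition. It is not taken from either library, it only handles even-length input, and sign or boundary conventions for the detail coefficients can differ between implementations.

```python
import math

def haar_dwt(signal):
    """Single-level Haar DWT; returns (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("this sketch only handles even-length input")
    s = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / s for a, b in pairs]  # scaled pairwise sums
    detail = [(a - b) / s for a, b in pairs]  # scaled pairwise differences
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the signal from both coefficient lists."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

x = [4.0, 2.0, 5.0, 5.0]  # made-up signal
cA, cD = haar_dwt(x)
x_back = haar_idwt(cA, cD)
```

Since the transform is a fixed, deterministic computation, any correct implementation with the same conventions has to return the same coefficients.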

2

mangotheblackcat89 t1_iszcvo7 wrote

The answer depends on several factors: How many time series do you need to forecast? What resources do you have available? How much time do you have? What are the business requirements?

I would add any of Nixtla's libraries to your list: statsforecast, neuralforecast, or mlforecast. If your data has a hierarchical structure, you can also try their hierarchicalforecast library. I've used the first two and they work well, are easy to implement, and are relatively fast. They also provide a lot of user support.

Above all, just don't use Meta's Prophet, unless all you care about are nice looking plots.

2

Gere1 t1_it1qipu wrote

Don't miss tuned ARIMA and ETS (e.g. statsmodels). Include a library that has N-BEATS and N-HiTS (darts, gluonts). Tbh, Darts seems to cover all of them. Maybe DeepAR (gluonts). Most models don't do real multivariate forecasts, though.

Set up an honest evaluation and test all models. You can do some light pre-processing of the data, but don't spend too much time on it.

There aren't any magic tricks. Most methods won't beat a trivial baseline. Predicting the future usually does not work because the data lacks a predictive signal. How is the model going to know what Musk will tweet tomorrow? The only thing that works is fitting boring seasonality and fitting the effect of holidays, etc. Notice that neither of these is really about the future.
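A minimal sketch of what an honest evaluation against trivial baselines could look like: a rolling-origin backtest that never looks ahead, scoring a naive and a seasonal naive forecast with MAE. All names, the weekly season length, and the data are assumptions made for the illustration.

```python
def naive_forecast(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon

def seasonal_naive_forecast(history, horizon, season_length):
    """Repeat the last full season."""
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]

def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rolling_origin_mae(series, forecaster, min_train, horizon):
    """Average MAE over successive splits; each forecast sees only past data."""
    errors = []
    for cutoff in range(min_train, len(series) - horizon + 1):
        train = series[:cutoff]
        test = series[cutoff:cutoff + horizon]
        errors.append(mae(test, forecaster(train, horizon)))
    return sum(errors) / len(errors)

# Made-up, perfectly periodic weekly pattern repeated four times.
series = [10, 12, 11, 13, 20, 25, 9] * 4

naive_score = rolling_origin_mae(series, naive_forecast, min_train=14, horizon=7)
seasonal_score = rolling_origin_mae(
    series,
    lambda history, horizon: seasonal_naive_forecast(history, horizon, season_length=7),
    min_train=14,
    horizon=7,
)
```

Any candidate model can be passed through the same `rolling_origin_mae` call as a function of `(history, horizon)`. If it does not clearly beat `seasonal_score`, it is probably not worth the extra complexity.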

Let us know what worked in a critical evaluation in the end!

Here is a write-up (https://www.sciencedirect.com/journal/international-journal-of-forecasting/vol/38/issue/4) of the M5 competition (https://www.kaggle.com/competitions/m5-forecasting-accuracy). But note that that competition was more about fitting holidays and other tricks. There were a lot of zeros in the target. To predict the trend, many people used a guessed fudge factor. Also look at the difference between the public and private leaderboards to convince yourself that even the predictions of the top Kagglers in the world were a noisy mess for that data set. I'm afraid predicting the future isn't solved yet.

6

tblume1992 t1_it2nnb5 wrote

I think the other comments are spot on. It depends on your data. How many time series do you need to predict? How 'multivariate' is your data, meaning do you have a ton of variables or only a few?

Don't know about modeltime, but both darts and sktime are fine. If you have a lot of good-quality variables, though, it's worth trying boosted trees and 'featurizing' time. If you just have holidays, it's probably best to stick with time series approaches.
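For anyone unsure what 'featurizing' time means in practice, here is a minimal standard-library sketch that turns a date into numeric columns a tree model could split on. The feature names and the choice of features are only an assumption for the example; real projects usually add lags, rolling statistics, and holiday flags as well.

```python
import math
from datetime import date, timedelta

def time_features(day):
    """Turn a calendar date into numeric features."""
    dow = day.weekday()  # Monday = 0, Sunday = 6
    return {
        "day_of_week": dow,
        "month": day.month,
        "day_of_year": day.timetuple().tm_yday,
        "is_weekend": int(dow >= 5),
        # Optional cyclic encoding so Sunday and Monday end up close together;
        # trees can work without it, linear models usually benefit more.
        "dow_sin": math.sin(2 * math.pi * dow / 7),
        "dow_cos": math.cos(2 * math.pi * dow / 7),
    }

start = date(2022, 10, 17)  # a Monday
rows = [time_features(start + timedelta(days=i)) for i in range(7)]
```

Each dict in `rows` becomes one row of the feature matrix, joined with whatever other variables you have for that timestamp.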

If you also have multiple related time series, such as products that belong to different categories, trees may also be worth a look if you pass those categories as variables. Alternatively, you could look at hierarchical methods like those in Nixtla's portfolio of packages.

But definitely give us some more info!

7

Elegant_String4964 t1_it2ucfw wrote

Hi, creator of Darts here. I just wanted to add my two cents on top of what was already said:
* Darts is made in a way such that it's trivial to use simple models (ARIMA, linear regression, etc), which is definitely what you should start with first
* It incorporates most of the models from statsforecast too
* It allows you to use any sklearn-like model (or LGBM, Catboost, XGBoost), featurize the time axis for
* Works seamlessly if your time series are multi-dimensional, or if you have multiple time series (e.g. multiple observations of some signal)
* If you want to try deep learning models, it also has you covered, works on GPUs, etc.

Hope this helps.

6

tblume1992 t1_it87xj1 wrote

Hey, I know you guys are very open to adding new models (and I would love to add some), but I don't remember seeing much documentation on adding ways to featurize time. Is the process the same? Don't mean to hijack the post, just curious!

1