Submitted by augusts99 t3_11clfxx in deeplearning

Hey all! I'm new to deep learning and am asking this to see if anyone has experience or suggestions regarding this.

I'm working with two types of models: model 1 makes fairly good predictions but fluctuates a lot over time; model 2 (an LSTM, a recurrent network) is better at capturing trends.

Through some experimenting I found that combining the two models can produce decent results, as it sort of combines both perks: a decent order of magnitude in the values from model 1, a decent trend from model 2. A standalone LSTM does not perform that well, by the way.

One of the inputs for model 2 (the LSTM model) is thus the predicted sequence made by model 1. This means that for training I feed in a sequence that is already close to the desired output... My reasoning was that if I do this, the model will simply learn to pass this sequence through unchanged.

To avoid that, I switched to feeding in only the mean of this sequence, plus some synthetic data, reasoning that the model should then also learn to adjust the input value when necessary.
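Roughly, the setup looks like this (a minimal sketch with made-up features and synthetic data; Ridge stands in for the LSTM to keep it dependency-light, and the noise added to the stage-1 input plays the role of my synthetic data):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Synthetic series: a slow trend plus noise.
T = 500
t = np.arange(T)
trend = np.sin(t / 50.0)
y = trend + 0.3 * rng.normal(size=T)

# Stage 1: per-timestep model on some instantaneous (made-up) features.
X1 = np.column_stack([t / T, np.cos(t / 50.0)])
rf = RandomForestRegressor(n_estimators=50, random_state=0)
rf.fit(X1, y)
y_hat1 = rf.predict(X1)  # fluctuates around the trend

# Stage 2: windowed model over stage-1 predictions plus other sequences.
# Noise is added to the stage-1 input during training so the model
# learns to correct it instead of just copying it through.
W = 10
def windows(a, w):
    return np.lib.stride_tricks.sliding_window_view(a, w)

noisy = y_hat1 + 0.2 * rng.normal(size=T)
# Inputs: last W stage-1 predictions + W values of another sequence
# (here the trend stands in for the "other variable sequences").
X2 = np.column_stack([windows(noisy, W), windows(trend, W)])
y2 = y[W - 1:]  # predict the current value
stage2 = Ridge(alpha=1.0).fit(X2, y2)
print(stage2.score(X2, y2))
```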

I arrived at this mostly by experimenting, and I feel it lacks proper theoretical grounding.

About training a deep learning model on an input sequence or value that is already (close to) correct: does anyone know what exactly the theory behind this is? What is common practice?

Thanks in advance! I of course can elaborate if needed.

13

Comments


augusts99 OP t1_ja3n6aa wrote

Perhaps I should elaborate that the predicted sequence made by model 1 is not the only input sequence to the LSTM model. I also feed in other variable sequences, which I hope the LSTM uses to pick up the correct trends.

1

thehallmarkcard t1_ja5shr4 wrote

Am I understanding correctly that you train one model from input features to output, minimizing the error to the true output, then take the predictions of this first model and feed them into the RNN with other features, again minimizing the loss to the true output?

1

augusts99 OP t1_ja5y2rt wrote

Yeah! Currently, model 1 makes predictions based on certain input features, timestep by timestep. The LSTM model then uses that predicted sequence, together with other variable sequences, to make the predictions more robust and stable, and to give them more correct trends. At least, that is the idea.

1

thehallmarkcard t1_ja604kn wrote

So with no other info on your methodology I can't think of any issue with this. In some sense your RNN may be modeling the trend component and the other model the volatility. But that's hard to say without knowing more. I am curious whether you tried stacking the models directly, such that the weights optimize through both models simultaneously. But that depends on what kind of models you have, and isn't necessarily better, just different.
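One thing worth knowing about training the second model on the first model's predictions: common practice (stacked generalization) is to generate those predictions out-of-fold, so the second model never sees stage-1 predictions made on data stage 1 was trained on. Otherwise the stage-1 input looks nearly perfect during training and the second model just learns to copy it. A sketch with scikit-learn (synthetic data, made-up shapes; for sequential data you'd use an expanding-window split rather than plain K-fold):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = X[:, 0] + 0.1 * rng.normal(size=300)

rf = RandomForestRegressor(n_estimators=50, random_state=0)

# In-sample predictions are nearly perfect (the forest has largely
# memorized the training data), so a second model trained on them
# would learn to pass them through unchanged.
rf.fit(X, y)
in_sample = rf.predict(X)

# Out-of-fold predictions reflect the error the first model actually
# makes on unseen data; feed *these* to the second model instead.
oof = cross_val_predict(rf, X, y, cv=5)

print(np.mean((in_sample - y) ** 2), np.mean((oof - y) ** 2))
```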

1

augusts99 OP t1_ja7jo91 wrote

Okay, thank you for the feedback! That could be interesting! Model 1 is a random forest and uses different inputs than the LSTM, and at my skill level I think making the models predict simultaneously may be too big of a hassle. Also, what is meant by stacking the models, if I may ask?

1

FunBit9789 t1_ja8aim7 wrote

Ah, ok. If it were also a NN you could have outputs feed directly from one model into the other with multiple heads, but with a random forest as the first model my suggestion doesn't really make sense.
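For what it's worth, the non-joint flavor of stacking does work with a random forest as the base model; scikit-learn ships it as `StackingRegressor`, which handles the out-of-fold bookkeeping internally. A rough sketch on synthetic data:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)

# The base model's out-of-fold predictions become features for the
# final estimator; no gradients flow between the two stages.
stack = StackingRegressor(
    estimators=[("rf", RandomForestRegressor(n_estimators=50, random_state=0))],
    final_estimator=Ridge(),
    passthrough=True,  # also give the final estimator the raw features
)
stack.fit(X, y)
print(stack.predict(X[:5]).shape)
```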

1