Submitted by Tigmib t3_10awo8f in MachineLearning
janpf t1_j4ai7ia wrote
If you use synthetic data (from the crop simulation models), the model will kind of reverse-engineer it (it will learn what the simulation models are doing).
Using a mix of it with real-world data is like regularizing your model toward the simulation rules (adding a prior).
This makes sense, and mixing data is often done. But "making sense" doesn't necessarily mean it helps ... that depends a lot on your application. The next question is how much synthetic data to mix in ... fundamentally you'll have to figure that out by trial and error, with some way of measuring whether things are getting better for your extrinsic goal (your business objective).
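To make the trial-and-error part concrete, here is a rough sketch of a sweep over the amount of synthetic data, scored only on held-out real data. Everything in it is made up for illustration: the toy linear "simulator" (`sim_w`), the "real" process (`true_w`), the sample sizes, and the ridge model. In practice the validation metric would be whatever stands in for your actual goal.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, weights, noise):
    """Draw n rows of a toy linear process (hypothetical stand-in for crop data)."""
    X = rng.normal(size=(n, 3))
    y = X @ weights + rng.normal(scale=noise, size=n)
    return X, y

# Assumption: the simulator is close to reality, but not identical.
true_w = np.array([1.0, -2.0, 0.5])   # "real world"
sim_w = np.array([0.8, -1.5, 0.0])    # "crop simulation model"

X_real, y_real = make_data(30, true_w, noise=1.0)    # scarce real training data
X_val, y_val = make_data(200, true_w, noise=1.0)     # held-out REAL data only
X_syn, y_syn = make_data(1000, sim_w, noise=0.1)     # cheap synthetic data

def fit_ridge(X, y, l2=1e-3):
    """Closed-form ridge regression."""
    A = X.T @ X + l2 * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)

def val_mse(w):
    """Mean squared error on the real validation set."""
    return float(np.mean((X_val @ w - y_val) ** 2))

def sweep(n_synthetic_options=(0, 10, 30, 100, 300, 1000)):
    """Train on real data plus n_syn synthetic rows, score on real validation data."""
    results = {}
    for n_syn in n_synthetic_options:
        X = np.vstack([X_real, X_syn[:n_syn]])
        y = np.concatenate([y_real, y_syn[:n_syn]])
        results[n_syn] = val_mse(fit_ridge(X, y))
    return results

results = sweep()
for n_syn, mse in results.items():
    print(f"{n_syn:5d} synthetic rows -> real validation MSE {mse:.3f}")

best = min(results, key=results.get)
print("lowest validation MSE at", best, "synthetic rows")
```

The key design choice is that the validation set contains no synthetic rows, so the sweep measures whether the simulator prior helps on the data you actually care about rather than how well the model reproduces the simulator.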
Tigmib OP t1_j4bezs5 wrote
Yes, that's true. This is also what I thought about. Using a mixed dataset or a transfer learning approach (first train on synthetic data, then retrain on real-world data) should incorporate the domain knowledge. But you are right, right now that's just a hypothesis... but I will test it!
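A minimal sketch of how the transfer learning variant could be set up, assuming a toy linear model trained by plain gradient descent. The data generators, weights, learning rates, and step counts are all hypothetical placeholders, not anything from an actual crop model.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_data(n, weights, noise):
    """Toy linear data generator (hypothetical stand-in for real or simulated data)."""
    X = rng.normal(size=(n, 3))
    y = X @ weights + rng.normal(scale=noise, size=n)
    return X, y

def gd_fit(X, y, w_init, lr, steps):
    """Gradient descent on mean squared error, starting from w_init."""
    w = w_init.copy()
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

# Assumption: simulator weights are near, but not equal to, the real ones.
true_w = np.array([1.0, -2.0, 0.5])
sim_w = np.array([0.8, -1.5, 0.0])

X_syn, y_syn = make_data(1000, sim_w, noise=0.1)
X_real, y_real = make_data(30, true_w, noise=1.0)
X_val, y_val = make_data(200, true_w, noise=1.0)

# Step 1: pretrain on synthetic data, so the model absorbs the simulator's rules.
w_pre = gd_fit(X_syn, y_syn, np.zeros(3), lr=0.05, steps=500)

# Step 2: fine-tune on the small real dataset with a short, gentle schedule.
w_ft = gd_fit(X_real, y_real, w_pre, lr=0.01, steps=50)

# Baseline: same short schedule on real data, but starting from scratch.
w_scratch = gd_fit(X_real, y_real, np.zeros(3), lr=0.01, steps=50)

print("pretrained only :", mse(X_val, y_val, w_pre))
print("fine-tuned      :", mse(X_val, y_val, w_ft))
print("from scratch    :", mse(X_val, y_val, w_scratch))
```

Comparing the three numbers on real validation data is what actually tests the hypothesis. How many fine-tuning steps to run before the synthetic prior gets washed out is one more knob to tune.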