Seankala
Seankala t1_is8qtf8 wrote
Reply to comment by [deleted] in [D] Manually creating the target data is considered as data leakage. by [deleted]
Let me ask you then, why would you think this is cheating?
Seankala t1_is8nrdc wrote
As long as you're not using the target data for training then you're fine.
Seankala t1_irzxiq8 wrote
Depends on what I'll be working on and who I'll be working with. I know plenty of brilliant people at big-name companies who are complete asshats, and I would never work with them no matter how much money I was offered (unless it was 7 figures). If the startup is working on something I believe in and the people are likable, I'd choose the startup.
Seankala t1_iruw37k wrote
Can't speak for Keras, but in PyTorch's implementation of the cross entropy loss, the softmax is computed inside the loss function. Therefore, you'd feed raw, unnormalized logits into the loss function rather than softmax outputs.
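To make that concrete, here is a rough plain-Python sketch of what a cross entropy loss with a built-in softmax does. This is not PyTorch's actual code; the function names `log_softmax`, `cross_entropy`, and `softmax` and the example numbers are just assumptions for illustration. It also shows why applying softmax yourself first gives a different (wrong) loss value.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax: subtract the max before exponentiating.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def cross_entropy(logits, target):
    # The softmax happens here, inside the loss, so the input must be raw logits.
    return -log_softmax(logits)[target]

def softmax(logits):
    return [math.exp(v) for v in log_softmax(logits)]

logits = [2.0, 0.5, -1.0]  # made-up unnormalized scores for 3 classes
target = 0                 # index of the true class

# Intended usage: raw logits go straight into the loss.
correct_loss = cross_entropy(logits, target)

# Common mistake: softmax is applied twice (once by you, once by the loss).
double_softmax_loss = cross_entropy(softmax(logits), target)

print(correct_loss, double_softmax_loss)
```

The second value is larger because the probabilities are squashed into [0, 1] before the loss normalizes them again, which flattens the distribution the loss sees.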
Seankala t1_iruvc69 wrote
I work at a startup that provides solutions to exactly this problem. It's just multi-label classification; you might want to look more into that.
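In case the term is new, a minimal sketch of the usual multi-label setup, assuming one independent sigmoid per label and a binary cross entropy averaged over labels. The label names, scores, and the 0.5 threshold are hypothetical, and this is plain Python rather than any particular library's API.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(probs, targets):
    # Each label is scored as its own yes/no problem, then averaged.
    total = 0.0
    for p, t in zip(probs, targets):
        total += t * math.log(p) + (1 - t) * math.log(1 - p)
    return -total / len(probs)

def predict_labels(logits, names, threshold=0.5):
    # Unlike softmax, several labels (or none) can pass the threshold.
    return [n for z, n in zip(logits, names) if sigmoid(z) >= threshold]

label_names = ["sports", "politics", "tech"]  # hypothetical labels
logits = [2.0, -1.5, 0.3]                     # made-up model scores
targets = [1, 0, 1]                           # multi-hot ground truth

predicted = predict_labels(logits, label_names)
loss = binary_cross_entropy([sigmoid(z) for z in logits], targets)
print(predicted, loss)
```

The difference from ordinary multi-class classification is that the targets are multi-hot rather than one-hot, so the per-label probabilities do not need to sum to 1.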
Seankala t1_iru9kdm wrote
Reply to comment by Empty-Painter-3868 in [D] What are your thoughts about weak supervision? by ratatouille_artist
I second forgetting about Snorkel and the like. I've found it works better for me to just label the data points myself and continuously refine the pseudo-labels generated by models.
Seankala t1_is8rc1t wrote
Reply to comment by [deleted] in [D] Manually creating the target data is considered as data leakage. by [deleted]
The necessity of ML for your scenario has nothing to do with your question...