Ali_M t1_j2xkhl3 wrote
Reply to comment by groman434 in [Discussion] If ML is based on data generated by humans, can it truly outperform humans? by groman434
Supervised learning isn't the only game in town, and human demonstrations aren't the only kind of data we can collect. For example, we can record human preferences over model outputs and then use this data to fine-tune models with reinforcement learning (e.g. https://arxiv.org/abs/2203.02155). Even though I'm not a musician, I can still make a meaningful judgement about whether one piece of music is better than another. By analogy, because judging an output is often easier than producing it, we can use human preferences to train models that are capable of superhuman performance.
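
To make the preference idea a bit more concrete, here's a rough sketch of the first step: fitting a reward model from pairwise comparisons, which could then serve as the reward signal for RL fine-tuning. It uses a pairwise logistic (Bradley-Terry style) loss, which is one common choice and is my assumption here, not something spelled out above. The feature vectors, the hidden scoring rule standing in for a human rater, and all function names are hypothetical, and a real setup would use a neural network over model outputs rather than a linear model over two numbers.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def reward(w, x):
    # Linear reward model: a scalar score for one output's feature vector.
    return sum(wi * xi for wi, xi in zip(w, x))


def fit_reward_model(pairs, dim, lr=0.1, epochs=200):
    """Fit w by minimising -log(sigmoid(r(preferred) - r(rejected))).

    pairs: list of (preferred_features, rejected_features) tuples.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for good, bad in pairs:
            margin = reward(w, good) - reward(w, bad)
            # Gradient step on the pairwise logistic loss.
            scale = 1.0 - sigmoid(margin)
            for i in range(dim):
                w[i] += lr * scale * (good[i] - bad[i])
    return w


def hidden_quality(x):
    # Hypothetical stand-in for a human rater's taste. The learner never
    # sees this function, only which of two outputs was preferred.
    return 2.0 * x[0] - 1.0 * x[1]


rng = random.Random(0)
pairs = []
for _ in range(50):
    a = [rng.random(), rng.random()]
    b = [rng.random(), rng.random()]
    # The "rater" only says which output is better, never how to make one.
    pairs.append((a, b) if hidden_quality(a) > hidden_quality(b) else (b, a))

w = fit_reward_model(pairs, dim=2)
agreement = sum(reward(w, g) > reward(w, b) for g, b in pairs) / len(pairs)
print("learned weights:", w)
print("agreement with rater on training pairs:", agreement)
```

The point of the toy is that the rater never demonstrates a good output, only compares two, yet the learned reward still ends up ranking outputs the way the rater would. Optimising a policy against that reward is the reinforcement learning part, which this sketch leaves out.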