Submitted by Buck-Nasty t3_10ozflx in singularity
_Just7_ t1_j6kv8fs wrote
Reply to comment by starstruckmon in Chinese Search Giant Baidu to Launch ChatGPT-Style Bot by Buck-Nasty
Hate to be that guy, but source on models in single languages being better? I thought more data = better modeling. Why would it perform worse if you also include the Spanish and Chinese parts of the internet?
starstruckmon t1_j6kygds wrote
I can't really speculate on that topic. It's currently an active area of research.
To be honest, this problem is so widely known that I hadn't considered finding sources to support the claim. Here is the best authoritative source I could quickly find:
https://arxiv.org/abs/2012.15613
It may seem counterintuitive to link to a paper that claims to fix this issue, but a paper proposing a fix is the most likely place to find the problem discussed. Also, if you read it carefully, you'll see that while the authors managed to narrow the gap, it still persists.