Submitted by besabestin t3_10lp3g4 in MachineLearning
visarga t1_j6c0o3e wrote
Reply to comment by golongandprosper in Few questions about scalability of chatGPT [D] by besabestin
I very much doubt they do this in real time. The model is responding too fast for that.
They are probably used for RLHF model alignment: keeping the model polite, helpful and harmless, and generating more samples of solved tasks, whether by vetting our ChatGPT interaction logs, by using the model from the console like us to solve tasks, or by effectively writing the answers themselves where the model fails.