Submitted by [deleted] t3_11d4ka5 in MachineLearning
tdgros t1_ja71ave wrote
Reply to comment by CellWithoutCulture in [D] Is RL dead/worth researching these days? by [deleted]
> toolformer
Are you sure there's RL in Toolformer? I thought it was mostly self-supervised and fine-tuned.
CellWithoutCulture t1_ja7dklj wrote
> Toolformer
...Oh, you're right, it didn't. I assumed they let it use any tool, which would need RL. But it seems they had pre-labelled ways to use tools.
Thanks for pointing that out.