dancingnightly t1_je0o082 wrote

The benefit of finetuning or training your own text model (in the olden days on BERT, now through the OpenAI API) over just using contextual semantic search is shrinking day by day... especially with the extended context window of GPT-4.

If you want something in-house, finetuning GPT-J or a similar model could be the way to go, but it's definitely not the career direction I'd take.
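
To make "contextual semantic search" concrete, here's a rough sketch of the idea: rank your text chunks by similarity to the question, then paste the best ones into the prompt instead of finetuning anything. Everything here is an assumption for illustration. The `embed`, `cosine`, `top_k` and `build_prompt` names are made up, the bag-of-words "embedding" is a toy stand-in for a real learned embedding model, and the prompt wording is arbitrary. It doesn't call the OpenAI API or any other service.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a learned embedding: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, chunks, k=2):
    """Put the retrieved chunks into the model's context window."""
    context = "\n".join(f"- {c}" for c in top_k(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical in-house documents, made up for this example.
docs = [
    "Invoices are archived after 90 days.",
    "The VPN requires two-factor authentication.",
    "The finance team can restore archived invoices.",
]

print(build_prompt("Who can restore archived invoices?", docs))
```

With a real embedding model you'd swap out `embed` and keep the rest roughly the same. A bigger context window just means a larger `k`.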

2

antonivs t1_je1d8o0 wrote

The training corpus here is in the multi-TB range, so from what I understand it probably isn't going to work with the OpenAI API currently.

1