Comments


howard__wolowitz t1_itbmoax wrote

You may want to look into RetGen by Microsoft. The repo also links model checkpoints for download.

3

bivouac0 t1_itis6yy wrote

There's a GitHub project that an individual put together based on the RETRO paper. If you check out the issues list, there is some info on work toward a pretrained model.

There is also the Hugging Face RAG model, and Facebook has a couple of models on the HF hub.

Note that the RAG model is an older approach to retrieval, so you probably want to be looking at the RETRO project above.

2

fastglow t1_ith39gf wrote

RETRO doesn't have an open source version at the moment, but it doesn't seem like it would be hard to implement, at least on smaller-scale corpora. Dealing with 1.75 trillion-token corpora is a challenge in itself.
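
To give a rough idea of what the retrieval half looks like on a small corpus, here is a toy sketch: split the corpus into fixed-size chunks, embed each chunk, and return the nearest chunks for a query. Everything here is an assumption for illustration, not the paper's implementation. The function names (`chunk`, `embed`, `build_index`, `retrieve`) are made up, the chunk size is arbitrary, and bag-of-words counts with brute-force cosine similarity stand in for the learned dense embeddings and approximate nearest-neighbour index a real system would need.

```python
import math
from collections import Counter

CHUNK_SIZE = 4  # tokens per chunk; arbitrary toy value, not from the paper


def chunk(tokens, size=CHUNK_SIZE):
    """Split a token list into consecutive fixed-size chunks."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def embed(tokens):
    """Stand-in embedding: a bag-of-words count vector (hypothetical choice)."""
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(corpus):
    """Chunk a whitespace-tokenized corpus and store (chunk, embedding) pairs."""
    return [(c, embed(c)) for c in chunk(corpus.lower().split())]


def retrieve(index, query, k=2):
    """Return the k chunks most similar to the query, by brute-force search."""
    q = embed(query.lower().split())
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [" ".join(c) for c, _ in ranked[:k]]


corpus = "the cat sat on the mat while the dog slept near the door"
index = build_index(corpus)
print(retrieve(index, "dog slept", k=1))
```

The brute-force `sorted` call is the part that stops working at scale, which is where the trillion-token problem comes in.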

1

Seankala t1_itjqp2b wrote

Would open-domain QA models be relevant for this topic?

1

invertedpassion OP t1_itjwg63 wrote

Yes, which ones?

1

Seankala t1_itk390z wrote

Try looking up DensePhrases. It was made by a colleague of mine and may be what you're looking for. They also have an online demo you can try.

I'm not sure what you mean by "retrieval-based language model" though. I don't think there's any language model that's made solely for the purpose of retrieval.

1