Submitted by ofirpress t3_xvkhz9 in MachineLearning

Self-ask and Self-ask + Google Search

We just put out this preprint, which shows that simply using a new prompt (we call it Self-ask) improves GPT-3's ability to answer complex questions.

This prompt simply has the model ask (and answer) sub-questions before it answers the main input question.


Figure: Self-ask with a 1-shot prompt answering a question (using GPT-3)
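In plain text, the prompt format looks roughly like this (structure as in the paper; the exact 1-shot example there may differ slightly):

```
Question: Who lived longer, Theodor Haecker or Harry Vaughan Watkins?
Are follow up questions needed here: Yes.
Follow up: How old was Theodor Haecker when he died?
Intermediate answer: Theodor Haecker was 65 years old when he died.
Follow up: How old was Harry Vaughan Watkins when he died?
Intermediate answer: Harry Vaughan Watkins was 69 years old when he died.
So the final answer is: Harry Vaughan Watkins

Question: <your question here>
Are follow up questions needed here:
```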

The format of this prompt also allows us to automatically parse out the sub-questions and have Google answer them instead of GPT-3. This improves performance and lets the combined system answer questions that neither GPT-3 nor Google could answer on its own.


Figure: Self-ask + Google Search. GPT-3 text in green, Google-retrieved text in cyan.
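A minimal Python sketch of that loop (not the exact notebook code; `complete(prompt, stop)` and `search(query)` are stand-ins for an LM completion API and a search API):

```python
# Minimal sketch of the Self-ask + search loop. `complete(prompt, stop)` and
# `search(query)` are stand-ins for an LM completion API and a search API;
# the demo notebook wires these up to GPT-3 and Google.

FOLLOW_UP = "Follow up:"
INTERMEDIATE = "Intermediate answer:"
FINAL = "So the final answer is:"

def self_ask_with_search(question, one_shot_prompt, complete, search):
    prompt = (one_shot_prompt
              + f"\nQuestion: {question}\n"
              + "Are follow up questions needed here:")
    while True:
        # Generate until the model is about to write an intermediate answer
        # (or the final answer), then cut it off.
        text = complete(prompt, stop=[INTERMEDIATE, FINAL])
        prompt += text
        if FOLLOW_UP in text:
            # Parse out the newest sub-question and let search answer it.
            sub_question = text.rsplit(FOLLOW_UP, 1)[-1].strip()
            answer = search(sub_question)
            prompt += f"{INTERMEDIATE} {answer}\n"
        else:
            # No more follow-ups: ask for the final answer directly.
            return complete(prompt + FINAL, stop=["\n"]).strip()
```

The stop sequences do the work here: generation is cut off right before the model would write an intermediate answer itself, so the retrieved answer can be spliced in instead.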


Google answers the following question incorrectly:


https://preview.redd.it/n98ika5dntr91.png?width=876&format=png&auto=webp&s=0a89508001815d4ef822aa70ae668a20fb88fe46

But Self-ask + Google gets this right:


https://preview.redd.it/6nprx9dfntr91.png?width=1090&format=png&auto=webp&s=4df4a5f832e7032a8cb8dfe234472fac6f874558

Our paper has lots more info:

https://ofir.io/self-ask.pdf

The Self-ask + Google Search method is at:

https://github.com/ofirpress/self-ask/blob/main/self-ask_plus_search-engine_demo.ipynb

I'll be here to answer any questions!

73

Comments


RoboticAttention t1_ir1fl4p wrote

Very interesting! What other ways of augmenting AI capabilities do you see following? Do you think the effectiveness of this suggests that symbolic approaches will be connected with NNs? For example, will an AI agent be equipped with a separate mathematical theorem checker, or some other whiteboard to note down intermediate calculations?

4

ofirpress OP t1_ir1ha5m wrote

Writing down intermediate calculations is not a concept we invented. In our paper we call this 'elicitive prompting' and mention chain-of-thought prompting and the scratchpad papers as previous examples of this.
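For context, a chain-of-thought prompt elicits free-form reasoning rather than explicit sub-questions; the style (roughly, after the chain-of-thought paper) looks like:

```
Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.
   Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is
   6 tennis balls. 5 + 6 = 11. The answer is 11.
```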


I'm super excited about elicitive prompts (self-ask is in that category too)! I think they're going to enable us to get much more out of these models.


And yes, just as we integrated Google Search, we can integrate lots of other systems. I'm really excited to see how this research direction develops!

8

13ass13ass t1_ir2qa5y wrote

As a hobbyist, I love it! This looks simple enough that I could implement it myself.

Except maybe I’d try swapping out the Google API for a database connection and have it perform SQL queries as intermediate steps.

Could be fun! E.g., put my family tree in the database and ask it “was my great uncle alive at the same time as my third cousin twice removed?”, etc.
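A rough sketch of what that could look like, with a hypothetical `people` table and a hard-wired question-to-SQL step (a real version might have the LM write the SQL itself):

```python
# Rough sketch: answer the parsed sub-questions with SQL instead of a search
# engine. The schema and the question -> SQL mapping are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, born INTEGER, died INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("Great Uncle Joe", 1921, 1999), ("Third Cousin Ada", 1995, None)],
)

def intermediate_answer(sub_question: str) -> str:
    # Stand-in for the 'Intermediate answer:' step, hard-wired to one
    # question shape: "When did <name> live?"
    name = sub_question.removeprefix("When did ").removesuffix(" live?")
    born, died = conn.execute(
        "SELECT born, died FROM people WHERE name = ?", (name,)
    ).fetchone()
    return f"{name} lived from {born} to {died or 'the present'}."

print(intermediate_answer("When did Great Uncle Joe live?"))
# -> Great Uncle Joe lived from 1921 to 1999.
```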

5

TheReplier t1_ir3qzta wrote

Sorry if I don't completely follow: do you use Self-ask to finetune the model, or do you use it at inference time, before the model surfaces the result to a prompt? And how does the model come up with the Self-ask questions specifically?

3

616e696c t1_ir3ybgp wrote

As a lurker in the ML field, I was like: damn, that's awesome.

I can imagine it being used in various domains; after all, more questions are better than incorrect or partial answers.

Good Job :D

2

81095 t1_ir433cb wrote

What if Google search has it wrong?

Query: https://www.google.com/search?q=Which+year+was+%22The+Lady+Is+a+Tramp%22+by+Frank+Sinatra+released+for+the+first+time%3F

First result from Wikipedia:

>1937 [WRONG] by Chappell & Co. The song appears in the film version of Babes in Arms (1939 [WRONG]) as an instrumental version only. ... Tony Bennett and Lady Gaga duet. "The Lady Is a Tramp" Released September 20, 2011 [WRONG]

Third result from Cafe Songbook:

>"The Lady Is a Tramp" was, however, added to the movie version of a Rodgers and Hart show for which it wasn't written. The song was interpolated into the 1957 [APPROXIMATELY RIGHT] ...

Discogs link for confirmation: https://www.discogs.com/de/master/544337-Frank-Sinatra-The-Lady-Is-A-Tramp

2

ofirpress OP t1_ir4m7k0 wrote

LaMDA doesn't do multi-hop questions, only single-hop. They have two different LMs that talk to each other, whereas we have just one. They finetune their model on specially made data; we just use a prompt.


Our approach is inspired by LaMDA and other amazing previous papers, but it is much simpler and easier to implement.

2

jbx028 t1_ir5ck4j wrote

Hi,

This is interesting, but when I tried the same questions you did, GPT-3 gave me the correct response without asking any further questions.

Here is an example:

Question: What is the capital of the country where the Taj Mahal is at?
Answer: The Taj Mahal is in India, and the capital of India is New Delhi.

2

ofirpress OP t1_ir9alhm wrote

Yup, it answered correctly by 'talking things through'. Sometimes this happens automatically; a prompt like Self-ask makes it happen with much, much higher probability.

If you run an empirical evaluation on hundreds of questions, you'll see that chain-of-thought and Self-ask get much higher performance than using no prompt, or a prompt that asks for the answer immediately.
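The comparison itself can be as simple as this sketch, where `answer_with(style, q)` is a hypothetical wrapper that runs the model on a question with a given prompting strategy:

```python
# Sketch of a head-to-head prompt evaluation. `answer_with(style, q)` is a
# hypothetical stand-in that returns the model's answer to question q under
# a given prompting strategy ("direct", "chain-of-thought", or "self-ask").
def accuracy(style, dataset, answer_with):
    # dataset is a list of (question, gold_answer) pairs; count an answer
    # as correct if it contains the gold answer string.
    hits = sum(gold.lower() in answer_with(style, q).lower()
               for q, gold in dataset)
    return hits / len(dataset)
```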

2