Generate Rather Than Retrieve: Large Language Models Are Strong Context Generators
This widely cited April 2023 paper presents an approach called generate-then-read (GENREAD) for solving knowledge-intensive tasks.
The main idea is to replace the document retrieval step in the traditional retrieve-then-read pipeline with a document generation step using large language models (LLMs). The authors demonstrate that this approach can achieve state-of-the-art performance on various benchmarks without relying on any external knowledge sources.
Motivation
Knowledge-intensive tasks require access to a large amount of world or domain knowledge.
Traditional retrieve-then-read pipelines suffer from limitations such as noisy and irrelevant information in retrieved documents, shallow interactions between questions and documents, and the need for computationally expensive document indexing and retrieval.
Proposed Method: GENREAD
GENREAD prompts an LLM to generate contextual documents based on a given query, then reads the generated documents to predict the final answer.
The reader can be either a large model (e.g., InstructGPT) used in a zero-shot setting or a small model (e.g., FiD) fine-tuned on the training split of the target dataset.
The authors propose a novel clustering-based prompting method to generate diverse documents covering different perspectives, leading to better recall over acceptable answers.
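The generate-then-read flow with per-cluster demonstrations can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `llm` and `embed` are hypothetical callables standing in for whatever model and encoder are available, the prompt wording is a placeholder, and the tiny k-means is only there to keep the example self-contained.

```python
import random
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]                    # (question, document) demonstration
LLM = Callable[[str], str]                # hypothetical: prompt in, completion out
Embed = Callable[[str], Sequence[float]]  # hypothetical text encoder


def kmeans_labels(vectors: List[Sequence[float]], k: int, iters: int = 10) -> List[int]:
    """Tiny k-means; the first k vectors seed the centroids (an assumption for brevity)."""
    k = min(k, len(vectors))
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, v in enumerate(vectors):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [vectors[i] for i, lab in enumerate(labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_pairs(pairs: List[Pair], embed: Embed, k: int) -> List[List[Pair]]:
    """Group question-document pairs by the embedding of their concatenation."""
    vectors = [embed(q + " " + d) for q, d in pairs]
    labels = kmeans_labels(vectors, k)
    clusters: List[List[Pair]] = [[] for _ in range(k)]
    for pair, lab in zip(pairs, labels):
        clusters[lab].append(pair)
    return [c for c in clusters if c]  # drop empty clusters


def generate_documents(query: str, clusters: List[List[Pair]], llm: LLM,
                       n_demos: int = 2, seed: int = 0) -> List[str]:
    """Generate one document per cluster, using demonstrations sampled from that cluster."""
    rng = random.Random(seed)
    docs = []
    for cluster in clusters:
        demos = rng.sample(cluster, min(n_demos, len(cluster)))
        prompt = "".join(f"Query: {q}\nDocument: {d}\n\n" for q, d in demos)
        prompt += f"Query: {query}\nDocument:"
        docs.append(llm(prompt).strip())
    return docs


def read_answer(query: str, docs: List[str], llm: LLM) -> str:
    """Zero-shot reader: answer the query conditioned on the generated documents."""
    context = "\n".join(docs)
    prompt = (
        "Refer to the passages below and answer the question.\n"
        f"{context}\nQuestion: {query}\nAnswer:"
    )
    return llm(prompt).strip()
```

Because each cluster supplies different demonstrations, the prompts differ across clusters, which is what is meant to push the generated documents toward different perspectives. A fine-tuned reader such as FiD would replace `read_answer` in the supervised setting.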
Experiments and Results
Zero-shot setting: GENREAD with InstructGPT significantly outperforms InstructGPT prompted directly (without generated documents) and achieves new state-of-the-art performance on three open-domain QA benchmarks without using any external documents.
Supervised setting: GENREAD with a FiD reader outperforms baseline methods on TriviaQA and WebQ, and achieves comparable performance on fact-checking and open-domain dialogue tasks.
Combining generated documents with retrieved documents further improves performance, demonstrating that the two sources are complementary.
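One simple way to combine the two sources before passing them to a reader is sketched below. The interleaving order, the whitespace- and case-insensitive deduplication, and the `budget` parameter are all assumptions made for illustration; the summary above does not specify how the paper merges the documents.

```python
from itertools import zip_longest
from typing import List


def merge_contexts(retrieved: List[str], generated: List[str], budget: int) -> List[str]:
    """Interleave retrieved and generated documents, drop duplicates, keep at most `budget`."""
    merged: List[str] = []
    seen = set()
    for pair in zip_longest(retrieved, generated):
        for doc in pair:
            if doc is None:  # one list ran out before the other
                continue
            key = " ".join(doc.lower().split())  # case- and whitespace-insensitive key
            if key in seen:
                continue
            seen.add(key)
            merged.append(doc)
    return merged[:budget]
```

Interleaving keeps both sources represented even when the budget is small, so the reader sees retrieved and generated evidence side by side.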
Analysis and Observations
Generated documents contain the correct answer more often than the top retrieved documents, leading to improved QA performance.
The clustering-based prompting method effectively increases the knowledge coverage of generated documents.
Readability analysis shows that, when both contain the correct answer, generated documents are easier to read and understand than retrieved documents.
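The first observation, that a set of documents "contains the correct answer", is commonly measured as answer recall over the top-k documents. The sketch below is one way to compute it under stated assumptions: the normalization (lowercasing, stripping punctuation and English articles) and the whole-token string match are choices made here for illustration, not details confirmed by this summary.

```python
import re
import string
from typing import List


def normalize(text: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def has_answer(docs: List[str], answers: List[str]) -> bool:
    """True if any acceptable answer appears as whole tokens in any document."""
    norm_docs = [f" {normalize(d)} " for d in docs]
    for answer in answers:
        norm_answer = normalize(answer)
        if not norm_answer:  # skip answers that normalize to nothing
            continue
        if any(f" {norm_answer} " in d for d in norm_docs):
            return True
    return False


def recall_at_k(docs_per_question: List[List[str]],
                answers_per_question: List[List[str]], k: int) -> float:
    """Fraction of questions whose top-k documents contain an acceptable answer."""
    if not docs_per_question:
        return 0.0
    hits = sum(
        has_answer(docs[:k], answers)
        for docs, answers in zip(docs_per_question, answers_per_question)
    )
    return hits / len(docs_per_question)
```

Running the same function over generated and retrieved document sets for the same questions gives a like-for-like comparison of answer coverage.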
Limitations and Future Work
The approach relies on the LLM's parameters to hold all the required knowledge, which makes it hard to update that knowledge or adapt to new domains without retraining.
Generated documents might suffer from hallucination errors, resulting in incorrect predictions.
Addressing potential bias and harm that may result from using generated contextual documents is an important direction for future research.
Significance and Impact
The paper introduces a novel perspective on solving knowledge-intensive tasks by leveraging the knowledge stored in LLMs to generate contextual documents instead of retrieving them from external sources.
This approach has the potential to improve the performance and efficiency of various NLP applications, such as open-domain question answering, fact-checking, and dialogue systems.
The proposed GENREAD method demonstrates the effectiveness of using LLMs as strong context generators, opening up new possibilities for developing more advanced and scalable NLP systems.
The clustering-based prompting method introduced in the paper also provides a valuable technique for generating diverse and informative contextual documents.
However, the authors acknowledge the limitations of their approach, such as the difficulty in updating knowledge and adapting to new domains without retraining the LLMs.