Fine-Tuning or Retrieval?
Microsoft, Israel
This January 2024 paper compares Retrieval-Augmented Generation (RAG) with fine-tuning as methods of knowledge injection for Large Language Models (LLMs).
Key Points
RAG consistently outperforms fine-tuning on knowledge-intensive tasks, both for knowledge the model encountered during training and for entirely new knowledge.
Fine-tuning improves results over the base model in most cases, but it is not competitive with the RAG approach.
RAG not only adds knowledge to a model but also incorporates context relevant to the question, a feature fine-tuning lacks (see the pipeline sketch after this list).
Fine-tuning may impact other capabilities of the model due to a degree of catastrophic forgetting.
Unsupervised fine-tuned models might benefit from further alignment through supervised or reinforcement learning (RL)-based fine-tuning.
In some cases, using the fine-tuned model instead of the base model as the generator in the RAG pipeline improved results even further. However, this is not consistent and demonstrates the inherent instability of fine-tuning.
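The minimal sketch below (illustrative only, not the paper's code) shows why RAG supplies question-specific context: retrieved passages are prepended to the prompt before generation. The toy knowledge base, lexical-overlap retriever, and `generate` placeholder are all assumptions; a real system would use dense embeddings and an actual LLM, which could be either the base model or a fine-tuned variant.

```python
# Hypothetical RAG sketch (not the paper's code): retrieve the passages most
# relevant to the question and prepend them to the prompt before generation.
from collections import Counter

# Toy knowledge base; a real pipeline would chunk and index documents.
knowledge_base = [
    "The Eiffel Tower was completed in 1889 for the Exposition Universelle.",
    "Mount Everest is 8,849 metres tall according to the 2020 survey.",
    "The Great Barrier Reef is the world's largest coral reef system.",
]

def score(query: str, passage: str) -> int:
    """Crude lexical-overlap score; a real retriever would use dense embeddings."""
    q, p = Counter(query.lower().split()), Counter(passage.lower().split())
    return sum((q & p).values())

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the top-k passages most relevant to the query."""
    return sorted(knowledge_base, key=lambda p: score(query, p), reverse=True)[:k]

def build_prompt(query: str) -> str:
    """Assemble the RAG prompt: retrieved context first, then the question."""
    context = "\n".join(retrieve(query))
    return f"Answer using the context below.\n\nContext:\n{context}\n\nQuestion: {query}\nAnswer:"

def generate(prompt: str) -> str:
    """Placeholder for the generator. As the paper observes, this can be the
    base model or a fine-tuned variant; only this function changes."""
    raise NotImplementedError("plug in your base or fine-tuned LLM here")

if __name__ == "__main__":
    print(build_prompt("How tall is Mount Everest?"))
```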
Best practices for combining RAG and fine-tuning
Use RAG as the primary method for knowledge injection, as it consistently outperforms fine-tuning.
Consider using the fine-tuned model as the generator in the RAG pipeline, but be aware of the potential instability.
Explore combinations of various fine-tuning techniques, such as unsupervised, instruction-tuning, or RL-based methods, with diverse auxiliary knowledge bases to potentially yield improved results.
When teaching LLMs new knowledge through fine-tuning, repeat that knowledge in numerous ways, for example via paraphrase augmentation (see the sketch after this list). This helps the model comprehend and generalise new knowledge from limited data.
Optimise all relevant hyperparameters for the specific use case, as their choice significantly impacts the results (a minimal grid-search sketch also follows this list).
Test the generalisation of the findings to other LLMs thoroughly, as the results may vary depending on the model's capabilities and characteristics.
Evaluate the knowledge injection methods using various datasets and sources, as the choice of dataset may influence the results.
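A minimal sketch of paraphrase augmentation, under stated assumptions: the `paraphrase` function is a placeholder, and the JSONL file name and record format are illustrative. In practice the paraphrases would come from an LLM prompted to rewrite each fact while preserving its meaning.

```python
# Hypothetical paraphrase-augmentation sketch: repeat each new fact in several
# forms before fine-tuning so the model sees the same knowledge from multiple angles.
import json

facts = [
    "The Eiffel Tower was completed in 1889.",
    "Mount Everest is 8,849 metres tall.",
]

def paraphrase(fact: str, n: int = 3) -> list[str]:
    """Placeholder: in practice, prompt an LLM to rewrite the fact n times while
    preserving its meaning. The trivially varied strings below only mimic the
    shape of that output."""
    return [f"Paraphrase {i + 1}: {fact}" for i in range(n)]

def build_training_file(path: str = "augmented_corpus.jsonl") -> None:
    """Write each fact plus its paraphrases as one JSONL training record per line."""
    with open(path, "w", encoding="utf-8") as f:
        for fact in facts:
            for text in [fact, *paraphrase(fact)]:
                f.write(json.dumps({"text": text}) + "\n")

if __name__ == "__main__":
    build_training_file()
```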
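And a small grid-search sketch for the hyperparameter point, assuming a hypothetical `finetune_and_evaluate` routine that wraps your own training and held-out evaluation; only the search loop is shown, and the grid values are illustrative.

```python
# Hypothetical hyperparameter sweep: fine-tuning results are sensitive to the
# chosen settings, so try a small grid per use case and keep the best-scoring one.
from itertools import product

def finetune_and_evaluate(learning_rate: float, epochs: int) -> float:
    """Stand-in: fine-tune on the knowledge corpus with these settings and return
    accuracy on a held-out question set. The constant below only exists so the
    sketch executes end to end."""
    return 0.0

def grid_search() -> tuple[dict, float]:
    """Evaluate every combination in the grid and return the best configuration."""
    grid = {"learning_rate": [1e-5, 5e-5, 1e-4], "epochs": [1, 3, 5]}
    best_config, best_score = None, float("-inf")
    for lr, ep in product(grid["learning_rate"], grid["epochs"]):
        result = finetune_and_evaluate(lr, ep)
        if result > best_score:
            best_config, best_score = {"learning_rate": lr, "epochs": ep}, result
    return best_config, best_score

if __name__ == "__main__":
    print(grid_search())
```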
In conclusion, while fine-tuning can be useful for many use cases, RAG is a more reliable choice for knowledge injection in LLMs.
However, combining RAG with fine-tuning techniques and following best practices can potentially enhance the model's performance and ability to adapt to new knowledge.