Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
In this May 2024 paper, the authors explore the impact of fine-tuning large language models (LLMs) on new factual information that the models did not encounter during pre-training.
The study focuses on the hypothesis that exposure to such new knowledge during fine-tuning may increase the likelihood of the models generating factually incorrect responses, a phenomenon known as hallucination.
Using a controlled setup with closed-book question answering, the authors vary the proportion of fine-tuning examples that introduce new knowledge and observe the models' performance.
Their findings reveal that LLMs struggle to acquire new factual knowledge through fine-tuning, and that as the examples containing new knowledge are eventually learned, the models' propensity to hallucinate increases.
These results underscore the potential risks associated with introducing new knowledge via fine-tuning and suggest that LLMs primarily acquire factual knowledge during pre-training, with fine-tuning enhancing their ability to use this knowledge effectively.
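Deciding whether a fine-tuning example actually introduces new knowledge requires estimating what the pre-trained model already knows. Below is a minimal Python sketch in the spirit of the paper's sampling-based categorisation (its four knowledge categories are discussed further down). The `generate_answer` helper is a hypothetical stand-in for your own inference stack, and exact-match grading is a simplification.

```python
# Minimal sketch: probe the pre-trained model with greedy and sampled decoding
# and classify a QA pair by how often it recovers the gold answer.
# `generate_answer(question, temperature, seed)` is a hypothetical helper;
# the paper varies few-shot exemplars rather than random seeds, which this glosses over.

def categorise_example(question, gold_answer, generate_answer, n_samples=10):
    """Assign a QA pair to HighlyKnown / MaybeKnown / WeaklyKnown / Unknown."""
    greedy_hits = sum(
        generate_answer(question, temperature=0.0, seed=s) == gold_answer
        for s in range(n_samples)
    )
    sampled_hits = sum(
        generate_answer(question, temperature=0.5, seed=s) == gold_answer
        for s in range(n_samples)
    )

    if greedy_hits == n_samples:
        return "HighlyKnown"   # greedy decoding always recovers the fact
    if greedy_hits > 0:
        return "MaybeKnown"    # greedy decoding sometimes recovers it
    if sampled_hits > 0:
        return "WeaklyKnown"   # only temperature sampling ever recovers it
    return "Unknown"           # the fact appears absent from pre-training
```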
Fine-tuning aligns LLMs with desired behaviours and adapts them to specific downstream tasks.
It allows you to leverage the general knowledge acquired by LLMs during pre-training and tailor it to your specific use case.
Fine-tuning can significantly improve the performance and utility of LLMs for practical applications. Its benefits include:
Improved performance on specific tasks compared to using the pre-trained LLM directly.
Ability to adapt the LLM to domain-specific language, terminology, and style.
Opportunity to teach the LLM to follow instructions and exhibit desired behaviours.
Potential to enhance the LLM's capability to utilise its pre-existing knowledge effectively.
Fine-tuning is worth considering in situations such as these:
When you have a specific downstream task or application that requires the LLM to follow certain instructions or exhibit specific behaviours.
When you need the LLM to adapt to domain-specific language, terminology, or style.
When you want to improve the LLM's performance on a particular task or set of tasks relevant to your use case.
The paper's findings suggest several best practices for fine-tuning:
Use high-quality, task-specific data for fine-tuning that aligns with the desired behaviour and domain.
Be cautious about introducing new factual knowledge through fine-tuning data, as it may encourage hallucinations. Consider filtering out or re-labelling examples that introduce new facts (a sketch of this appears after this list).
Employ early stopping based on a validation set to mitigate overfitting and reduce the risk of hallucinations.
Carefully select the fine-tuning examples to include a mix of HighlyKnown and MaybeKnown examples; the MaybeKnown examples in particular are essential for the LLM to learn to use its pre-existing knowledge effectively.
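As a hedged illustration of the filtering and early-stopping advice, the sketch below drops Unknown examples from a fine-tuning set and configures standard Hugging Face early stopping. It assumes the `categorise_example` helper sketched earlier and dataset rows shaped as dicts with "question" and "answer" keys; none of this is code from the paper.

```python
from transformers import EarlyStoppingCallback, Trainer, TrainingArguments

def filter_unknown(dataset, generate_answer):
    """Keep only examples the pre-trained model already (at least weakly) knows."""
    kept = []
    for ex in dataset:
        category = categorise_example(ex["question"], ex["answer"], generate_answer)
        if category != "Unknown":
            ex["knowledge_category"] = category  # retain the label for later mixing
            kept.append(ex)
    return kept

# Early stopping on a validation set: halt once the eval metric stops improving,
# which the paper links to the point where the model begins fitting Unknown
# examples and hallucinations start to rise.
args = TrainingArguments(
    output_dir="ft-out",
    evaluation_strategy="steps",
    eval_steps=500,                     # matches the default checkpoint interval,
    load_best_model_at_end=True,        # so best-checkpoint loading stays consistent
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
# trainer = Trainer(model=model, args=args, train_dataset=..., eval_dataset=...,
#                   callbacks=[EarlyStoppingCallback(early_stopping_patience=3)])
```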
Note: The paper provides evidence that fine-tuning works well when the fine-tuning dataset consists primarily of examples that are known to the pre-trained LLM (referred to as "Known" examples in the paper). The authors demonstrate that fine-tuning on a dataset with a higher proportion of "Known" examples leads to better performance on a held-out test set. Conversely, fine-tuning on a dataset with a higher proportion of examples containing new knowledge that the LLM was not exposed to during pre-training (referred to as "Unknown" examples) results in decreased performance and a higher tendency for the model to hallucinate.
The authors demonstrated this by categorising the "Known" examples into three subcategories: HighlyKnown, MaybeKnown, and WeaklyKnown.
They show that fine-tuning on a dataset consisting solely of HighlyKnown examples leads to suboptimal performance, as the model struggles to handle MaybeKnown examples during inference. On the other hand, fine-tuning on a dataset with a mix of HighlyKnown and MaybeKnown examples results in the best overall performance, as it allows the LLM to effectively use its pre-existing knowledge across all subcategories of Known examples.
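A hypothetical sketch of composing such a mix, using the knowledge_category labels attached in the earlier filtering sketch; the exact proportion is a free parameter here, not something the paper prescribes.

```python
import random

def compose_finetuning_mix(labelled_dataset, maybe_fraction=0.5, seed=0):
    """Mix HighlyKnown and MaybeKnown examples into one training set."""
    highly = [ex for ex in labelled_dataset
              if ex.get("knowledge_category") == "HighlyKnown"]
    maybe = [ex for ex in labelled_dataset
             if ex.get("knowledge_category") == "MaybeKnown"]

    rng = random.Random(seed)
    # Aim for roughly maybe_fraction of the final mix coming from MaybeKnown.
    n_maybe = int(len(highly) * maybe_fraction / (1.0 - maybe_fraction))
    mix = highly + rng.sample(maybe, min(n_maybe, len(maybe)))
    rng.shuffle(mix)
    return mix
```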
Fine-tuning enables LLMs to adapt to specific tasks and domains by leveraging the knowledge acquired during pre-training while learning to apply it in a targeted manner.
Building domain-specific chatbots and assistants
Fine-tuning allows LLMs to learn the language, terminology, and common queries specific to a particular domain, such as customer support or sales. By training on domain-specific data, the LLM can generate more relevant and accurate responses, leading to improved user experience and satisfaction.
Creating specialised content generation tools
Fine-tuning enables LLMs to learn the style, tone, and structure of content specific to a domain, such as marketing copy, news articles, or creative writing. By exposing the LLM to high-quality examples during fine-tuning, it can generate content that closely mimics the desired style and meets the specific requirements of the target domain.
Building knowledge retrieval systems
Fine-tuning can teach LLMs to identify and retrieve relevant information from a domain-specific knowledge base. By training on examples of questions and their corresponding answers, the LLM learns to understand the context and intent behind user queries and provide accurate and concise responses.
Specialising LLMs for specific tasks
Fine-tuning allows LLMs to specialise in tasks like summarisation, translation, or sentiment analysis by learning from task-specific training data.
For example, fine-tuning on a dataset of document-summary pairs teaches the LLM to identify key information and generate coherent summaries, while fine-tuning on a dataset of text-sentiment pairs enables the LLM to accurately classify the sentiment expressed in a given piece of text.
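To make this concrete, here is an illustrative way (not taken from the paper) of turning such task-specific pairs into prompt/completion records for supervised fine-tuning; the prompt templates are arbitrary assumptions.

```python
def to_summarisation_record(document, summary):
    """Format a document-summary pair as a supervised fine-tuning record."""
    return {
        "prompt": f"Summarise the following document:\n\n{document}\n\nSummary:",
        "completion": f" {summary}",
    }

def to_sentiment_record(text, sentiment):
    """Format a text-sentiment pair as a supervised fine-tuning record."""
    return {
        "prompt": f"Classify the sentiment of this text:\n\n{text}\n\nSentiment:",
        "completion": f" {sentiment}",
    }
```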
Fine-tuning LLMs for educational purposes
Fine-tuning can adapt LLMs to generate educational content, such as explanations, quizzes, or personalised learning materials. By training on a dataset of educational content and student interactions, the LLM can learn to generate content that is tailored to the learner's needs, level of understanding, and learning style, ultimately improving the learning experience and outcomes.
In summary, fine-tuning enables LLMs to acquire domain-specific knowledge, learn task-specific patterns and structures, and generate outputs that closely align with the desired behaviour and objectives. This adaptability and specialisation make fine-tuned LLMs valuable tools for a wide range of practical and commercial applications.