Tag: reinforcement-learning-human-feedback

Aligning Large Multimodal Models with Factually Augmented RLHF (02 Oct 2023)

This is my reading note for Aligning Large Multimodal Models with Factually Augmented RLHF. This paper discusses how to mitigate hallucination in large multimodal models. It proposes two methods: 1) add human-labeled data to train a reward model that guides the fine-tuning of the final model; 2) give the reward model additional factual data alongside the model’s response.
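To make the two ideas concrete, here is a minimal sketch, not the paper's implementation. It assumes the human labels are pairwise preferences scored with a standard Bradley-Terry style loss, and that the factual data is plain text (for example an image caption) prepended to the reward model's input. The names `build_reward_input` and `pairwise_preference_loss` and the prompt layout are hypothetical.

```python
import math


def build_reward_input(question, response, facts):
    """Build the text a reward model would score.

    Assumption: factual data (e.g. captions) is given as a list of strings
    and is simply placed in front of the question and the model's response.
    """
    fact_block = "\n".join(f"- {fact}" for fact in facts)
    return (
        f"Facts about the image:\n{fact_block}\n"
        f"Question: {question}\n"
        f"Response: {response}"
    )


def pairwise_preference_loss(score_preferred, score_rejected):
    """Bradley-Terry style loss: -log(sigmoid(score_preferred - score_rejected)).

    Assumption: the human-labeled data is a preference between two responses.
    The loss is small when the preferred response gets the higher score.
    """
    margin = score_preferred - score_rejected
    # -log(sigmoid(m)) is algebraically equal to log(1 + exp(-m)).
    return math.log1p(math.exp(-margin))


if __name__ == "__main__":
    text = build_reward_input(
        question="What is the cat doing?",
        response="The cat is sleeping on a sofa.",
        facts=["A cat lies on a grey sofa."],
    )
    print(text)
    # Scores would come from a trained reward model; these are placeholders.
    print(pairwise_preference_loss(1.5, 0.2))
```

In this sketch the facts give the reward model something to check the response against, so a fluent but ungrounded answer can be scored lower than a grounded one.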