Tag: multi-task
- Unified Model for Image, Video, Audio and Language Tasks (16 Aug 2023)
This is my reading note on Unified Model for Image, Video, Audio and Language Tasks. This paper proposes a data and compute efficient method to train multi modality model. It’s based on multi-stage and multi task learning: in each stage new modality will be added and the model will be initialized from previous stage.