Tag: masked-video-prediction
- InternVideo: General Video Foundation Models via Generative and Discriminative Learning (06 Aug 2023)
This is my reading note for InternVideo: General Video Foundation Models via Generative and Discriminative Learning. This paper proposes training a multimodal video model using both masked video prediction and a contrastive loss. However, it uses an encoder-decoder for masked video prediction and a separate video encoder for the contrastive loss.
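
To make the two objectives in the note concrete, here is a minimal toy sketch in plain Python. It is not the paper's implementation: the function names (`masked_mse`, `info_nce`), the use of mean squared error for reconstruction, the video-text pairing for the contrastive term, and the temperature value are all assumptions chosen for illustration.

```python
import math

def masked_mse(pred, target, mask):
    """Reconstruction error averaged over masked positions only.

    mask[i] == 1 means position i was hidden from the encoder
    (assumed convention for this sketch).
    """
    errs = [(p - t) ** 2 for p, t, m in zip(pred, target, mask) if m]
    return sum(errs) / len(errs)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def info_nce(video_embs, text_embs, temperature=0.07):
    """Video-to-text InfoNCE loss: video i should match text i.

    Pairing videos with texts is an assumption of this sketch.
    """
    v = [normalize(x) for x in video_embs]
    t = [normalize(x) for x in text_embs]
    loss = 0.0
    for i, vi in enumerate(v):
        logits = [dot(vi, tj) / temperature for tj in t]
        # Numerically stable log-sum-exp over all candidate texts.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_z - logits[i]
    return loss / len(v)

# Toy values, hypothetical and only for illustration.
recon = masked_mse(pred=[0.5, 0.0, 1.0], target=[1.0, 0.0, 0.0], mask=[1, 0, 1])
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this sketch the reconstruction term only scores the hidden positions, while the contrastive term is lower when matching pairs are aligned than when they are shuffled. Since the note says the two objectives use separate encoders, each loss here would drive its own model rather than a shared one.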