Tag: film
- MDETR -Modulated Detection for End-to-End Multi-Modal Understanding (16 Jul 2023)
This is my reading note for MDETR -Modulated Detection for End-to-End Multi-Modal Understanding. This paper proposes a method to learn object detection model from pairs of image and tree form text. The trained model is found to be capable of localizing unseen / long tail category.