OWLv2

Open-World Localization with vision-language models

Overview

TODO: Add content