October 17th, 8:00 - 11:00 am EDT
In the visual world, objects rarely occur in isolation. Psychophysical and computational studies have demonstrated that the human visual system can perceive heavily occluded objects through contextual reasoning and association. The question then becomes: can our video understanding systems perceive objects that are severely obscured? The OVIS competition will be hosted on an online platform, and presentations will be delivered on Zoom.
We use average precision (AP) at different intersection-over-union (IoU) thresholds and average recall (AR) as our evaluation metrics, following YouTube-VIS. In video instance segmentation, the IoU between a predicted and a ground-truth mask sequence is the sum of the per-frame intersection areas divided by the sum of the per-frame union areas across the video.
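The video-level IoU described above can be illustrated with a minimal sketch. This is not the official evaluation code: the function name `video_iou`, the representation of each frame mask as a set of (row, col) pixel coordinates, and the convention of returning 0.0 when both sequences are empty are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]    # (row, col) of one mask pixel
FrameMask = Set[Pixel]     # pixels covered by the instance in one frame


def video_iou(pred: List[FrameMask], gt: List[FrameMask]) -> float:
    """Sum of per-frame intersections over sum of per-frame unions.

    Hypothetical helper: an empty set means the instance is absent
    from that frame.
    """
    if len(pred) != len(gt):
        raise ValueError("both sequences must cover the same frames")
    # Accumulate areas over the whole video before dividing, rather
    # than averaging per-frame IoU values.
    inter = sum(len(p & g) for p, g in zip(pred, gt))
    union = sum(len(p | g) for p, g in zip(pred, gt))
    # Assumed convention: two entirely empty sequences score 0.0.
    return inter / union if union > 0 else 0.0


# Toy 3-frame example: the object is absent from the last frame of the
# ground truth, but the prediction still marks one pixel there.
gt = [{(0, 0), (0, 1)}, {(0, 0), (0, 1)}, set()]
pred = [{(0, 0), (0, 1)}, {(0, 1), (0, 2)}, {(1, 1)}]

print(video_iou(pred, gt))  # 3 intersecting pixels / 6 union pixels = 0.5
```

Because the areas are summed before the division, a prediction in a frame where the object is absent enlarges the union and lowers the score, and frames with large masks weigh more than frames with small ones.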