CoTracker: It is Better to Track Together


Methods for video motion prediction either jointly estimate the instantaneous motion of all points in a given video frame using optical flow, or track the motion of individual points throughout the video, but independently of one another. The latter holds even for powerful deep learning methods that can track points through occlusions. Tracking points individually ignores the strong correlations that can exist between the points, for instance when they arise from the same physical object, potentially harming performance.

In this paper, we thus propose CoTracker, an architecture that jointly tracks multiple points throughout an entire video. The architecture draws on several ideas from the optical flow and tracking literature and combines them in a new, flexible, and powerful design. At its core is a transformer network that models the correlation of different points in time via specialised attention layers.
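
The page itself contains no code, but the idea of jointly attending over time and across tracks can be sketched in a few lines of PyTorch. The sketch below is our own minimal illustration, not CoTracker's released implementation (which is more elaborate, with MLPs, positional encodings, and iterative refinement): the class name JointTrackAttention and the feature layout (batch, tracks, time, channels) are assumptions made for this example. One attention pass runs along each track's timeline; a second runs across tracks within each frame, letting points share information.

import torch
import torch.nn as nn

class JointTrackAttention(nn.Module):
    """Hypothetical sketch of factorized time/track attention.

    Input: track features of shape (B, N, T, C) for N points over T frames.
    Time attention lets each track attend over its own timeline; track
    attention lets tracks attend to each other within each frame, which
    is what models the correlation between points.
    """

    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        self.time_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.track_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, N, T, C = x.shape
        # Attention across time: one sequence per track.
        t = self.norm1(x).reshape(B * N, T, C)
        t, _ = self.time_attn(t, t, t)
        x = x + t.reshape(B, N, T, C)
        # Attention across tracks: one sequence per frame.
        s = self.norm2(x).transpose(1, 2).reshape(B * T, N, C)
        s, _ = self.track_attn(s, s, s)
        x = x + s.reshape(B, T, N, C).transpose(1, 2)
        return x

feats = torch.randn(2, 64, 8, 256)   # 2 videos, 64 tracks, 8 frames, 256-dim
out = JointTrackAttention()(feats)
print(out.shape)  # torch.Size([2, 64, 8, 256])

Factorizing the attention this way keeps the cost at O(T^2) per track plus O(N^2) per frame, rather than O((NT)^2) for full joint attention over all point-time tokens.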

We track points sampled on a regular grid starting from the initial video frame. The colors represent the object (magenta) and the background (cyan).
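
As a concrete illustration of seeding the tracker this way, here is a minimal sketch of constructing a regular grid of query points on the first frame, assuming the common (t, x, y) query convention where t is the frame index at which tracking begins; the helper name grid_queries is ours, not part of any published API.

import torch

def grid_queries(grid_size: int, height: int, width: int) -> torch.Tensor:
    """Sample a regular grid of query points on the first frame (t = 0).

    Returns a (grid_size**2, 3) tensor of (t, x, y) queries used to
    seed the tracker.
    """
    ys = torch.linspace(0, height - 1, grid_size)
    xs = torch.linspace(0, width - 1, grid_size)
    gy, gx = torch.meshgrid(ys, xs, indexing="ij")
    t = torch.zeros(grid_size * grid_size, 1)  # all queries start at frame 0
    return torch.cat([t, gx.reshape(-1, 1), gy.reshape(-1, 1)], dim=1)

queries = grid_queries(grid_size=10, height=480, width=640)
print(queries.shape)  # torch.Size([100, 3])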
