Poster

Sequential Representation Learning via Static-Dynamic Conditional Disentanglement

Mathieu Cyrille Simon · Pascal Frossard · Christophe De Vleeschouwer

Strong Double Blind: This paper was not made available on public preprint services during the review process.
Thu 3 Oct 7:30 a.m. PDT — 9:30 a.m. PDT

Abstract:

This paper explores self-supervised disentangled representation learning for sequential data, focusing on separating time-independent and time-varying factors in videos. We propose a new model that explicitly accounts for the causal relationship between the static and dynamic variables and improves model expressivity through additional Normalizing Flows. We also propose a formal definition of the factors. This formalism leads to the derivation of sufficient conditions under which the ground-truth factors can be identified, and to the introduction of a novel, theoretically grounded disentanglement constraint that can be directly and efficiently incorporated into the framework. The experiments show that the proposed approach outperforms previous state-of-the-art techniques, which generalize poorly in more realistic scenarios where the dynamics of a scene are influenced by the content.
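
To make the idea of dynamics being influenced by content concrete, here is a minimal toy sketch in Python. It is not the paper's model: the scalar latents, the linear-Gaussian transition, and every name and parameter (`sample_sequence`, `static_influence`, `decay`, `noise_std`) are assumptions chosen purely for illustration. It only shows a generative process in which each dynamic variable depends on both the previous dynamic variable and a single static variable shared across the whole sequence.

```python
import random


def sample_sequence(num_steps, static_influence=0.5, decay=0.9,
                    noise_std=0.1, seed=0):
    """Sample one static latent and a sequence of dynamic latents.

    Hypothetical toy process (an illustration, not the paper's model):
        s   ~ N(0, 1)
        z_t ~ N(decay * z_{t-1} + static_influence * s, noise_std^2)
    """
    rng = random.Random(seed)

    # Time-independent (static) factor, drawn once per sequence.
    s = rng.gauss(0.0, 1.0)

    z_prev = 0.0
    dynamics = []
    for _ in range(num_steps):
        # The transition mean depends on the static factor, so the
        # dynamics are conditioned on the content.
        mean = decay * z_prev + static_influence * s
        z_t = rng.gauss(mean, noise_std)
        dynamics.append(z_t)
        z_prev = z_t
    return s, dynamics


if __name__ == "__main__":
    s, zs = sample_sequence(num_steps=5)
    print("static:", s)
    print("dynamic:", zs)
```

Setting `static_influence=0` in this sketch recovers the simpler assumption that static and dynamic factors are independent, which is the case the abstract contrasts against.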
