This paper explores self-supervised disentangled representation learning for sequential data, focusing on separating time-independent and time-varying factors in videos. We propose a new model that explicitly accounts for the causal relationship between the static and dynamic variables and improves model expressivity through additional normalizing flows. We give a formal definition of the factors; this formalism leads to the derivation of sufficient conditions under which the ground-truth factors can be identified, and to the introduction of a novel, theoretically grounded disentanglement constraint that can be directly and efficiently incorporated into the framework. Experiments show that the proposed approach outperforms previous state-of-the-art techniques, which generalize poorly in more realistic scenarios where the dynamics of a scene are influenced by its content.
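The abstract gives no implementation details, so the following PyTorch sketch only approximates the kind of architecture described: a time-invariant latent s, per-frame dynamic latents z_t whose prior is conditioned on s (so that content can influence dynamics), and a planar normalizing flow adding expressivity to the dynamic posterior. All class and parameter names (`StaticDynamicVAE`, `PlanarFlow`, `s_dim`, `z_dim`, ...) are hypothetical illustrations, not the authors' code.

```python
import torch
import torch.nn as nn


class PlanarFlow(nn.Module):
    """One planar normalizing-flow layer: f(z) = z + u * tanh(w . z + b).
    Simplified sketch: the invertibility constraint on u is omitted."""
    def __init__(self, dim):
        super().__init__()
        self.w = nn.Parameter(torch.randn(dim) * 0.01)
        self.u = nn.Parameter(torch.randn(dim) * 0.01)
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, z):                        # z: (N, dim)
        lin = z @ self.w + self.b                # (N,)
        f = z + self.u * torch.tanh(lin).unsqueeze(-1)
        # log|det J| = log|1 + u . psi|, psi = (1 - tanh^2(lin)) * w
        psi = (1 - torch.tanh(lin) ** 2).unsqueeze(-1) * self.w
        log_det = torch.log((1 + psi @ self.u).abs() + 1e-8)
        return f, log_det


class StaticDynamicVAE(nn.Module):
    """Sequential VAE with a static latent s and dynamic latents z_t whose
    prior is conditioned on s, modeling content influencing the dynamics."""
    def __init__(self, x_dim, s_dim, z_dim, h_dim=128):
        super().__init__()
        self.frame_enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.static_head = nn.Linear(h_dim, 2 * s_dim)   # q(s | x_{1:T})
        self.dyn_rnn = nn.GRU(h_dim, h_dim, batch_first=True)
        self.dyn_head = nn.Linear(h_dim, 2 * z_dim)      # q(z_t | x_{<=t})
        self.prior_rnn = nn.GRUCell(z_dim + s_dim, h_dim)
        self.prior_head = nn.Linear(h_dim, 2 * z_dim)    # p(z_t | z_{<t}, s)
        self.flow = PlanarFlow(z_dim)                    # extra expressivity
        self.dec = nn.Sequential(nn.Linear(s_dim + z_dim, h_dim),
                                 nn.ReLU(), nn.Linear(h_dim, x_dim))

    @staticmethod
    def _sample(stats):
        # reparameterized Gaussian sample from concatenated mean/log-variance
        mu, logvar = stats.chunk(2, -1)
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp(), mu, logvar

    def prior_params(self, z, s):
        """Roll the content-conditioned prior p(z_t | z_{<t}, s) over a clip."""
        B, T, zd = z.shape
        h = z.new_zeros(B, self.prior_rnn.hidden_size)
        z_prev = z.new_zeros(B, zd)
        stats = []
        for t in range(T):
            h = self.prior_rnn(torch.cat([z_prev, s], -1), h)
            stats.append(self.prior_head(h))
            z_prev = z[:, t]
        return torch.stack(stats, 1).chunk(2, -1)        # prior means, logvars

    def forward(self, x):                                # x: (B, T, x_dim)
        h = self.frame_enc(x)
        s, mu_s, logvar_s = self._sample(self.static_head(h.mean(1)))
        hz, _ = self.dyn_rnn(h)
        z0, mu_z, logvar_z = self._sample(self.dyn_head(hz))
        B, T, zd = z0.shape
        z, log_det = self.flow(z0.reshape(B * T, zd))    # flow-refined posterior
        z = z.reshape(B, T, zd)
        x_hat = self.dec(torch.cat([s.unsqueeze(1).expand(-1, T, -1), z], -1))
        return x_hat, (mu_s, logvar_s), (mu_z, logvar_z), log_det
```

A training loop built on this sketch would optimize a standard sequential ELBO: reconstruction loss plus KL terms between the posteriors and the s-conditioned prior (via `prior_params`), with the flow's log-determinant entering the posterior density, together with the disentanglement constraint the paper introduces (not specified in the abstract).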