The orthogonality of the modality-shared and modality-specific latent spaces may be too strict an assumption to satisfy in real-world scenarios. Figure 1 presents an example of the physiological indicators of diabetic patients, in which brain-related and heart-related signals are observed as time series data. Specifically, Figure 1(a) illustrates the real data generation process. The causal directions from insulin concentration to blood pressure and heart rate demonstrate how diabetes leads to complications such as heart disease and hypertension. As shown in Figure 1(b), existing methods apply orthogonality constraints to the estimated latent variables despite the dependence among the true latent sources, which results in variable entanglement and in turn leads to suboptimal performance on downstream tasks. To address the challenge of dependent latent sources, we propose a multi-modal temporal disentanglement framework that estimates the ground-truth latent variables with identifiability guarantees.
Figure 1. Illustration of physiological indicators of diabetic patients, where brain-related and heart-related signals are the observations. (a) In the true generation process, observations are generated from dependent latent sources. (b) In the estimation process, enforcing orthogonality on the estimated sources can result in entanglement of the latent sources and meaningless noise.
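The orthogonality constraint that existing methods impose can be sketched as a penalty on the cross-covariance between the shared and specific latent blocks. The snippet below is a minimal illustration of this generic idea, not the implementation used by any particular baseline; the function name and shapes are hypothetical:

```python
import torch

def orthogonality_penalty(z_shared, z_specific):
    """Penalize statistical dependence between modality-shared and
    modality-specific latents (both of shape [batch, dim]).

    Existing methods drive this penalty toward zero, which is
    problematic when the true latent sources are dependent.
    """
    # Center each latent block over the batch.
    zs = z_shared - z_shared.mean(dim=0, keepdim=True)
    zp = z_specific - z_specific.mean(dim=0, keepdim=True)
    # Cross-covariance between the two blocks.
    cross = zs.T @ zp / zs.shape[0]
    # Squared Frobenius norm of the cross-covariance as the penalty.
    return (cross ** 2).sum()
```

When the true sources are dependent, as in Figure 1(a), minimizing such a penalty forces the estimated latents away from the true generating factors, which is exactly the entanglement shown in Figure 1(b).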
To show how to learn disentangled representations for multi-modal time series data, we introduce the data generation process shown in Figure 2.
Figure 2. Data generation process of time series data with two modalities. The grey and white nodes denote the observed and latent variables, respectively.
- Firstly, we obtain the modality-shared and modality-specific latent variables through the modality extractor. During this process, several constraints are employed to ensure that the extracted latent variables carry rich semantic information.
- Specifically, we first impose prior constraints on the modality latent variables to guarantee that the extracted variables are semantically meaningful. Secondly, we apply modality-sharing constraints to ensure that the modality-shared latent variables extracted from each modality are consistent.
- Finally, the obtained modality latent variables are used for various downstream tasks.
- An overview of our model is shown in Figure 3.
Figure 3. Illustration of the proposed MATE model. We consider two modalities for ease of understanding; the framework extends readily to more modalities. Modality-specific encoders extract the latent variables of each modality. The specific prior networks and the shared prior network estimate the prior distributions used in the KL divergence terms.
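The training objective implied by the steps above combines reconstruction, KL divergence against the learned priors, and the modality-sharing constraint. The sketch below is a hypothetical illustration of this kind of objective; all function and argument names are ours for exposition and do not reflect the repo's actual API:

```python
import torch
import torch.nn.functional as F

def mate_style_loss(x_rec, x, q_mu, q_logvar, p_mu, p_logvar,
                    z_shared_a, z_shared_b, beta=1.0, gamma=1.0):
    """Sketch of a MATE-style objective (names are illustrative):
    reconstruction + KL(posterior || learned prior) + a consistency
    term aligning the shared latents extracted from each modality.
    """
    # Reconstruction of the observed time series.
    rec = F.mse_loss(x_rec, x)
    # Closed-form KL between two diagonal Gaussians: the encoder
    # posterior q and the prior p estimated by the prior networks.
    kl = 0.5 * (p_logvar - q_logvar
                + (q_logvar.exp() + (q_mu - p_mu) ** 2) / p_logvar.exp()
                - 1.0).sum(dim=-1).mean()
    # Modality-sharing constraint: shared latents extracted from
    # modality A and modality B should agree.
    share = F.mse_loss(z_shared_a, z_shared_b)
    return rec + beta * kl + gamma * share
```

Note that nothing here forces the shared and specific latents to be orthogonal; the priors, rather than an orthogonality penalty, shape the latent space.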
- Python 3.8
- torch==2.0.1
- scikit-learn==1.2.2
Dependencies can be installed using the following command:
```shell
pip install -r requirements.txt
```
Please download the dataset from the provided links in the Dataset section.
- Motion:
- WiFi: https://github.com/ermongroup/Wifi_Activity_Recognition
- KETI: https://github.com/Shuheng-Li/Relational-Inference/tree/master/KETI_oneweek
- HumanEva: http://humaneva.is.tue.mpg.de/
- h36m: http://vision.imar.ro/human3.6m/description.php
- D1NAMO: https://github.com/PSI-TAMU/D1NAMO
- UCIHAR: https://archive.ics.uci.edu/dataset/240/human+activity+recognition+using+smartphones
- PAMAP2: https://archive.ics.uci.edu/dataset/231/pamap2+physical+activity+monitoring
- HAC and EPIC-Kitchens: https://huggingface.co/datasets/hdong51/Human-Animal-Cartoon/tree/main
```shell
python train.py -dataset=[DATASET]
```
The main results are shown in Table 1.
Table 1. Time series classification results on the Motion, D1NAMO, WiFi, and KETI datasets.