
Variance Covariance Regularization Enforces Pairwise Independence in Self-Supervised Representations

Self-Supervised Learning (SSL) methods such as VICReg, Barlow Twins, or W-MSE avoid collapse of their joint-embedding architectures by constraining or regularizing the covariance matrix of their projector's output. This study highlights important …
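The covariance-matrix regularization mentioned above can be illustrated with a minimal VICReg-style sketch: a hinge term keeps the per-dimension standard deviation of the projector's output above a threshold (avoiding collapse), while an off-diagonal covariance term decorrelates the embedding dimensions. The function name, `gamma`, and `eps` below are illustrative choices, not the papers' exact loss.

```python
import numpy as np

def variance_covariance_penalty(z, gamma=1.0, eps=1e-4):
    """Sketch of a VICReg-style variance + covariance regularizer.

    z : (N, D) array of projector outputs for one batch.
    Returns (variance_term, covariance_term); both are minimized
    alongside the invariance loss in methods of this family.
    """
    n, d = z.shape
    z = z - z.mean(axis=0)                 # center each dimension
    std = np.sqrt(z.var(axis=0) + eps)
    # Hinge: penalize dimensions whose std falls below gamma (collapse).
    variance_term = np.mean(np.maximum(0.0, gamma - std))
    # Empirical D x D covariance matrix of the batch.
    cov = (z.T @ z) / (n - 1)
    # Push off-diagonal entries toward zero to decorrelate dimensions.
    off_diag = cov - np.diag(np.diag(cov))
    covariance_term = (off_diag ** 2).sum() / d
    return variance_term, covariance_term
```

On a batch of already decorrelated, unit-variance features both terms are near zero, while duplicated (perfectly correlated) dimensions drive the covariance term up, which is the collapse mode these regularizers target.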

On Inductive Biases for Machine Learning in Data-Constrained Settings

Learning with limited data is one of the biggest problems of deep learning. Current popular approaches to this issue consist of training models on huge amounts of data, labelled or not, before re-training the model on a smaller dataset of interest …

GraphiT: Encoding Graph Structure in Transformers

We show that viewing graphs as sets of node features and incorporating structural and positional information into a transformer architecture can outperform representations learned with classical graph neural networks (GNNs). Our model, …

A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention

ICLR 2021