Barlow Twins

Self Supervised Learning method where objective is redundancy reduction.

Goes something along the lines of:

twin inputs, where a batch of images is passed through a few random destructive augmentations ( remove a corner, shift colors ), creates two distorted views of the same data
processed by two identical networks, usually something like a ResNet 50 encoder along with a MLP projector, to produce embedding vectors.
cross correlation matrix computed between the two embeddings
loss function consists of an invariance term, forcing the diagonal of the matrix to be 1 and ensuring network always invariant to image distortions, and also a redundancy reduction term to push the off-diagonal elements closer to 0, decorrelates different components of representation vector.