Deep representation learning using layer-wise VICReg losses
Issued Date
2025-12-01
Resource Type
eISSN
20452322
Scopus ID
2-s2.0-105011747236
Journal Title
Scientific Reports
Volume
15
Issue
1
Rights Holder(s)
SCOPUS
Bibliographic Citation
Scientific Reports Vol.15 No.1 (2025)
Suggested Citation
Datta J., Rabbi R., Saha P., Zereen A.N., Abdullah-Al-Wadud M., Uddin J. Deep representation learning using layer-wise VICReg losses. Scientific Reports Vol.15 No.1 (2025). doi:10.1038/s41598-025-08504-2 Retrieved from: https://repository.li.mahidol.ac.th/handle/123456789/111478
Title
Deep representation learning using layer-wise VICReg losses
Corresponding Author(s)
Other Contributor(s)
Abstract
This paper presents a layer-wise training procedure for neural networks that minimizes a Variance-Invariance-Covariance Regularization (VICReg) loss at each layer. The procedure is beneficial when annotated data are scarce but unlabeled data are plentiful. Because parameters are updated locally at each layer, it also mitigates backpropagation problems such as vanishing gradients and sensitivity to initialization. Instead of the one forward and one backward pass used in backpropagation, the procedure uses two forward passes: one on the original data and the other on an augmented version of the data. It is shown that this procedure progressively constructs more compact yet informative representation spaces at each layer. The model architecture is chosen to be pyramidal, enabling effective feature extraction. In addition, we optimize the weights of the variance, invariance, and covariance terms of the loss function so that the model can capture higher-level semantic information. After training the model, we assess its learned representations by measuring clustering quality metrics and classification performance using a small amount of labeled data. To evaluate the proposed approach, we run experiments on several datasets: MNIST, EMNIST, Fashion MNIST, and CIFAR-100. The experimental results show that the training procedure improves the classification accuracy of Deep Neural Networks (DNNs) trained on MNIST, EMNIST, Fashion MNIST, and CIFAR-100 by approximately 7%, 16%, 1%, and 7%, respectively, compared to baseline models of similar architectures.
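To make the per-layer objective in the abstract concrete, the following is a minimal pure-Python sketch of a VICReg-style loss evaluated on the outputs of each layer for two views of a batch (original and augmented). It is an illustration under stated assumptions, not the authors' implementation: the function names, the hinge target `gamma=1.0`, the stabilizer `eps=1e-4`, and the default term weights (25, 25, 1, taken from the original VICReg formulation rather than the tuned weights this paper reports) are all assumptions. The local parameter update that would follow each layer's loss is omitted.

```python
import math

def column_means(z):
    """Mean of each feature dimension over the batch (z is a list of rows)."""
    n = len(z)
    return [sum(row[j] for row in z) / n for j in range(len(z[0]))]

def variance_term(z, gamma=1.0, eps=1e-4):
    """Hinge penalty that is positive when a dimension's std falls below gamma."""
    n, d = len(z), len(z[0])
    mu = column_means(z)
    total = 0.0
    for j in range(d):
        var = sum((row[j] - mu[j]) ** 2 for row in z) / (n - 1)  # unbiased variance
        total += max(0.0, gamma - math.sqrt(var + eps))
    return total / d

def covariance_term(z):
    """Sum of squared off-diagonal covariance entries, divided by the dimension."""
    n, d = len(z), len(z[0])
    mu = column_means(z)
    total = 0.0
    for a in range(d):
        for b in range(d):
            if a != b:
                cov = sum((row[a] - mu[a]) * (row[b] - mu[b]) for row in z) / (n - 1)
                total += cov ** 2
    return total / d

def invariance_term(z1, z2):
    """Mean over the batch of the squared Euclidean distance between the two views."""
    n = len(z1)
    return sum(sum((a - b) ** 2 for a, b in zip(r1, r2)) for r1, r2 in zip(z1, z2)) / n

def vicreg_loss(z1, z2, lam=25.0, mu=25.0, nu=1.0):
    """Weighted sum of the three terms; the weights here are assumed defaults."""
    return (lam * invariance_term(z1, z2)
            + mu * (variance_term(z1) + variance_term(z2))
            + nu * (covariance_term(z1) + covariance_term(z2)))

def layerwise_losses(layers, x, x_aug):
    """Two forward passes (original and augmented), recording one loss per layer.

    `layers` is a hypothetical list of callables mapping one input row to one
    output row. In a layer-wise scheme each layer would be updated from its own
    loss; that update step is not shown here.
    """
    losses = []
    h, h_aug = x, x_aug
    for layer in layers:
        h = [layer(row) for row in h]
        h_aug = [layer(row) for row in h_aug]
        losses.append(vicreg_loss(h, h_aug))
    return losses
```

In this sketch, a collapsed representation (every row identical) is penalized only by the variance term, while two perfectly correlated dimensions are penalized by the covariance term. This matches the roles the abstract assigns to the three components.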
