Deep representation learning using layer-wise VICReg losses

dc.contributor.author: Datta J.
dc.contributor.author: Rabbi R.
dc.contributor.author: Saha P.
dc.contributor.author: Zereen A.N.
dc.contributor.author: Abdullah-Al-Wadud M.
dc.contributor.author: Uddin J.
dc.contributor.correspondence: Datta J.
dc.contributor.other: Mahidol University
dc.date.accessioned: 2025-08-02T18:12:12Z
dc.date.available: 2025-08-02T18:12:12Z
dc.date.issued: 2025-12-01
dc.description.abstract: This paper presents a layer-wise training procedure for neural networks that minimizes a Variance-Invariance-Covariance Regularization (VICReg) loss at each layer. The procedure is beneficial when annotated data are scarce but sufficient unlabeled data are available. Updating the parameters locally at each layer also mitigates problems such as vanishing gradients and initialization sensitivity that affect backpropagation. Instead of the one forward and one backward pass used in backpropagation, the procedure performs two forward passes: one on the original data and one on an augmented version of the data. It is shown that this procedure progressively constructs more compact yet informative representation spaces at each layer. The model architecture is chosen to be pyramidal, enabling effective feature extraction. In addition, we optimize the weights of the variance, invariance, and covariance terms of the loss function so that the model captures higher-level semantic information optimally. After training the model, we assess its learned representations by measuring clustering quality metrics and performance on classification tasks using a small amount of labeled data. To evaluate the proposed approach, we conduct experiments on several datasets: MNIST, EMNIST, Fashion MNIST, and CIFAR-100. The experimental results show that the training procedure improves the classification accuracy of Deep Neural Networks (DNNs) trained on MNIST, EMNIST, Fashion MNIST, and CIFAR-100 by approximately 7%, 16%, 1%, and 7%, respectively, compared to baseline models of similar architectures.
dc.identifier.citation: Scientific Reports Vol.15 No.1 (2025)
dc.identifier.doi: 10.1038/s41598-025-08504-2
dc.identifier.eissn: 20452322
dc.identifier.scopus: 2-s2.0-105011747236
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/123456789/111478
dc.rights.holder: SCOPUS
dc.subject: Multidisciplinary
dc.title: Deep representation learning using layer-wise VICReg losses
dc.type: Article
mu.datasource.scopus: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=105011747236&origin=inward
oaire.citation.issue: 1
oaire.citation.title: Scientific Reports
oaire.citation.volume: 15
oairecerif.author.affiliation: King Saud University
oairecerif.author.affiliation: Mahidol University
oairecerif.author.affiliation: BRAC University
oairecerif.author.affiliation: Rajshahi University of Engineering and Technology
oairecerif.author.affiliation: Woosong University
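The abstract describes a two-forward-pass, layer-wise training scheme driven by a VICReg loss. The following is a minimal sketch, assuming PyTorch, of how such a loss and local update could look; the function names (`vicreg_loss`, `train_step`), the detach-between-layers choice, and the loss weights (taken from the original VICReg paper's defaults) are illustrative assumptions, not details reported in this article.

```python
# Sketch only: VICReg-style loss plus a layer-wise update with two forward
# passes (original batch and augmented batch). Weights 25/25/1 are the
# defaults from the original VICReg paper, not values from this article.
import torch
import torch.nn.functional as F

def vicreg_loss(z_a, z_b, lambda_inv=25.0, mu_var=25.0, nu_cov=1.0, eps=1e-4):
    """z_a, z_b: (batch, dim) embeddings of the original and augmented views."""
    # Invariance term: mean-squared error between the two views' embeddings.
    inv = F.mse_loss(z_a, z_b)

    # Variance term: hinge loss keeping each dimension's std above 1.
    std_a = torch.sqrt(z_a.var(dim=0) + eps)
    std_b = torch.sqrt(z_b.var(dim=0) + eps)
    var = torch.mean(F.relu(1.0 - std_a)) + torch.mean(F.relu(1.0 - std_b))

    # Covariance term: penalize off-diagonal entries of the covariance matrix.
    def off_diag_cov(z):
        z = z - z.mean(dim=0)
        c = (z.T @ z) / (z.shape[0] - 1)
        off_diag = c - torch.diag(torch.diag(c))
        return off_diag.pow(2).sum() / z.shape[1]

    cov = off_diag_cov(z_a) + off_diag_cov(z_b)
    return lambda_inv * inv + mu_var * var + nu_cov * cov

def train_step(layers, optimizers, x, x_aug):
    """Layer-wise update: each layer is trained from its own local VICReg loss;
    outputs are detached before feeding the next layer, so no gradient flows
    backward across layers (a hypothetical realization of local updates)."""
    h_a, h_b = x, x_aug
    for layer, opt in zip(layers, optimizers):
        z_a, z_b = layer(h_a), layer(h_b)
        loss = vicreg_loss(z_a, z_b)
        opt.zero_grad()
        loss.backward()
        opt.step()
        h_a, h_b = z_a.detach(), z_b.detach()
```

Detaching the activations between layers is what makes each update purely local, which is the property the abstract credits with avoiding vanishing gradients and initialization sensitivity; a pyramidal stack would simply use decreasing output widths for successive `layers`.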
