Statistical analysis of proteins families: a network and random matrix approach
Issued Date
2024-10-01
ISSN
14346028
eISSN
14346036
Scopus ID
2-s2.0-85206220280
Journal Title
European Physical Journal B
Volume
97
Issue
10
Rights Holder(s)
SCOPUS
Bibliographic Citation
European Physical Journal B Vol.97 No.10 (2024)
Suggested Citation
Kumari R., Bhadola P., Deo N. Statistical analysis of proteins families: a network and random matrix approach. European Physical Journal B Vol.97 No.10 (2024). doi:10.1140/epjb/s10051-024-00781-6 Retrieved from: https://repository.li.mahidol.ac.th/handle/20.500.14594/101669
Title
Statistical analysis of proteins families: a network and random matrix approach
Author(s)
Kumari R., Bhadola P., Deo N.
Abstract
We present a novel method for analyzing the structural organization of protein families by integrating random matrix theory (RMT) and network theory with the physicochemical properties of amino acids and multiple sequence alignments. RMT distinguishes significant interactions between amino acids from background noise, pinpointing coevolving positions that are likely crucial for protein structure and function. Unlike previous methods that treat amino acids as mere characters, this property-based approach captures both short- and long-range correlations. The components of the eigenvectors whose eigenvalues lie outside the RMT bound deviate from typical RMT observations and offer critical information about the system. We quantify the information content of each eigenvector using an entropic estimate, showing that the eigenvectors associated with the smallest eigenvalues are highly localized and informative. These eigenvectors form clusters of biologically and structurally significant positions, validated by experiments. By creating networks of amino acid interactions for each property, we uncover key motifs and interactions. This method enhances our understanding of protein evolution, interactions, and potential targets for modulating enzymatic activity. We study two protein families, Cadherin-4 and beta-lactamase, which display two extreme characteristics: one is nearly random and the other is highly structured or organised.
Graphical abstract: (Figure presented.)
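The RMT step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the alignment has already been encoded as a numeric matrix of one physicochemical property per position, it uses the standard Marchenko-Pastur bounds as the "RMT bound", and it takes the Shannon entropy of the squared eigenvector components as the "entropic estimate". The function names `marchenko_pastur_bounds`, `eigenvector_entropy`, and `rmt_analysis` are hypothetical.

```python
import numpy as np


def marchenko_pastur_bounds(n_positions, n_sequences):
    """Eigenvalue range expected for a correlation matrix of pure noise.

    Assumes unit-variance columns and n_positions <= n_sequences.
    """
    q = n_positions / n_sequences
    return (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2


def eigenvector_entropy(v):
    """Shannon entropy of the squared, normalized components of v.

    Low entropy means the eigenvector is localized on a few positions.
    This specific definition is an assumption of the sketch.
    """
    p = v ** 2 / np.sum(v ** 2)
    p = p[p > 0]  # skip zero components to avoid log(0)
    return float(-np.sum(p * np.log(p)))


def rmt_analysis(X):
    """X has shape (n_sequences, n_positions); columns must not be constant."""
    n_seq, n_pos = X.shape
    C = np.corrcoef(X, rowvar=False)       # position-position correlations
    eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
    lo, hi = marchenko_pastur_bounds(n_pos, n_seq)
    outside = (eigvals < lo) | (eigvals > hi)  # candidates for real signal
    entropies = np.array(
        [eigenvector_entropy(eigvecs[:, k]) for k in range(n_pos)]
    )
    return eigvals, eigvecs, outside, entropies


# Synthetic stand-in for an encoded alignment: 500 sequences, 20 positions,
# with positions 0 and 1 made to covary strongly.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=500)

eigvals, eigvecs, outside, entropies = rmt_analysis(X)
print("eigenvalues outside the noise band:", int(outside.sum()))
print("entropy of the smallest-eigenvalue eigenvector:", entropies[0])
```

In this synthetic example the eigenvector of the smallest eigenvalue concentrates on the two coupled positions, which mirrors the localization the abstract describes. Real alignments would additionally need gap handling and sequence reweighting, which the sketch omits.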