Random Forest Classification Based on Star Graph Topological Indices for Antioxidant Proteins

Use this link to cite
http://hdl.handle.net/2183/19525
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Spain
Metadata
Date
2012-10-29
Citation
Fernández-Blanco E, Aguiar-Pulido V, Munteanu CR, Dorado J. J Theor Biol. 2012;317:331-337
Abstract
Aging and quality of life are important research topics in fields such as the life sciences, chemistry, and pharmacology. People live longer and therefore want to spend that extra time with a better quality of life. In this regard, a small subset of naturally occurring molecules, the antioxidant proteins, may influence the aging process. However, testing every single protein to identify its properties is expensive and inefficient. For this reason, this work proposes a model that reduces the number of proteins to be tested for antioxidant biological activity. In the model, the primary structure of each protein is represented as a complex network graph, which allows the complex system to be described by topological indices. More specifically, Randić’s Star Networks and their associated indices were used. To simulate the proportion of antioxidant proteins found in nature, a dataset of 1999 proteins, 324 of which are antioxidant proteins, was created. Star Graph Topological Indices were calculated for this dataset with the S2SNet tool and then used as input to several classification techniques. Among these, Random Forest performed best, correctly classifying 94% of instances. Although the target class (antioxidant proteins) is a small minority of the dataset, the proposed model correctly classifies 81.8% of the instances of this class, with a precision of 81.3%.
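The star graph representation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual construction of a Randić star network for a sequence: a dummy centre node, one branch per amino acid type, and each residue appended to the branch of its type in sequence order; an "embedded" variant additionally links consecutive residues. The function names (`build_star_graph`, `wiener_index`, `randic_index`) are hypothetical and are not the S2SNet interface, and the two indices shown are only examples of topological indices, not necessarily the set used in the paper.

```python
from collections import deque


def build_star_graph(sequence, embedded=False):
    """Return an adjacency dict for a star graph of `sequence`.

    Node 0 is the dummy centre; node i (i >= 1) is the i-th residue.
    Assumed construction, for illustration only (not S2SNet code).
    """
    adjacency = {0: set()}
    branch_tip = {}  # residue type -> last node placed on its branch
    for i, residue in enumerate(sequence, start=1):
        adjacency[i] = set()
        # Attach to the tip of this residue's branch, or to the centre
        # if the branch is still empty.
        parent = branch_tip.get(residue, 0)
        adjacency[i].add(parent)
        adjacency[parent].add(i)
        branch_tip[residue] = i
        # Embedded variant: also connect neighbours in the sequence.
        if embedded and i > 1:
            adjacency[i].add(i - 1)
            adjacency[i - 1].add(i)
    return adjacency


def distances_from(adjacency, source):
    """Shortest-path distances from `source` by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def wiener_index(adjacency):
    """Sum of shortest-path distances over all unordered node pairs."""
    total = sum(sum(distances_from(adjacency, n).values()) for n in adjacency)
    return total // 2  # every pair was counted twice


def randic_index(adjacency):
    """Randić connectivity index: sum over edges of 1/sqrt(deg_u * deg_v)."""
    return sum(
        (len(adjacency[u]) * len(adjacency[v])) ** -0.5
        for u in adjacency
        for v in adjacency[u]
        if u < v  # visit each edge once
    )
```

For the toy sequence `ACA`, the non-embedded graph is a four-node path (second A, first A, centre, C), so `wiener_index(build_star_graph("ACA"))` returns 10. In a pipeline like the one described, a vector of such indices per protein would serve as the classifier input.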
Keywords
Multi-target QSAR
Star Graph
Topological indices
Antioxidant protein
Rights
Attribution-NonCommercial-NoDerivs 3.0 Spain
ISSN
0022-5193
1095-8541