-
SINTNET
- SINTNET [1] is an unbalance-aware network integration method that constructs a more reliable and informative composite network for the automated function prediction of proteins. SINTNET assigns a weight (a real number in the [0, 1] interval) to each input network according to its 'informativeness' with respect to a given functional class, taking into account the unbalance between annotated and unannotated proteins.
-
SINTNET code
- The source code can be downloaded here. The archive contains two files:
- SINTNET.R: R code implementing the SINTNET integration method
- SINTNET_optimization.c: C source code implementing the core optimization procedure of SINTNET
- The C source file must be compiled to produce the shared object SINTNET_optimization.so with the R command
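A minimal sketch of the compilation step, assuming that the standard `R CMD SHLIB` tool is the command meant here and that SINTNET_optimization.c sits in the current working directory. The call is wrapped in R code so that it can be run from an R session; typing `R CMD SHLIB SINTNET_optimization.c` in a terminal should be equivalent.

```r
# Assumption: the C source file is in the current working directory.
src <- "SINTNET_optimization.c"

# Name of the shared object that R CMD SHLIB produces
# (".so" on Linux and macOS, ".dll" on Windows).
so <- sub("\\.c$", .Platform$dynlib.ext, src)

if (file.exists(src)) {
  # Run `R CMD SHLIB <src>` with the same R installation as this session.
  status <- system2(file.path(R.home("bin"), "R"), c("CMD", "SHLIB", src))
  stopifnot(status == 0, file.exists(so))
} else {
  message(src, " not found in ", getwd())
}
```

A C compiler must be available to R for this to work (for example Rtools on Windows).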
-
-
Usage of SINTNET: example with yeast networks
- We consider a simple example that integrates two networks for the prediction of the FunCat [2] category "01" (Metabolism) in the yeast model organism.
First, we load the SINTNET code into memory:
- > source("SINTNET.R")
- We use yeast data from the CRAN package 'bionetdata' [3]. We start with binary protein-protein interaction data (Yeast.STRING.data) from the STRING database [4]; in the bionetdata package these data are represented as a binary named matrix whose names are the systematic names of yeast genes. Yeast.STRING.FunCat represents the FunCat annotations of the genes included in Yeast.STRING.data as a binary matrix. Annotations refer to the funcat-2.1 scheme, available from the MIPS web site.
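The data layout described above can be illustrated with a small sketch. The matrices below are toy stand-ins for Yeast.STRING.data and Yeast.STRING.FunCat, not the real data: the gene names and values are chosen only for illustration, and the sketch assumes that the columns of the annotation matrix are named by FunCat identifiers such as "01". With bionetdata installed, the real objects can presumably be loaded with `library(bionetdata)` followed by `data(Yeast.STRING.data)` and `data(Yeast.STRING.FunCat)`.

```r
# Toy network: a symmetric binary matrix named by systematic gene names.
genes <- c("YAL001C", "YAL002W", "YAL003W", "YAL005C")
W <- matrix(0, 4, 4, dimnames = list(genes, genes))
W["YAL001C", "YAL002W"] <- W["YAL002W", "YAL001C"] <- 1
W["YAL003W", "YAL005C"] <- W["YAL005C", "YAL003W"] <- 1

# Toy annotations: one row per gene, one column per functional class.
# Values are filled column by column.
ann <- matrix(c(1, 0, 0, 0,
                0, 1, 1, 0),
              nrow = 4, dimnames = list(genes, c("01", "02")))

# Labels for the class under study, aligned with the rows of the network.
y <- ann[rownames(W), "01"]

# The unbalance between annotated (positive) and unannotated (negative) genes.
n_pos <- sum(y == 1)
n_neg <- sum(y == 0)
cat("positives:", n_pos, " negatives:", n_neg, "\n")
```

Indexing the annotation matrix by `rownames(W)` keeps the label vector in the same order as the network, which matters whenever a network and its annotations are passed together to a function.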
- Then we consider binary protein-protein interactions (Yeast.Biogrid.data) downloaded from the BioGRID database [5], which collects PPI data from both high-throughput studies and conventional focused studies.
- Once the SINTNET weights have been obtained for both networks, each network is extended to the union of the proteins of the two networks, and a weighted integration is then performed using the weights computed by SINTNET.
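The extension and integration step can be sketched as follows. This is an illustration under stated assumptions, not the SINTNET code itself: `extend_network` is a hypothetical helper, the two toy networks and the weights 0.8 and 0.4 are made up, and the integration is assumed to be a weighted sum of the extended matrices.

```r
# Hypothetical helper: pad a named square matrix with zero rows and columns
# so that it covers every protein in `nodes`.
extend_network <- function(W, nodes) {
  E <- matrix(0, length(nodes), length(nodes), dimnames = list(nodes, nodes))
  E[rownames(W), colnames(W)] <- W
  E
}

# Two toy networks defined over partly different proteins.
W1 <- matrix(c(0, 1, 1,
               1, 0, 0,
               1, 0, 0), 3, 3,
             dimnames = list(c("A", "B", "C"), c("A", "B", "C")))
W2 <- matrix(c(0, 1,
               1, 0), 2, 2,
             dimnames = list(c("B", "D"), c("B", "D")))

# Extend both networks to the union of their proteins.
nodes <- sort(union(rownames(W1), rownames(W2)))
E1 <- extend_network(W1, nodes)
E2 <- extend_network(W2, nodes)

# Made-up weights standing in for the values SINTNET would compute.
w <- c(0.8, 0.4)

# Assumed integration rule: weighted sum of the extended networks.
W_int <- w[1] * E1 + w[2] * E2
print(W_int)
```

Proteins missing from a network simply contribute zero-weight edges after the extension. Dividing `W_int` by `sum(w)` would keep the integrated weights in [0, 1] for binary inputs, if that is required downstream.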
- This procedure is easily extended to additional networks by computing the weight associated with each network through SINTNET.
Finally, the integrated network can be given as input to a graph-based algorithm (e.g. COSNet [6]) to predict whether the unlabeled nodes/proteins of the network belong to the functional class under study.
-
References
-
[1] Frasca, M., Bertoni, A. and Valentini, G. SINTNET: unbalance-aware network integration for the automated function prediction of proteins (submitted).
[2] Ruepp, A. et al. The FunCat, a functional annotation scheme for systematic classification of proteins from whole genomes. Nucleic Acids Research, 32, 5539-5545, 2004.
[3] Re, M. and Valentini, G. Bionetdata, CRAN R package: http://cran.r-project.org/web/packages/bionetdata
[4] Von Mering, C. et al. Comparative assessment of large-scale data sets of protein-protein interactions. Nature, 417, 399-403, 2002.
[5] Stark, C. et al. BioGRID: a general repository for interaction datasets. Nucleic Acids Research, 34, D535-D539, 2006.
[6] Frasca, M., Bertoni, A., Re, M. and Valentini, G. A neural network algorithm for semi-supervised node label learning from unbalanced data. Neural Networks, 43, 84-98, 2013.