SINTNET

SINTNET [1] is an unbalance-aware network integration method that constructs a more reliable and informative composite network for the automated function prediction of proteins. SINTNET assigns a weight (a real number in the interval [0, 1]) to each input network according to its 'informativeness' with respect to a given functional class, taking into account the imbalance between annotated and unannotated proteins.


SINTNET code

The source code can be downloaded here. The archive contains two files:
    • SINTNET.R: R code implementing the SINTNET integration method
    • SINTNET_optimization.c: C source code implementing the core optimization procedure of SINTNET
The C source file must be compiled to produce the shared object SINTNET_optimization.so with the R command


Usage of SINTNET: example with yeast networks

We consider a simple example that integrates two networks for the prediction of the FunCat [2] category "01" (Metabolism) in the yeast model organism.

First we must load the SINTNET code in memory:
> source("SINTNET.R")
We use yeast data from the CRAN package 'bionetdata' [3]. We start with binary protein-protein interaction data (Yeast.STRING.data) from the STRING database [4]; in the bionetdata package these data are represented as a binary named matrix, whose names correspond to the systematic names of yeast genes. Yeast.STRING.FunCat is a binary matrix holding the FunCat annotations of the genes included in Yeast.STRING.data. Annotations refer to the funcat-2.1 scheme, available from the MIPS web site.
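As an illustration of how a binary named annotation matrix can be used, the following minimal sketch builds a small toy matrix (a stand-in for Yeast.STRING.FunCat, with invented annotations) and extracts the label vector for one class. The object names ann, labels, n_pos and n_neg are assumptions made for this example; with bionetdata installed, the same column indexing should apply to the real annotation matrix, assuming its columns are named by FunCat identifiers.

```r
# Toy stand-in for a binary named annotation matrix
# (rows: genes, columns: FunCat classes). The annotations are invented.
genes <- c("YAL001C", "YAL002W", "YAL003W", "YAL005C", "YAL007C")
ann <- matrix(c(1, 0,
                0, 0,
                0, 1,
                1, 1,
                0, 0),
              nrow = 5, byrow = TRUE,
              dimnames = list(genes, c("01", "02")))

# Label vector for class "01": 1 = annotated, 0 = not annotated.
labels <- ann[, "01"]

# Count positives and negatives to see how unbalanced the class is.
n_pos <- sum(labels == 1)
n_neg <- sum(labels == 0)
```

In real FunCat data the positives are usually far fewer than the negatives, which is the imbalance SINTNET takes into account when weighting a network.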

Then we consider binary protein-protein interactions (Yeast.Biogrid.data) downloaded from the BioGRID database [5], which collects PPI data from both high-throughput studies and conventional focused studies.
Once the SINTNET weights have been obtained for both networks, each network is extended to the union of the proteins, and a weighted integration is then performed using the weights computed by SINTNET.
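The extension and integration step can be sketched on toy data as follows. This is a minimal illustration, not the SINTNET code: it assumes that the integration is a plain weighted sum of the extended adjacency matrices, the helper extend_network is hypothetical, and the weights w1 and w2 are placeholders rather than values computed by SINTNET.

```r
# Extend a named square adjacency matrix to a larger node set; proteins
# absent from the original network get all-zero rows and columns.
extend_network <- function(W, all_nodes) {
  E <- matrix(0, nrow = length(all_nodes), ncol = length(all_nodes),
              dimnames = list(all_nodes, all_nodes))
  E[rownames(W), colnames(W)] <- W
  E
}

# Two toy binary networks over partially overlapping protein sets.
W1 <- matrix(c(0, 1, 0,
               1, 0, 1,
               0, 1, 0), nrow = 3, byrow = TRUE,
             dimnames = list(c("A", "B", "C"), c("A", "B", "C")))
W2 <- matrix(c(0, 1, 1,
               1, 0, 0,
               1, 0, 0), nrow = 3, byrow = TRUE,
             dimnames = list(c("B", "C", "D"), c("B", "C", "D")))

# Placeholder weights in [0, 1]; in practice these would come from SINTNET.
w1 <- 0.8
w2 <- 0.5

# Union of the proteins, then weighted sum of the extended networks.
all_nodes <- sort(union(rownames(W1), rownames(W2)))
W_int <- w1 * extend_network(W1, all_nodes) +
         w2 * extend_network(W2, all_nodes)
```

An edge present in both networks (here B-C) receives the sum of the two weights, while an edge present in only one network receives that network's weight.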

It is easy to extend this procedure to additional networks by computing the weight associated with each network through SINTNET. Finally, the integrated network can be given as input to a graph-based algorithm (e.g. COSNet [6]) to predict whether the unlabeled nodes/proteins of the network belong to the functional class under study.
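A possible way to handle an arbitrary number of networks is sketched below. The function integrate_networks is hypothetical (it is not part of SINTNET.R); it again assumes a plain weighted sum, that each network is a square matrix with identical row and column names, and that the weights vector holds placeholder values standing in for the SINTNET weights.

```r
# Weighted sum of any number of named adjacency matrices over the union
# of their nodes. Assumes row and column names of each matrix coincide.
integrate_networks <- function(nets, weights) {
  stopifnot(length(nets) == length(weights))
  all_nodes <- sort(Reduce(union, lapply(nets, rownames)))
  W_int <- matrix(0, nrow = length(all_nodes), ncol = length(all_nodes),
                  dimnames = list(all_nodes, all_nodes))
  for (d in seq_along(nets)) {
    idx <- rownames(nets[[d]])
    # Add the weighted network into the block spanned by its own nodes.
    W_int[idx, idx] <- W_int[idx, idx] + weights[d] * nets[[d]]
  }
  W_int
}

# Three toy two-node networks, each containing a single edge.
nets <- list(
  matrix(c(0, 1, 1, 0), nrow = 2, dimnames = list(c("A", "B"), c("A", "B"))),
  matrix(c(0, 1, 1, 0), nrow = 2, dimnames = list(c("B", "C"), c("B", "C"))),
  matrix(c(0, 1, 1, 0), nrow = 2, dimnames = list(c("A", "C"), c("A", "C")))
)
weights <- c(0.9, 0.4, 0.1)  # placeholders for SINTNET weights

W_all <- integrate_networks(nets, weights)
```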

References

[1] Frasca, M., Bertoni, A. and Valentini, G. SINTNET: unbalance-aware network integration for the automated function prediction of proteins (submitted).
[2] Ruepp, A. et al. The FunCat, a functional annotation scheme for systematic classification of proteins from whole genomes. Nucleic Acids Research, 32, 5539-5545, 2004.
[3] Re, M. and Valentini, G. Bionetdata -- CRAN R package: http://cran.r-project.org/web/packages/bionetdata
[4] Von Mering, C. et al. Comparative assessment of large-scale data sets of protein-protein interactions. Nature, 417, 399-403, 2002.
[5] Stark, C. et al. BioGRID: a general repository for interaction datasets. Nucleic Acids Research, 34, D535-D539, 2006.
[6] Frasca, M., Bertoni, A., Re, M. and Valentini, G. A neural network algorithm for semi-supervised node label learning from unbalanced data. Neural Networks, 43, 84-98, 2013.