Assistant Professor
Computer Science Department
Università degli studi di Milano
Room T304, III floor
Via Comelico 39
Tel: (+39) 02 503 16295/16321
E-mail: frasca [at] di [dot] unimi [dot] it (replace [at] with @ and [dot] with a period)
UNIPred
- UNIPred [1] is an unbalance-aware network integration method that constructs a more reliable and informative composite network for the automated function prediction of proteins. UNIPred assigns a weight (a real number in the interval [0, 1]) to each input network according to its 'informativeness' with respect to a given functional class, taking into account the imbalance between annotated and unannotated proteins.
- UNIPred is also available as a web tool.
UNIPred code
- The source code can be downloaded here. The archive contains two files:
- UNIpred.R: R code implementing the UNIPred integration method
- UNIpred_optimization.c: C source code implementing the core optimization procedure of UNIPred
The C source file must be compiled to produce the shared object UNIPred_optimization.so with the R command
-
Usage of UNIPred: example with yeast networks
- We consider a simple example that integrates two networks for the prediction of the FunCat [2] category "01" (Metabolism) in the yeast model organism.
First we must load the UNIPred code in memory:
- > source("UNIPred.R")
- We use yeast data from the CRAN package 'bionetdata' [3]. We start with binary protein-protein interaction data (Yeast.STRING.data) from the STRING database [4]; in the bionetdata package these data are represented as a binary named matrix whose names are the systematic names of yeast genes. Yeast.STRING.FunCat represents the FunCat annotations of the genes included in Yeast.STRING.data as a binary matrix. Annotations refer to the funcat-2.1 scheme, available from the MIPS web site.
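To make the data layout concrete, here is a minimal sketch with toy stand-ins for the two objects. The gene set, the edges and the annotation values are invented for illustration, and so is the column name "02". The real objects come from the bionetdata package and are not reproduced here. The sketch only shows how a label vector for class "01" can be aligned with the network and how unbalanced it is.

```r
# Toy stand-ins for Yeast.STRING.data and Yeast.STRING.FunCat.
# All values below are hypothetical; the real data are in bionetdata.
genes <- c("YAL001C", "YAL002W", "YAL003W", "YAL005C")

# Binary named adjacency matrix: 1 means an interaction between two genes
W <- matrix(0, nrow = 4, ncol = 4, dimnames = list(genes, genes))
W["YAL001C", "YAL002W"] <- W["YAL002W", "YAL001C"] <- 1
W["YAL003W", "YAL005C"] <- W["YAL005C", "YAL003W"] <- 1

# Binary annotation matrix: rows are genes, columns are FunCat classes
ann <- matrix(c(1, 0,
                0, 0,
                0, 1,
                0, 0), nrow = 4, byrow = TRUE,
              dimnames = list(genes, c("01", "02")))

# Label vector for class "01", in the same gene order as the network
y <- ann[rownames(W), "01"]

# Imbalance between annotated (positive) and unannotated genes
n_pos <- sum(y == 1)
n_neg <- sum(y == 0)
pos_fraction <- n_pos / length(y)
```

Indexing the annotation matrix by rownames(W) keeps labels and network rows in the same order, which matters whenever the two objects list genes differently.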
- Then we consider binary protein-protein interactions (Yeast.Biogrid.data) downloaded from the BioGRID database [5], which collects PPI data from both high-throughput studies and conventional focused studies.
- Once the UNIPred weights have been obtained for both networks, each network is extended to include the union of the proteins of the two networks, and a weighted integration is then performed using the weights computed by UNIPred.
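The extension and integration step can be sketched in base R as follows. This assumes that the integration is a weighted sum of the extended adjacency matrices, which this page does not state explicitly. The helper extend_network, the toy matrices and the weights 0.7 and 0.3 are hypothetical. In the real workflow the weights would be the ones computed by UNIPred.

```r
# Hypothetical helper: embed a named square adjacency matrix into a larger
# protein set; proteins absent from the network get all-zero rows/columns.
extend_network <- function(W, all_names) {
  E <- matrix(0, nrow = length(all_names), ncol = length(all_names),
              dimnames = list(all_names, all_names))
  E[rownames(W), colnames(W)] <- W
  E
}

# Two toy binary networks over partially overlapping proteins
n1 <- c("YAL001C", "YAL002W", "YAL003W")
W1 <- matrix(c(0, 1, 0,
               1, 0, 1,
               0, 1, 0), nrow = 3, byrow = TRUE, dimnames = list(n1, n1))
n2 <- c("YAL003W", "YAL005C")
W2 <- matrix(c(0, 1,
               1, 0), nrow = 2, byrow = TRUE, dimnames = list(n2, n2))

# Placeholder weights in [0, 1]; NOT actual UNIPred output
w1 <- 0.7
w2 <- 0.3

# Union of the proteins, then weighted sum of the extended networks
all_names <- union(rownames(W1), rownames(W2))
W_int <- w1 * extend_network(W1, all_names) +
         w2 * extend_network(W2, all_names)
```

Adding a third network under the same assumption would mean one more call to extend_network and one more weighted term in the sum.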
- This procedure is easily extended to additional networks by computing the weight associated with each network through UNIPred. Finally, the integrated network can be given as input to a graph-based algorithm (e.g. COSNet [6]) to predict whether the unlabeled nodes/proteins of the network belong to the functional class under study.
References
[1] Frasca, M., Bertoni, A. and Valentini, G. UNIPred: unbalance-aware Network Integration and Prediction of protein functions. Journal of Computational Biology, 22(12), 1057-1074, 2015. ISSN 1066-5277. doi: 10.1089/cmb.2014.0110.
[2] Ruepp, A. et al. The FunCat, a functional annotation scheme for systematic classification of proteins from whole genomes. Nucleic Acids Research, 32, 5539-5545, 2004.
[3] Re, M. and Valentini, G. Bionetdata -- CRAN R package: http://cran.r-project.org/web/packages/bionetdata
[4] Von Mering, C. et al. Comparative assessment of large-scale data sets of protein-protein interactions. Nature, 417, 399-403, 2002.
[5] Stark, C. et al. BioGRID: a general repository for interaction datasets. Nucleic Acids Research, 34, D535-D539, 2006.
[6] Frasca, M., Bertoni, A., Re, M. and Valentini, G. A neural network algorithm for semi-supervised node label learning from unbalanced data. Neural Networks, 43, 84-98, 2013.