Multi-type clustering for the identification of lncRNA-disease relationships

Abstract

Introduction

High-throughput sequencing technology, together with new or improved computational methods, has been crucial for rapid advances in functional genomics. Among the most important results of these technologies is the discovery of thousands of non-coding RNAs (ncRNAs), whose function is pivotal for fine-tuning the expression of many genes that guide cell development, differentiation, apoptosis and proliferation [2]. As a result, over the last decade the number of papers reporting evidence of ncRNA involvement in complex human diseases, such as cancer, has grown at an exponential rate. Among the different classes of ncRNAs, the most investigated is that of microRNAs (miRNAs), small molecules (20-22 nt long) that regulate gene expression by modulating the translation of their transcripts [4]. Much less is known about the functional involvement of long non-coding RNAs (lncRNAs), RNA molecules longer than 200 nt that have recently been found to have a plethora of regulatory functions, ranging from chromatin modification to post-transcriptional regulation [8]. However, the number of lncRNAs for which a functional characterization is available is still small. Assessing the role of lncRNAs in human diseases and, especially, the molecular mechanisms underlying their involvement is not a trivial task.

Most existing approaches are based on expensive experimental evaluations or on computational methods that exploit known or verified relationships between lncRNAs and diseases [6]. However, because of the complex functional interactions that lncRNAs can establish with other regulatory RNAs (i.e., miRNAs) or proteins, considering only the evidence of a direct relationship between lncRNAs and diseases may be very limiting.
Some recent works have started to consider further related information, but they analyze single relationships independently and do not consider possible dependencies among them. This corresponds to assuming that all instances follow the same probability distribution and are independent of each other. In this setting, such an assumption is easily violated, since different lncRNAs can be involved in the development of the same disease, and different diseases can be related to each other through the involvement of common lncRNAs or other regulatory entities such as miRNAs. To overcome these limitations, we propose a computational method that is able to predict possibly unknown relationships between lncRNAs and diseases by exploiting different information about a heterogeneous set of (related) biological entities. In particular, we focus on lncRNAs, miRNAs, target genes and diseases, as well as on known relationships among these entities (see Figure 1). The proposed method is based on a clustering algorithm that is able to group objects of multiple types and to predict possibly


All authors

  • G. Pio; F. Serafino; E. Barracchia; D. D'Elia; M. Ceci

Year of publication

2016
