A NOVEL GENERAL-PURPOSE RNA-SEQ PROTOCOL OPTIMIZING THE DETECTION OF TRANSCRIPTOME EXPRESSION COMPLEXITY
Abstract
Recent studies have demonstrated an unexpected complexity of transcription in eukaryotes. Indeed, the majority of the genome is transcribed, and only a small fraction of these transcripts is annotated as protein-coding genes and their splice variants. High-throughput transcriptome sequencing therefore continuously identifies novel RNAs and novel classes of RNAs, which result from antisense, overlapping and non-coding RNA expression, demonstrating that the transcriptome captures a level of complexity that the genome sequence alone may not (1).

Among next-generation sequencing (NGS) platforms, the latest model in the Roche 454 GS Sequencer series, the GS FLX Titanium FLX+, yields over a million reads per run, each up to 700 bases long. Reads of this length provide connectivity information among splicing sites and, in addition to enabling accurate mapping and relative quantification of mRNAs, are particularly suitable for characterizing full-length splicing variants that may be differentially expressed in physiopathological conditions (2). On the other hand, the higher throughput of the Illumina HiSeq 1000 (150 bp) and ABI SOLiD (75 bp) platforms makes them particularly suitable for transcript-level quantification and for small RNA sequencing.

Irrespective of the NGS platform used, the first step in transcriptome sequencing is the construction of a cDNA library. Several protocols have been developed for this purpose, and each is suitable for sequencing on one specific platform only.

Here we describe a new, fast and simple method (patent pending, RM2010A000293-PCT/IB2011/052369) to prepare and amplify a representative, strand-specific cDNA library from low-input total RNA (500 ng) for RNA-Seq applications; it may be implemented on all major platforms currently available (Roche 454, Illumina, ABI SOLiD).

Our method includes the following steps: a) rRNA removal from total RNA; b) reverse transcription of the rRNA-depleted RNA to cDNA with custom-designed, 5'-phosphorylated Tag-random-octamers that preserve strand information; c) purification of the single-stranded cDNAs; d) ligation and amplification of the purified cDNAs, yielding a high quantity of concatemers around 20 kb long. These DNA molecules can be sequenced equally well on Illumina and Roche 454 platforms, allowing both quantitative and qualitative assessment of transcriptome complexity.

Moreover, we developed a bioinformatic pipeline for analyzing the sequences produced with this protocol. Specifically, we developed an in-house Python script, named Tag_Find (available upon request), that recognizes the position and type of each tag found within a read sequence. The program outputs two files: one listing the types of tags found and their positions in the reads, and one FASTQ file of untagged reads, cleaned of tags. The Tag_Find efficiency
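
The tag-recognition step described above can be illustrated with a minimal Python sketch. This is not the authors' Tag_Find script, which is only available upon request. The tag sequences, the function names (`find_tags`, `strip_tags`, `read_fastq`, `tag_find`) and the use of exact string matching are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical tag sequences: the abstract does not give the real ones.
EXAMPLE_TAGS: Dict[str, str] = {"tag_1": "ACGTTGCA", "tag_2": "TGCAACGT"}

Hit = Tuple[str, int, int]  # (tag name, start, end); 0-based, end exclusive


def find_tags(seq: str, tags: Dict[str, str]) -> List[Hit]:
    """Return every exact occurrence of each tag, sorted by start position."""
    hits: List[Hit] = []
    for name, tag in tags.items():
        start = seq.find(tag)
        while start != -1:
            hits.append((name, start, start + len(tag)))
            start = seq.find(tag, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


def strip_tags(seq: str, qual: str, hits: List[Hit]) -> Tuple[str, str]:
    """Remove the tagged spans from the sequence and its quality string."""
    keep = [True] * len(seq)
    for _, start, end in hits:
        for i in range(start, end):
            keep[i] = False
    new_seq = "".join(base for base, k in zip(seq, keep) if k)
    new_qual = "".join(q for q, k in zip(qual, keep) if k)
    return new_seq, new_qual


def read_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip blank lines between or after records
        seq = next(it).strip()
        next(it)  # '+' separator line
        qual = next(it).strip()
        yield header.strip()[1:], seq, qual


def tag_find(lines: Iterable[str], tags: Dict[str, str]):
    """Return (report rows, cleaned FASTQ records) for the given reads."""
    report: List[Tuple[str, str, int, int]] = []
    cleaned: List[str] = []
    for read_id, seq, qual in read_fastq(lines):
        hits = find_tags(seq, tags)
        report.extend((read_id, name, s, e) for name, s, e in hits)
        new_seq, new_qual = strip_tags(seq, qual, hits)
        cleaned.append(f"@{read_id}\n{new_seq}\n+\n{new_qual}")
    return report, cleaned
```

In this sketch the two return values stand in for the two output files mentioned in the abstract: the report rows correspond to the tag-type and position listing, and the cleaned records to the FASTQ output. Exact matching is the simplest choice; a tool working with real reads would probably need to tolerate sequencing errors in the tag.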
Apulian author
All authors
-
C. Calabrese; M. Mangiulli; C. Manzari; A.M. Paluscio; M.F. Caratozzolo; F. Marzano; I. Kurelac; A.M. D'Erchia; D. D'Elia; F. Licciulli; S. Liuni; E. Picardi; M. Attimonelli; G. Gasparre; A.M. Porcelli; G. Pesole; E. Sbisà; A. Tullo
Volume/Journal title
Not available
Year of publication
2012
ISSN
Not available
ISBN
Not available
Number of WoS citations
No citations
Last citation update
Not available
Number of Scopus citations
Not available
Last citation update
Not available
ERC sectors
Not available
ASJC codes
Not available