Splicing diversity in public RNA-seq data

Using Rail-RNA, we processed a large collection of publicly available human RNA sequencing data to better understand the landscape of splicing events across diverse biological conditions and projects. We aligned 21,504 human RNA-seq samples from the Sequence Read Archive (SRA) to the human genome and compared the detected exon-exon junctions with junctions in several recent gene annotations. A total of 56,865 junctions (18.6%) found in at least 1,000 samples were not annotated, and their expression was associated with tissue type. Newer samples contributed few novel well-supported junctions: 96.1% of junctions detected in at least 20 reads across samples were already present in samples from before 2013. The junction data are compiled into a resource called intropolis, available at http://intropolis.rail.bio. We also discuss an application of this resource to cancer involving a recently validated isoform of the ALK gene.
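The comparison against annotation can be illustrated with a small sketch. This is not the Rail-RNA or intropolis code. It assumes a hypothetical in-memory table mapping each junction to the number of samples in which it was detected, plus a hypothetical set of annotated junctions, and it takes the reported percentage to be a share of the junctions that pass the sample-count threshold, which is only one plausible reading. The junction coordinates are made up for illustration.

```python
from typing import Dict, Set, Tuple

# A junction is identified here by (chromosome, start, end, strand).
# This representation is an assumption made for the sketch.
Junction = Tuple[str, int, int, str]


def unannotated_summary(
    sample_counts: Dict[Junction, int],
    annotated: Set[Junction],
    min_samples: int,
) -> Tuple[int, int, float]:
    """Return (novel, total, fraction) for junctions seen in >= min_samples samples."""
    # Keep only junctions detected in at least min_samples samples.
    well_supported = [j for j, n in sample_counts.items() if n >= min_samples]
    # A junction counts as novel if it is absent from the annotation set.
    novel = [j for j in well_supported if j not in annotated]
    total = len(well_supported)
    fraction = len(novel) / total if total else 0.0
    return len(novel), total, fraction


# Toy data (hypothetical coordinates and counts).
toy_counts: Dict[Junction, int] = {
    ("chr1", 100, 200, "+"): 1500,
    ("chr1", 300, 400, "+"): 1200,
    ("chr2", 500, 600, "-"): 3,
    ("chr2", 700, 800, "-"): 1000,
}
toy_annotated: Set[Junction] = {
    ("chr1", 100, 200, "+"),
    ("chr2", 500, 600, "-"),
}

novel, total, fraction = unannotated_summary(toy_counts, toy_annotated, min_samples=1000)
print(f"{novel} of {total} well-supported junctions are unannotated ({fraction:.1%})")
```

With real data, the same function would be called with the study's threshold of 1,000 samples and a junction set built from the gene annotations being compared.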


Published: January 30, 2016

