PhD student Chris Wilks’s paper describing Snaptron, a search engine for mRNA splicing patterns, appeared in the journal Bioinformatics today. Snaptron queries splicing patterns in large, pre-analyzed collections of human RNA sequencing (RNA-seq) samples, lending valuable context and support to hypotheses about splicing in humans. Snaptron’s query planner combines the strengths of different indexing strategies (R-trees, B-trees and term-document inverted indices) to answer queries rapidly. Queries can be tailored by constraining which junctions and samples to consider. Snaptron can score junctions according to tissue specificity or other criteria, and can score samples according to the relative frequency of different splicing patterns.
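To make the idea of combining index types concrete, here is a toy sketch in Python. It is not Snaptron's implementation: a sorted list searched with `bisect` stands in for a coordinate index such as a B-tree, a dictionary of terms stands in for the term-document inverted index, and all junctions, sample IDs and metadata are hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

# Hypothetical junctions: (start, end, id, per-sample read counts).
junctions = sorted([
    (100, 250, "jA", {"s1": 12, "s2": 0, "s3": 4}),
    (300, 480, "jB", {"s1": 0, "s2": 7, "s3": 0}),
    (520, 900, "jC", {"s1": 3, "s2": 2, "s3": 9}),
])
# Sorted start coordinates; a stand-in for a B-tree over positions.
starts = [j[0] for j in junctions]

# Hypothetical sample metadata, indexed as term -> sample IDs (inverted index).
metadata = {"s1": "brain cortex", "s2": "liver", "s3": "brain cerebellum"}
term_index = defaultdict(set)
for sample_id, text in metadata.items():
    for term in text.split():
        term_index[term].add(sample_id)


def query(lo, hi, term, min_reads=1):
    """Return (junction id, reads) for junctions fully inside [lo, hi]
    that have at least min_reads in samples whose metadata contains term."""
    samples = term_index.get(term, set())
    # Coordinate index narrows the candidates by start position.
    first = bisect_left(starts, lo)
    last = bisect_right(starts, hi)
    hits = []
    for start, end, jid, counts in junctions[first:last]:
        if end > hi:
            continue  # junction extends past the region
        # Inverted index restricts which samples contribute reads.
        reads = sum(counts.get(s, 0) for s in samples)
        if reads >= min_reads:
            hits.append((jid, reads))
    return hits


print(query(50, 500, "brain"))
```

The region constraint and the sample constraint are each answered by the index best suited to it, and the results are then intersected, which is the general pattern the paragraph above describes.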
The easiest way to use Snaptron is via its RESTful web service interface (http://snaptron.cs.jhu.edu), which lets researchers start posing queries immediately (e.g. starting with just a gene name) with no extra software installation. Most queries take a few seconds. Importantly, Snaptron can also score samples according to alternative splicing patterns by calculating the “percent spliced in” of individual exons. Using this framework, we have identified hundreds of previously unannotated cell type-specific exons and the splicing factors that regulate these exons. The manuscript highlights several case studies relevant to human disease to illustrate the versatility of Snaptron.
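As a rough illustration of the “percent spliced in” (PSI) score, the sketch below computes PSI for one exon from junction read counts and ranks samples by it. It assumes a common convention in which inclusion support is the mean of the two junctions flanking the exon and exclusion support is the junction that skips it; Snaptron's exact definition may differ, and the function name, sample IDs and counts are hypothetical.

```python
def percent_spliced_in(inc_left, inc_right, exclusion):
    """PSI for one exon in one sample, from junction read counts.

    inc_left / inc_right: reads on the two junctions that include the exon.
    exclusion: reads on the junction that skips the exon.
    Returns None when no junction has any reads (PSI is undefined).
    """
    inclusion = (inc_left + inc_right) / 2  # assumed convention: mean of flanks
    total = inclusion + exclusion
    if total == 0:
        return None
    return inclusion / total


# Hypothetical per-sample counts: (inc_left, inc_right, exclusion).
samples = {
    "s1": (30, 50, 10),
    "s2": (2, 4, 27),
    "s3": (0, 0, 0),
}

# Score every sample, drop those with undefined PSI, rank highest first.
scored = {sid: percent_spliced_in(*counts) for sid, counts in samples.items()}
ranked = sorted(
    ((sid, psi) for sid, psi in scored.items() if psi is not None),
    key=lambda item: item[1],
    reverse=True,
)
print(ranked)
```

Returning `None` for samples with no supporting reads keeps them out of the ranking rather than reporting a misleading score of zero.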