Including preprints. Google Scholar version.
2021
Kempa D and Langmead B. Fast and space-efficient construction of AVL grammars from the LZ77 parsing. To appear in ESA conference, 2021.
Mun T, Chen NC, Langmead B. LevioSAM: Fast lift-over of variant-aware reference alignments. Bioinformatics. May 25:btab396, 2021.
Describes the LevioSAM software for translating (“lifting”) alignments between reference genomes
Pre-built LevioSAM indexes, e.g. for human major-allele references, are available from the Bowtie websites
Ahmed O, Rossi M, Kovaka S, Schatz MC, Gagie T, Boucher C, Langmead B. Pan-genomic matching statistics for targeted nanopore sequencing. iScience. Jun 8;24(6):102696, 2021.
Describes the SPUMONI software for real-time sequence classification
Boucher C, Gagie T, I T, Köppl D, Langmead B, Manzini G, Navarro G, Pacheco A, Rossi M. PHONI: Streamed matching statistics with multi-genome references. Data Compression Conference (DCC), 2021.
Rossi M, Oliva M, Langmead B, Gagie T, Boucher C. MONI: A pangenomics index for finding MEMs. To appear in RECOMB conference, 2021.
Wilks C, Ahmed O, Baker DN, Zhang D, Collado-Torres L, Langmead B. Megadepth: efficient coverage quantification for BigWigs and BAMs. Bioinformatics. In press, 2021.
Describes the Megadepth software for quantifying genomic intervals from bigWig and BAM files
Chen NC, Solomon B, Mun T, Iyer S, Langmead B. Reference flow: reducing reference bias using multiple population genomes. Genome Biology, 22(1):8, Jan 2021.
Describes the Reference flow software and framework for avoiding reference bias by aligning to multiple references
2020
Darby CA, Gaddipati R, Schatz MC, Langmead B. Vargas: heuristic-free alignment for assessing linear and graph read aligners. Bioinformatics, 36(12):3712–3718, Jun 2020.
Describes the Vargas software for heuristic-free alignment to linear and graph genomes
Kuhnle A, Mun T, Boucher C, Gagie T, Langmead B, Manzini G. Efficient Construction of a Complete Index for Pan-Genomics Read Alignment. Journal of Computational Biology, 27(4):500–513, Apr 2020.
Describes the r-index software for indexing and matching against pan-genome collections
Imada EL, Sanchez DF, Collado-Torres L, Wilks C, Matam T, Dinalankara W, Stupnikov A, Lobo-Pereira F, Yip CW, Yasuzawa K, Kondo N, Itoh M, Suzuki H, Kasukawa T, Hon CC, de Hoon MJL, Shin HW, Carninci P, Jaffe AE, Leek JT, Favorov A, Franco GR, Langmead B, Marchionni L. Recounting the FANTOM CAGE-Associated Transcriptome. Genome Research, 30(7):1073–1081, Jul 2020.
Ling JP, Wilks C, Charles R, Leavey PJ, Ghosh D, Jiang L, Santiago CP, Pang B, Venkataraman A, Clark BS, Nellore A, Langmead B, Blackshaw S. ASCOT identifies key regulators of neuronal subtype-specific splicing. Nature Communications, 11(1):137, Jan 2020.
Describes the ASCOT resource for exploring alternative splicing
2019
Baker DN, Langmead B. Dashing: fast and accurate genomic distances with HyperLogLog. Genome Biology, 20(1):265, Dec 2019.
Describes the Dashing software tool for genomic sketching and distance estimation
Wood DE, Lu J, Langmead B. Improved metagenomic analysis with Kraken 2. Genome Biology, 20(1):257, Nov 2019.
Describes the Kraken 2 software tool for metagenomic classification
Wulfridge P, Langmead B, Feinberg AP, Hansen KD. Analyzing whole genome bisulfite sequencing data from highly divergent genotypes. Nucleic Acids Research. 47(19):e117, Nov 2019.
Madugundu AK, Na CH, Nirujogi RS, Renuse S, Kim KP, Burns KH, Wilks C, Langmead B, Ellis SE, Collado-Torres L, Halushka MK, Kim MS, Pandey A. Integrated Transcriptomic and Proteomic Analysis of Primary Human Umbilical Vein Endothelial Cells. Proteomics. 2019 Aug;19(15):e1800315.
Darby CA, Fitch JR, Brennan PJ, Kelly BJ, Bir N, Magrini V, Leonard J, Cottrell CE, Gastier-Foster JM, Wilson RK, Mardis ER, White P, Langmead B, Schatz MC. Samovar: Single-Sample Mosaic Single-Nucleotide Variant Calling with Linked Reads. iScience. 2019 May 29;18:1-10.
Describes the Samovar software tool for mosaic variant detection from linked-read data
Winner of best paper at RECOMB-seq
Boucher C, Gagie T, Kuhnle A, Langmead B, Manzini G, Mun T. Prefix-free parsing for building big BWTs. Algorithms for Molecular Biology. 2019 May 24;14:13.
Kuhnle A, Mun T, Boucher C, Gagie T, Langmead B, Manzini G. Efficient Construction of a Complete Index for Pan-Genomics Read Alignment. 2019 Apr. Research in Computational Molecular Biology (RECOMB), pp 158-173.
Mangul S, Martin LS, Langmead B, Sanchez-Galan JE, Toma I, Hormozdiari F, Pevzner P, Eskin E. How bioinformatics and open data can boost basic science in countries and universities with limited resources. Nature Biotechnology. 2019 Mar;37(3):324-326.
2018
Langmead B, Wilks C, Antonescu V, Charles R. Scaling read aligners to hundreds of threads on general-purpose processors. Bioinformatics. 2019 Feb 1;35(3):421-432.
Pritt J, Chen N, Langmead B. FORGe: prioritizing variants for graph genomes. Genome Biology. 2018 Dec 17;19(1):220.
Describes the FORGe software tool
Langmead B, Nellore A. Cloud computing for genomic data analysis and collaboration. Nature Reviews Genetics. 2018 May;19(5):325.
Wilks C, Gaddipati P, Nellore A, Langmead B. Snaptron: querying splicing patterns across tens of thousands of RNA-seq samples. Bioinformatics. 2018 Jan 1;34(1):114-116.
Describes the Snaptron web service and software (client and server).
2017
Nellore A, Collado-Torres L, Jaffe AE, Alquicira-Hernández J, Wilks C, Pritt J, Morton J, Leek JT, Langmead B. Rail-RNA: scalable analysis of RNA-seq splicing and coverage. Bioinformatics. 2017 Dec 15;33(24):4033-4040.
Describes the Rail-RNA software tool, as presented at HiTSeq 2016.
Langmead B. A tandem simulation framework for predicting mapping quality. Genome Biology. 2017 Aug 10;18(1):152.
Describes the Qtip software.
Collado-Torres L, Nellore A, Kammers K, Ellis SE, Taub MA, Hansen KD, Jaffe AE, Langmead B, Leek JT. Reproducible RNA-seq analysis using recount2. Nature Biotechnology. 2017 Apr 11;35(4):319-321.
Describes the recount resource and Bioconductor package.
Collado-Torres L, Nellore A, Frazee AC, Wilks C, Love MI, Langmead B, Irizarry RA, Leek JT, Jaffe AE. Flexible expressed region analysis for RNA-seq with derfinder. Nucleic Acids Research. 2017 Jan 25;45(2):e9.
Describes the derfinder differential expression tool.
2016
Nellore A, Jaffe AE, Fortin JP, Alquicira-Hernández J, Collado-Torres L, Wang S, Phillips RA, Karbhari N, Hansen KD, Langmead B, Leek JT. Human splicing diversity and the extent of unannotated splice junctions across human RNA-seq samples on the Sequence Read Archive. Genome Biology. 2016, 17:266.
Describes the Intropolis resource. Research highlight by Robert & Watson.
Darby MM, Leek JT, Langmead B, Yolken RH, Sabunciyan S. Widespread splicing of repetitive element loci into coding regions of gene transcripts. Human Molecular Genetics. 2016 Nov 15;25(22):4962-4982.
Pritt J, Langmead B. Boiler: lossy compression of RNA-seq alignments using coverage vectors. Nucleic Acids Research. 2016 Sep 19;44(16):e133.
Describes the Boiler RNA-seq alignment compression tool.
Nellore A, Wilks C, Hansen KD, Leek JT, Langmead B. Rail-dbGaP: analyzing dbGaP-protected data in the cloud with Amazon Elastic MapReduce. Bioinformatics. 2016 Aug 15;32(16):2551-3.
Describes the Rail-dbGaP software and protocol.
The Computational Pan-Genomics Consortium (incl Langmead B). Computational pan-genomics: Status, promises and challenges. Briefings in Bioinformatics. 2016 Oct 21.
2015
Reinert K, Langmead B, Weese D, Evers DJ. Alignment of Next-Generation Sequencing Reads. Annual Review of Genomics and Human Genetics. 2015;16:133-51.
Frazee AC, Jaffe AE, Langmead B, Leek JT. Polyester: simulating RNA-seq datasets with differential transcript expression. Bioinformatics. 2015 Sep 1;31(17):2778-84.
Describes the Polyester software tool.
Kim D, Langmead B, Salzberg S. HISAT: a fast spliced aligner with low memory requirements. Nature Methods. 2015 Apr;12(4):357-60.
Describes the HISAT software tool, based on Bowtie 2.
Frazee AC, Pertea G, Jaffe AE, Langmead B, Salzberg SL, Leek JT. Ballgown bridges the gap between transcriptome assembly and expression analysis. Nature Biotechnology. 2015 Mar;33(3):243-6.
Describes the Ballgown software tool.
Wilton R, Budavari T, Langmead B, Wheelan S, Salzberg S, Szalay A. Faster sequence alignment through GPU-accelerated restriction of the seed-and-extend search space. PeerJ. 2015;3:e808.
Describes the Arioc software tool.
2014
Frazee AC, Collado-Torres L, Jaffe AE, Langmead B, Leek JT. Measurement, Summary, and Methodological Variation in RNA-sequencing. Statistical Analysis of Next Generation Sequencing Data. Springer International Publishing, 2014. 115-128.
Song L, Florea L, Langmead B. Lighter: fast and memory-efficient error correction without counting. Genome Biology. 2014 Nov 15;15(11):509.
Describes the Lighter software tool.
Hansen KD, Sabunciyan S, Langmead B, Nagy N, Curley R, Klein G, Klein E, Salamon D, Feinberg AP. Large-scale hypomethylated blocks associated with Epstein-Barr virus-induced B-cell immortalization. Genome Research. 2014 Feb;24(2):177-84.
2013
Schatz MC, Langmead B. The DNA Data Deluge. IEEE Spectrum. July, 2013.
Slashdotted. JHU news release and magazine article.
2012
Herb BR, Wolschin F, Hansen KD, Aryee MJ, Langmead B, Irizarry R, Amdam GV, Feinberg AP. Reversible switching between epigenetic states in honeybee behavioral subcastes. Nature Neuroscience. 2012 Oct;15(10):1371-3.
Johns Hopkins Medicine news piece
Gurtowski J, Schatz MC, Langmead B. Genotyping in the cloud with crossbow. Curr Protoc Bioinformatics. 2012 Sep;Chapter 15:Unit15.3.
Hansen KD*, Langmead B*, Irizarry RA. BSmooth: from whole genome bisulfite sequencing reads to differentially methylated regions. Genome Biology, 2012;13:R83. * Equal contribution
Describes the BSmooth software tool.
Langmead B, Salzberg S. Fast gapped-read alignment with Bowtie 2. Nature Methods. 2012, 9:357-359.
Describes the Bowtie 2 software tool. Selected for author profile.
2011
Frazee A, Langmead B, Leek JT. ReCount: A multi-experiment resource of analysis-ready RNA-seq gene count datasets. BMC Bioinformatics. 2011, 12:449.
Describes the ReCount database.
Hansen KD*, Timp W*, Corrada Bravo H*, Sabunciyan S*, Langmead B*, McDonald OG, Wen B, Wu H, Liu Y, Diep D, Briem E, Zhang K, Irizarry RA, Feinberg AP. Increased methylation variation in epigenetic domains across cancer types. Nature Genetics. 2011 Jun 26;43(8):768-75. * Equal contribution
Langmead B. Aligning Short Sequencing Reads with Bowtie. Curr Protoc Bioinformatics. 2010 Dec;Chapter 11:Unit 11.7.
2010
Leek JT, Scharpf RB, Corrada Bravo H, Simcha D, Langmead B, Johnson WE, Geman D, Baggerly K, Irizarry RA. Tackling the widespread and critical impact of batch effects in high-throughput data. Nature Reviews Genetics. 2010 Sep 14.
Langmead B, Hansen KD, Leek JT. Cloud-scale RNA-sequencing differential expression analysis with Myrna. Genome Biology. 2010;11(8):R83.
Describes the Myrna software tool.
Langmead B. Cloud Computing for Data Analysis: Toward the Plateau of Productivity. Bio-IT World. 2010 August; Vol. 9, No. 4: 36.
Schatz MC, Langmead B, Salzberg SL. Cloud computing and the DNA data race. Nature Biotechnology. 2010 Jul;28(7):691-3.
2009
Langmead B. Highly Scalable Short Read Alignment with the Burrows-Wheeler Transform and Cloud Computing. Master’s thesis, University of Maryland. 2009.
Langmead B, Schatz MC, Lin J, Pop M, Salzberg SL. Searching for SNPs with cloud computing. Genome Biology. 2009;10(11):R134.
Describes the Crossbow software tool.
Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology. 2009;10(3):R25.
Describes the Bowtie software tool. Winner: Genome Biology Award for outstanding article in the journal Genome Biology in 2009. Selected for minireview: The Need for Speed