The Dashing study, “Dashing: fast and accurate genomic distances with HyperLogLog,” authored by Daniel Baker, appeared in Genome Biology today. Dashing is a fast and accurate software tool for estimating similarities of genomes or sequencing datasets. It uses the HyperLogLog sketch together with cardinality estimation methods that specialize in set unions and intersections. Dashing sketches…
Kraken 2 in Genome Biology
Our paper describing the Kraken 2 software tool for metagenomic read classification appeared in Genome Biology. Kraken 2’s memory usage is several-fold smaller than Kraken 1’s. It is also about 5-fold faster and adds a translated search mode for better sensitivity when classifying viruses. Like Kraken, Kraken 2 works well with the Bracken tool for…
Kraken 2 preprint
Derrick Wood, Jennifer Lu and Ben Langmead posted a preprint describing the Kraken 2 software tool, written by Derrick. Compared with Kraken 1, the popular metagenomics read classification tool, Kraken 2 uses several-fold less memory, allowing more reference genomes to be used at once. Kraken 2 is about 5-fold faster than Kraken 1, and adds…
Joint meeting at NIH
The Langmead lab convoyed down to Bethesda for a daylong joint meeting with the labs of Mike Schatz and Adam Phillippy. Adam hosted us in the beautiful “stone house” on the NIH campus. Some photos are below. It was an excellent day of scientific presentations from trainees in all three labs. Thank you for hosting…
Jacob receives doctorate
Congratulations to the lab’s first PhD graduate, Jacob Pritt! He received his doctoral hood in a ceremony at Shriver Hall on Tuesday, May 21, 2019. Best of luck to Jacob, who started a position as Junior Bio-Computational Scientist at Mimetics, a company specializing in analysis of dynamic biological networks. Photo courtesy of Nae-Chyun Chen.
Samovar wins best paper
Congratulations to PhD student Charlotte Darby for winning the RECOMB-seq 2019 Best Paper award for her paper “Samovar: Single-sample mosaic SNV calling with linked reads.” The preprint is available at: https://www.biorxiv.org/content/10.1101/560532v1, and the full paper is in press at iScience. Many thanks to the organizers for this honor! Samovar is a new software tool for…
Bioinformatics & open data review
PI Langmead co-authored a review describing how bioinformatics training, with the assistance of cloud computing and open data in particular, can enhance STEM research in limited-resource countries and universities. It appeared in Nature Biotechnology today. The review was primarily the work of Serghei Mangul and Lana Martin at UCLA. A trend highlighted here is the…
Samovar preprint
Congratulations to student Charlotte Darby for posting the Samovar preprint! Samovar is a new software tool for detecting mosaic single-nucleotide variants (SNVs) in linked-read whole-genome shotgun sequencing data. It calls mosaic SNVs within a single sample with accuracy comparable to what previously required trios or matched tumor/normal pairs, and it outperforms single-sample mosaic variant callers. The…
Pan-genome work in RECOMB
A pan-genomics study, “Efficient Construction of a Complete Index for Pan-Genomics Read Alignment,” was accepted at the RECOMB 2019 conference in Washington DC. We worked on it with many collaborators at University of Florida (Boucher Lab), CeBiB and Diego Portales University (Travis Gagie), and University of Eastern Piedmont (Giovanni Manzini). Congratulations to Taher Mun, who…
Dashing preprint
Student Daniel Baker posted a preprint describing Dashing (https://github.com/dnbaker/dashing), a fast and accurate software tool for estimating similarities of genomes or sequencing datasets. It uses the HyperLogLog sketch together with cardinality estimation methods that specialize in set unions and intersections. Dashing sketches genomes more rapidly than previous MinHash-based methods such as Mash or BinDash while…
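For readers new to the idea, here is a rough illustration of how a HyperLogLog sketch can support union and intersection estimates. It is a minimal sketch under stated assumptions, not Dashing's implementation: the class and function names (`HyperLogLog`, `kmers`, `jaccard_estimate`) are hypothetical, the hash function is an arbitrary choice, and it uses the classic HyperLogLog estimator with inclusion-exclusion rather than the specialized estimators the preprint describes.

```python
import hashlib
import math
import random


class HyperLogLog:
    """Toy HyperLogLog with 2**p registers (hypothetical, for illustration)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash: the top p bits pick a register, the rest give the rank.
        digest = hashlib.blake2b(item.encode(), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        width = 64 - self.p
        idx = h >> width
        rest = h & ((1 << width) - 1)
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = width - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def union(self, other):
        # The sketch of a union is the register-wise maximum of two sketches.
        if self.p != other.p:
            raise ValueError("sketches must use the same p")
        out = HyperLogLog(self.p)
        out.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return out

    def cardinality(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)  # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_estimate(a, b):
    """Estimate |A ∩ B| / |A ∪ B| by inclusion-exclusion on sketch cardinalities."""
    union_size = a.union(b).cardinality()
    if union_size == 0:
        return 0.0
    inter_size = max(0.0, a.cardinality() + b.cardinality() - union_size)
    return inter_size / union_size


if __name__ == "__main__":
    # Two synthetic sequences that share their first half (illustrative data only).
    rng = random.Random(0)
    shared = "".join(rng.choice("ACGT") for _ in range(5000))
    seq_a = shared + "".join(rng.choice("ACGT") for _ in range(5000))
    seq_b = shared + "".join(rng.choice("ACGT") for _ in range(5000))
    sk_a, sk_b = HyperLogLog(), HyperLogLog()
    for km in kmers(seq_a, 21):
        sk_a.add(km)
    for km in kmers(seq_b, 21):
        sk_b.add(km)
    print("estimated Jaccard:", jaccard_estimate(sk_a, sk_b))
```

Because the union sketch is just a register-wise maximum, sketches can be built once per genome and combined cheaply for any pair. The inclusion-exclusion step here is the simplest way to get an intersection size and tends to be noisy when the true overlap is small.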