Title | BAMQL: a query language for extracting reads from BAM files |
---|---|
Published in | BMC Bioinformatics, August 2016 |
DOI | 10.1186/s12859-016-1162-y |
Pubmed ID | |
Authors | Andre P. Masella, Christopher M. Lalansingh, Pragash Sivasundaram, Michael Fraser, Robert G. Bristow, Paul C. Boutros |
Abstract | It is extremely common to need to select a subset of reads from a BAM file based on their specific properties. Typically, a user unpacks the BAM file to a text stream using SAMtools, parses and filters the lines using AWK, then repacks them using SAMtools. This process is tedious and error-prone. In particular, when working with many columns of data, mix-ups are common and the bit field containing the flags is unintuitive. There are several libraries for reading BAM files, such as Bio-SamTools for Perl and pysam for Python. Both allow access to the BAM's read information and can filter reads, but require substantial boilerplate code; this is high overhead for mostly ad hoc filtering. We have created a query language that gathers reads using a collection of predicates and common logical connectives. Queries run faster than equivalents and can be compiled to native code for embedding in larger programs. BAMQL provides a user-friendly, powerful and performant way to extract subsets of BAM files for ad hoc analyses or integration into applications. The query language provides a collection of predicates beyond those in SAMtools, and more flexible connectives. |
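
To illustrate why the abstract calls the flag bit field unintuitive, here is a minimal sketch of the manual filtering a user would otherwise write by hand. It is not BAMQL and not the authors' code. The bit values come from the SAM format specification; the helper names `decode_flag` and `keep_read`, the flag labels, and the `min_mapq` threshold are hypothetical and chosen only for this example.

```python
# SAM FLAG bits as defined in the SAM format specification.
# The label strings are illustrative names, not BAMQL predicates.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


def keep_read(sam_line, min_mapq=30):
    """Hypothetical filter: keep mapped, non-duplicate reads with MAPQ >= min_mapq.

    Assumes a tab-separated SAM record where column 2 is FLAG and
    column 5 is MAPQ, as in the text stream produced by unpacking a BAM file.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    mapq = int(fields[4])
    is_unmapped = bool(flag & 0x4)
    is_duplicate = bool(flag & 0x400)
    return not is_unmapped and not is_duplicate and mapq >= min_mapq


if __name__ == "__main__":
    record = "r1\t99\tchr1\t100\t60\t50M\t=\t200\t150\tACGT\tIIII"
    print(decode_flag(99))
    print(keep_read(record))
```

Even this small filter depends on remembering column positions and hexadecimal masks, which is the kind of mix-up the abstract describes.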
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 2 | 20% |
Germany | 1 | 10% |
Canada | 1 | 10% |
Australia | 1 | 10% |
Unknown | 5 | 50% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Scientists | 7 | 70% |
Members of the public | 2 | 20% |
Practitioners (doctors, other healthcare professionals) | 1 | 10% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 2 | 4% |
United Kingdom | 1 | 2% |
Switzerland | 1 | 2% |
Unknown | 50 | 93% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Researcher | 13 | 24% |
Student > Master | 11 | 20% |
Student > Bachelor | 7 | 13% |
Student > Ph. D. Student | 5 | 9% |
Student > Doctoral Student | 3 | 6% |
Other | 4 | 7% |
Unknown | 11 | 20% |
Readers by discipline | Count | As % |
---|---|---|
Biochemistry, Genetics and Molecular Biology | 18 | 33% |
Agricultural and Biological Sciences | 10 | 19% |
Computer Science | 7 | 13% |
Medicine and Dentistry | 4 | 7% |
Mathematics | 1 | 2% |
Other | 2 | 4% |
Unknown | 12 | 22% |