Title | QualComp: a new lossy compressor for quality scores based on rate distortion theory |
---|---|
Published in | BMC Bioinformatics, June 2013 |
DOI | 10.1186/1471-2105-14-187 |
Pubmed ID | |
Authors | Idoia Ochoa, Himanshu Asnani, Dinesh Bharadia, Mainak Chowdhury, Tsachy Weissman, Golan Yona |
Abstract | Next Generation Sequencing technologies have revolutionized many fields in biology by reducing the time and cost required for sequencing. As a result, large amounts of sequencing data are being generated. A typical sequencing data file may occupy tens or even hundreds of gigabytes of disk space, prohibitively large for many users. This data consists of both the nucleotide sequences and per-base quality scores that indicate the level of confidence in the readout of these sequences. Quality scores account for about half of the required disk space in the commonly used FASTQ format (before compression), and therefore the compression of the quality scores can significantly reduce storage requirements and speed up analysis and transmission of sequencing data. |
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 3 | 60% |
Canada | 1 | 20% |
Norway | 1 | 20% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Members of the public | 3 | 60% |
Scientists | 2 | 40% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 3 | 7% |
Netherlands | 2 | 5% |
Brazil | 1 | 2% |
Unknown | 36 | 86% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Ph. D. Student | 14 | 33% |
Researcher | 7 | 17% |
Student > Bachelor | 4 | 10% |
Student > Doctoral Student | 3 | 7% |
Other | 3 | 7% |
Other | 7 | 17% |
Unknown | 4 | 10% |

Readers by discipline | Count | As % |
---|---|---|
Agricultural and Biological Sciences | 13 | 31% |
Computer Science | 10 | 24% |
Engineering | 9 | 21% |
Arts and Humanities | 1 | 2% |
Physics and Astronomy | 1 | 2% |
Other | 3 | 7% |
Unknown | 5 | 12% |