Title | Hybrid assembly with long and short reads improves discovery of gene family expansions |
---|---|
Published in | BMC Genomics, July 2017 |
DOI | 10.1186/s12864-017-3927-8 |
PubMed ID | |
Authors | Jason R. Miller, Peng Zhou, Joann Mudge, James Gurtowski, Hayan Lee, Thiruvarangan Ramaraj, Brian P. Walenz, Junqi Liu, Robert M. Stupar, Roxanne Denny, Li Song, Namrata Singh, Lyza G. Maron, Susan R. McCouch, W. Richard McCombie, Michael C. Schatz, Peter Tiffin, Nevin D. Young, Kevin A. T. Silverstein |
Abstract | Long-read and short-read sequencing technologies offer competing advantages for eukaryotic genome sequencing projects. Combinations of both may be appropriate for surveys of within-species genomic variation. We developed a hybrid assembly pipeline called "Alpaca" that can operate on 20X long-read coverage plus about 50X short-insert and 50X long-insert short-read coverage. To preclude collapse of tandem repeats, Alpaca relies on base-call-corrected long reads for contig formation. Compared to two other assembly protocols, Alpaca demonstrated the most reference agreement and repeat capture on the rice genome. On three accessions of the model legume Medicago truncatula, Alpaca generated the most agreement to a conspecific reference and predicted tandemly repeated genes absent from the other assemblies. Our results suggest Alpaca is a useful tool for investigating structural and copy number variation within de novo assemblies of sampled populations. |
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 6 | 17% |
Germany | 5 | 14% |
Netherlands | 4 | 11% |
Spain | 2 | 6% |
India | 1 | 3% |
United Kingdom | 1 | 3% |
Canada | 1 | 3% |
Japan | 1 | 3% |
South Africa | 1 | 3% |
Other | 1 | 3% |
Unknown | 13 | 36% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Scientists | 26 | 72% |
Members of the public | 10 | 28% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
Unknown | 155 | 100% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Researcher | 33 | 21% |
Student > Ph. D. Student | 29 | 19% |
Student > Master | 24 | 15% |
Student > Bachelor | 15 | 10% |
Student > Doctoral Student | 7 | 5% |
Other | 21 | 14% |
Unknown | 26 | 17% |
Readers by discipline | Count | As % |
---|---|---|
Agricultural and Biological Sciences | 54 | 35% |
Biochemistry, Genetics and Molecular Biology | 40 | 26% |
Computer Science | 9 | 6% |
Environmental Science | 4 | 3% |
Immunology and Microbiology | 3 | 2% |
Other | 14 | 9% |
Unknown | 31 | 20% |