An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data
Journal title: | Genome Research |
---|---|
Authors and corporate bodies: | Jun, Goo; Wing, Mary Kate; Abecasis, Gonçalo R.; Kang, Hyun Min |
In: | Genome Research, 25, 2015, 6, pp. 918-925 |
Format: | E-Article |
Language: | English |
Published: | Cold Spring Harbor Laboratory |
Subjects: | Genetics (clinical), Genetics |
fullrecord | blob:ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTEwMS9nci4xNzY1NTIuMTE0 |
openURL | url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fvufind.svn.sourceforge.net%3Agenerator&rft.title=An+efficient+and+scalable+analysis+framework+for+variant+extraction+and+refinement+from+population-scale+DNA+sequence+data&rft.date=2015-06-01&genre=article&issn=1549-5469&volume=25&issue=6&spage=918&epage=925&pages=918-925&jtitle=Genome+Research&atitle=An+efficient+and+scalable+analysis+framework+for+variant+extraction+and+refinement+from+population-scale+DNA+sequence+data&aulast=Kang&aufirst=Hyun+Min&rft_id=info%3Adoi%2F10.1101%2Fgr.176552.114&rft.language%5B0%5D=eng |
SOLR | |
_version_ | 1792347830387474434 |
author | Jun, Goo, Wing, Mary Kate, Abecasis, Gonçalo R., Kang, Hyun Min |
author_facet | Jun, Goo, Wing, Mary Kate, Abecasis, Gonçalo R., Kang, Hyun Min, Jun, Goo, Wing, Mary Kate, Abecasis, Gonçalo R., Kang, Hyun Min |
author_sort | jun, goo |
container_issue | 6 |
container_start_page | 918 |
container_title | Genome Research |
container_volume | 25 |
description | <jats:p>The analysis of next-generation sequencing data is computationally and statistically challenging because of the massive volume of data and imperfect data quality. We present GotCloud, a pipeline for efficiently detecting and genotyping high-quality variants from large-scale sequencing data. GotCloud automates sequence alignment, sample-level quality control, variant calling, filtering of likely artifacts using machine-learning techniques, and genotype refinement using haplotype information. The pipeline can process thousands of samples in parallel and requires less computational resources than current alternatives. Experiments with whole-genome and exome-targeted sequence data generated by the 1000 Genomes Project show that the pipeline provides effective filtering against false positive variants and high power to detect true variants. Our pipeline has already contributed to variant detection and genotyping in several large-scale sequencing projects, including the 1000 Genomes Project and the NHLBI Exome Sequencing Project. We hope it will now prove useful to many medical sequencing studies.</jats:p> |
doi_str_mv | 10.1101/gr.176552.114 |
facet_avail | Online, Free |
finc_class_facet | Biologie |
format | ElectronicArticle |
format_de105 | Article, E-Article |
format_de14 | Article, E-Article |
format_de15 | Article, E-Article |
format_de520 | Article, E-Article |
format_de540 | Article, E-Article |
format_dech1 | Article, E-Article |
format_ded117 | Article, E-Article |
format_degla1 | E-Article |
format_del152 | Buch |
format_del189 | Article, E-Article |
format_dezi4 | Article |
format_dezwi2 | Article, E-Article |
format_finc | Article, E-Article |
format_nrw | Article, E-Article |
geogr_code | not assigned |
geogr_code_person | not assigned |
id | ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTEwMS9nci4xNzY1NTIuMTE0 |
imprint | Cold Spring Harbor Laboratory, 2015 |
imprint_str_mv | Cold Spring Harbor Laboratory, 2015 |
institution | DE-15, DE-Pl11, DE-Rs1, DE-105, DE-14, DE-Ch1, DE-L229, DE-D275, DE-Bn3, DE-Brt1, DE-Zwi2, DE-D161, DE-Gla1, DE-Zi4 |
issn | 1088-9051, 1549-5469 |
issn_str_mv | 1088-9051, 1549-5469 |
language | English |
last_indexed | 2024-03-01T18:01:00.698Z |
match_str | jun2015anefficientandscalableanalysisframeworkforvariantextractionandrefinementfrompopulationscalednasequencedata |
mega_collection | Cold Spring Harbor Laboratory (CrossRef) |
physical | 918-925 |
publishDate | 2015 |
publishDateSort | 2015 |
publisher | Cold Spring Harbor Laboratory |
record_format | ai |
recordtype | ai |
series | Genome Research |
source_id | 49 |
spelling | Jun, Goo Wing, Mary Kate Abecasis, Gonçalo R. Kang, Hyun Min 1088-9051 1549-5469 Cold Spring Harbor Laboratory Genetics (clinical) Genetics http://dx.doi.org/10.1101/gr.176552.114 <jats:p>The analysis of next-generation sequencing data is computationally and statistically challenging because of the massive volume of data and imperfect data quality. We present GotCloud, a pipeline for efficiently detecting and genotyping high-quality variants from large-scale sequencing data. GotCloud automates sequence alignment, sample-level quality control, variant calling, filtering of likely artifacts using machine-learning techniques, and genotype refinement using haplotype information. The pipeline can process thousands of samples in parallel and requires less computational resources than current alternatives. Experiments with whole-genome and exome-targeted sequence data generated by the 1000 Genomes Project show that the pipeline provides effective filtering against false positive variants and high power to detect true variants. Our pipeline has already contributed to variant detection and genotyping in several large-scale sequencing projects, including the 1000 Genomes Project and the NHLBI Exome Sequencing Project. We hope it will now prove useful to many medical sequencing studies.</jats:p> An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data Genome Research |
spellingShingle | Jun, Goo, Wing, Mary Kate, Abecasis, Gonçalo R., Kang, Hyun Min, Genome Research, An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data, Genetics (clinical), Genetics |
title | An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data |
title_full | An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data |
title_fullStr | An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data |
title_full_unstemmed | An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data |
title_short | An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data |
title_sort | an efficient and scalable analysis framework for variant extraction and refinement from population-scale dna sequence data |
title_unstemmed | An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data |
topic | Genetics (clinical), Genetics |
url | http://dx.doi.org/10.1101/gr.176552.114 |