Assessing single nucleotide variant detection and genotype calling on whole-genome sequenced individuals
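Journal title: | Bioinformatics |
---|---|
Authors and corporate bodies: | Cheng, Anthony Youzhi; Teo, Yik-Ying; Ong, Rick Twee-Hee |
In: | Bioinformatics, 30, 2014, 12, pp. 1707-1713 |
Format: | E-Article |
Language: | English |
Published: | Oxford University Press (OUP) |
Subjects: | Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability |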
author_facet |
Cheng, Anthony Youzhi; Teo, Yik-Ying; Ong, Rick Twee-Hee |
---|---|
author |
Cheng, Anthony Youzhi; Teo, Yik-Ying; Ong, Rick Twee-Hee |
author_sort |
cheng, anthony youzhi |
doi_str_mv |
10.1093/bioinformatics/btu067 |
facet_avail |
Online, Free |
finc_class_facet |
Mathematics, Computer Science, Biology, Chemistry and Pharmacy |
format |
ElectronicArticle |
fullrecord |
blob:ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTA5My9iaW9pbmZvcm1hdGljcy9idHUwNjc |
id |
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTA5My9iaW9pbmZvcm1hdGljcy9idHUwNjc |
institution |
DE-Ch1, DE-L229, DE-D275, DE-Bn3, DE-Brt1, DE-Zwi2, DE-D161, DE-Gla1, DE-Zi4, DE-15, DE-Pl11, DE-Rs1, DE-105, DE-14 |
imprint |
Oxford University Press (OUP), 2014 |
imprint_str_mv |
Oxford University Press (OUP), 2014 |
issn |
1367-4811, 1367-4803 |
issn_str_mv |
1367-4811, 1367-4803 |
language |
English |
mega_collection |
Oxford University Press (OUP) (CrossRef) |
match_str |
cheng2014assessingsinglenucleotidevariantdetectionandgenotypecallingonwholegenomesequencedindividuals |
publishDateSort |
2014 |
publisher |
Oxford University Press (OUP) |
recordtype |
ai |
record_format |
ai |
series |
Bioinformatics |
source_id |
49 |
title |
Assessing single nucleotide variant detection and genotype calling on whole-genome sequenced individuals |
topic |
Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability |
url |
http://dx.doi.org/10.1093/bioinformatics/btu067 |
publishDate |
2014 |
physical |
1707-1713 |
description |
Abstract
Motivation: Whole-genome sequencing (WGS) is now routinely used for the detection and identification of genetic variants, particularly single nucleotide polymorphisms (SNPs) in humans, and this has provided valuable new insights into human diversity, population histories and genetic association studies of traits and diseases. However, this relies on accurate detection and genotype calling of the polymorphisms present in the sequenced samples. To minimize cost, the majority of current WGS studies, including the 1000 Genomes Project (1 KGP), have adopted low-coverage sequencing of a large number of samples, and such designs have inadvertently influenced the development of variant calling methods for WGS data. Assessments of variant accuracy are usually performed on the same set of low-coverage individuals or on a smaller number of deeply sequenced individuals. It is thus unclear how these variant calling methods would fare for a dataset of ∼100 samples from a population not part of the 1 KGP that have been sequenced at various coverage depths.
Results: Using down-sampling of the sequencing reads obtained from the Singapore Sequencing Malay Project (SSMP), and a set of SNP calls from the same individuals genotyped on the Illumina Omni1-Quad array, we assessed the sensitivity of SNP detection, the accuracy of genotype calls and the variant accuracy of six commonly used variant calling methods: GATK, SAMtools, Consensus Assessment of Sequence and Variation (CASAVA), VarScan, glfTools and SOAPsnp. The results indicate that at 5× coverage depth, the multi-sample callers GATK and SAMtools yield the best accuracy, particularly if the study samples are called together with a large number of individuals such as those from the 1000 Genomes Project. If study samples are sequenced at a high coverage depth such as 30×, CASAVA has the highest variant accuracy compared with the other variant callers assessed.
Availability and implementation:
Contact: twee_hee_ong@nuhs.edu.sg
Supplementary information: Supplementary data are available at Bioinformatics online. |
container_issue |
12 |
container_start_page |
1707 |
container_title |
Bioinformatics |
container_volume |
30 |
format_de105 |
Article, E-Article |
format_de14 |
Article, E-Article |
format_de15 |
Article, E-Article |
format_de520 |
Article, E-Article |
format_de540 |
Article, E-Article |
format_dech1 |
Article, E-Article |
format_ded117 |
Article, E-Article |
format_degla1 |
E-Article |
format_del152 |
Book |
format_del189 |
Article, E-Article |
format_dezi4 |
Article |
format_dezwi2 |
Article, E-Article |
format_finc |
Article, E-Article |
format_nrw |
Article, E-Article |
_version_ |
1792341885809852418 |
geogr_code |
not assigned |
last_indexed |
2024-03-01T16:27:00.552Z |
geogr_code_person |
not assigned |
openURL |
url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fvufind.svn.sourceforge.net%3Agenerator&rft.title=Assessing+single+nucleotide+variant+detection+and+genotype+calling+on+whole-genome+sequenced+individuals&rft.date=2014-06-15&genre=article&issn=1367-4803&volume=30&issue=12&spage=1707&epage=1713&pages=1707-1713&jtitle=Bioinformatics&atitle=Assessing+single+nucleotide+variant+detection+and+genotype+calling+on+whole-genome+sequenced+individuals&aulast=Ong&aufirst=Rick+Twee-Hee&rft_id=info%3Adoi%2F10.1093%2Fbioinformatics%2Fbtu067&rft.language%5B0%5D=eng |