Comparison of Genotype Imputation for SNP Array and Low-Coverage Whole-Genome Sequencing Data

dc.citation.volume12
dc.contributor.authorDeng T
dc.contributor.authorZhang P
dc.contributor.authorGarrick D
dc.contributor.authorGao H
dc.contributor.authorWang L
dc.contributor.authorZhao F
dc.contributor.editorLingzhao F
dc.date.accessioned2024-04-10T22:48:26Z
dc.date.accessioned2024-07-25T06:46:49Z
dc.date.available2022-01-03
dc.date.available2024-04-10T22:48:26Z
dc.date.available2024-07-25T06:46:49Z
dc.date.issued2022-01-03
dc.description.abstractGenotype imputation is the term used to describe the process of inferring unobserved genotypes in a sample of individuals. It is a key step prior to a genome-wide association study (GWAS) or genomic prediction. The imputation accuracy will directly influence the results from subsequent analyses. In this simulation-based study, we investigate the accuracy of genotype imputation in relation to some factors characterizing SNP chip or low-coverage whole-genome sequencing (LCWGS) data. The factors included the imputation reference population size, the proportion of target markers/SNP density, the genetic relationship (distance) between the target population and the reference population, and the imputation method. Simulations of genotypes were based on coalescence theory accounting for the demographic history of pigs. A population of simulated founders diverged to produce four separate but related populations of descendants. The genomic data of 20,000 individuals were simulated for a 10-Mb chromosome fragment. Our results showed that the proportion of target markers or SNP density was the most critical factor affecting imputation accuracy under all imputation situations. Compared with Minimac4, Beagle5.1 reproduced higher-accuracy imputed data in most cases, more notably when imputing from the LCWGS data. Compared with SNP chip data, LCWGS provided more accurate genotype imputation. Our findings provided a relatively comprehensive insight into the accuracy of genotype imputation in a realistic population of domestic animals.
dc.description.confidentialfalse
dc.identifier.citationDeng T, Zhang P, Garrick D, Gao H, Wang L, Zhao F. (2022). Comparison of Genotype Imputation for SNP Array and Low-Coverage Whole-Genome Sequencing Data. Frontiers in Genetics. 12.
dc.identifier.doi10.3389/fgene.2021.704118
dc.identifier.eissn1664-8021
dc.identifier.elements-typejournal-article
dc.identifier.number704118
dc.identifier.urihttps://mro.massey.ac.nz/handle/10179/70854
dc.languageEnglish
dc.publisherFrontiers Media S A
dc.publisher.urihttps://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2021.704118/full
dc.relation.isPartOfFrontiers in Genetics
dc.rights(c) 2022 The Author/s
dc.rightsCC BY 4.0
dc.rights.urihttps://creativecommons.org/licenses/by/4.0/
dc.subjectgenotype imputation
dc.subjectSNP density
dc.subjectreference population size
dc.subjectimputation accuracy
dc.subjectSNP chip
dc.subjectsequencing
dc.titleComparison of Genotype Imputation for SNP Array and Low-Coverage Whole-Genome Sequencing Data
dc.typeJournal article
pubs.elements-id450894
pubs.organisational-groupOther
Files
Original bundle
Now showing 1 - 5 of 5
Name: Published version; Size: 1.55 MB; Format: Adobe Portable Document Format
Name: Evidence 1; Size: 329.91 KB; Format: Adobe Portable Document Format
Name: Evidence 2; Size: 329.91 KB; Format: Adobe Portable Document Format
Name: Evidence 3; Size: 329.91 KB; Format: Tag Image File Format
Name: Evidence 4; Size: 329.91 KB; Format: Tag Image File Format
Collections