Title: Genome-Wide Association Study Based on Multiple Imputation with Low-Depth Sequencing Data: Application to Biofuel Traits in Reed Canarygrass

Journal Article · G3
 [1];  [2];  [2];  [3];  [4];  [5];  [1]
  1. Univ. of Wisconsin, Madison, WI (United States). Dept. of Agronomy
  2. Univ. of Wisconsin, Madison, WI (United States). Inst. for Genomic Diversity
  3. Cornell Univ., Ithaca, NY (United States). Dept. of Plant Breeding and Genetics
  4. Cornell Univ., Ithaca, NY (United States). School of Integrative Plant Science. Soil and Crops Section
  5. US Dept. of Agriculture (USDA), Ithaca, NY (United States). Agricultural Research Service (ARS); US Dept. of Agriculture (USDA), Madison, WI (United States). Agricultural Research Service (ARS)

Genotyping by sequencing allows for large-scale genetic analyses in plant species with no reference genome, but poses the challenge of sound inference in the presence of uncertain genotypes. We report an imputation-based genome-wide association study (GWAS) in reed canarygrass (Phalaris arundinacea L., Phalaris caesia Nees), a cool-season grass species with potential as a biofuel crop. Our study involved two linkage populations and an association panel of 590 reed canarygrass genotypes. Plants were assayed for up to 5228 single nucleotide polymorphism markers and 35 traits. The genotypic markers were derived from low-depth sequencing with 78% missing data on average. To soundly infer marker-trait associations, multiple imputation (MI) was used: several imputes of the marker data were generated to reflect imputation uncertainty, and association tests were performed on marker effects across imputes. A total of nine significant markers were identified, three of which showed significant homology with the Brachypodium distachyon genome. Because no physical map of the reed canarygrass genome was available, imputation was conducted using classification trees. In general, MI showed good consistency with the complete-case analysis and adequate control over imputation uncertainty. A gain in significance of marker effects was achieved through MI, but only in rare cases, when missing data were <45%. In addition to providing insight into the genetic basis of important traits in reed canarygrass, this study presents one of the first applications of MI to genome-wide analyses and provides useful guidelines for conducting GWAS based on genotyping-by-sequencing data.
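
The abstract describes testing marker effects across several imputes. As an illustration only, the sketch below shows one standard way such per-impute results can be combined, namely Rubin's rules for pooling an estimate and its variance. It is an assumption that this matches the pooling used in the study; the function name `pool_rubin` and the example numbers are hypothetical and do not come from the article.

```python
from math import inf
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one marker's effect across m imputes with Rubin's rules.

    estimates: effect estimate of the marker from each imputed data set
    variances: squared standard error of that estimate in each data set
    Returns (pooled estimate, total variance, degrees of freedom).
    Hypothetical helper for illustration; not the authors' code.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputes and matching lengths")

    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-impute variance
    between = variance(estimates)    # between-impute variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between

    if between == 0:
        # No disagreement between imputes: no imputation uncertainty to penalize.
        df = inf
    else:
        # Relative increase in variance due to missing data.
        r = (1 + 1 / m) * between / within
        df = (m - 1) * (1 + 1 / r) ** 2
    return q_bar, total, df


# Made-up effect estimates for one marker from three imputes.
q, t, df = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(q, t, df)
```

The pooled estimate divided by the square root of the total variance can then be compared with a t distribution on `df` degrees of freedom. The larger the disagreement between imputes, the larger the total variance and the smaller `df`, which is how imputation uncertainty lowers the significance of a marker effect.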

Research Organization:
US Dept. of Agriculture (USDA), Washington, DC (United States). Agricultural Research Service (ARS)
Sponsoring Organization:
USDOE Office of Science (SC)
Grant/Contract Number:
AI02-07ER64454
OSTI ID:
1627953
Journal Information:
G3, Vol. 5, Issue 5; ISSN 2160-1836
Publisher:
Genetics Society of America
Country of Publication:
United States
Language:
English

Cited By (5)

A reassessment of the genome size–invasiveness relationship in reed canarygrass (Phalaris arundinacea) journal March 2018
Genotyping-by-sequencing provides the discriminating power to investigate the subspecies of Daucus carota (Apiaceae) journal October 2016
Variation in sequences containing microsatellite motifs in the perennial biomass and forage grass, Phalaris arundinacea (Poaceae) journal March 2016
Genome-wide association mapping in winter barley for grain yield and culm cell wall polymer content using the high-throughput CoMPP technique journal March 2017
Association Mapping in Scandinavian Winter Wheat for Yield, Plant Height, and Traits Important for Second-Generation Bioethanol Production journal November 2015