Precision-mapping and statistical validation of quantitative trait loci by machine learning

Bedo, Justin; Wenzl, Peter; Kowalczyk, Adam; Kilian, Andrzej
January 2008
BMC Genetics;2008, Vol. 9, Special section p1
Academic Journal
Background: We introduce a QTL-mapping algorithm based on Statistical Machine Learning (SML) that is conceptually quite different to existing methods as there is a strong focus on generalisation ability. Our approach combines ridge regression, recursive feature elimination, and estimation of generalisation performance and marker effects using bootstrap resampling. Model performance and marker effects are determined using independent testing samples (individuals), thus providing better estimates. We compare the performance of SML against Composite Interval Mapping (CIM), Bayesian Interval Mapping (BIM) and single Marker Regression (MR) on synthetic datasets and a multi-trait and multi-environment dataset of the progeny for a cross between two barley cultivars.

Results: In an analysis of the synthetic datasets, SML accurately predicted the number of QTL underlying a trait while BIM tended to underestimate the number of QTL. The QTL identified by SML for the barley dataset broadly coincided with known QTL locations. SML reported approximately half of the QTL reported by either CIM or MR, not unexpected given that neither CIM nor MR incorporates independent testing. The latter makes these two methods susceptible to producing overly optimistic estimates of QTL effects, as we demonstrate for MR. The QTL resolution (peak definition) afforded by SML was consistently superior to MR, CIM and BIM, with QTL detection power similar to BIM. The precision of SML was underscored by repeatedly identifying, at ≤ 1-cM precision, three QTL for four partially related traits (heading date, plant height, lodging and yield). The set of QTL obtained using a 'raw' and a 'curated' version of the same genotypic dataset were more similar to each other for SML than for CIM or MR.

Conclusion: The SML algorithm produces better estimates of QTL effects because it eliminates the optimistic bias in the predictive performance of other QTL methods. It produces narrower peaks than other methods (except BIM) and hence identifies QTL with greater precision. It is more robust to genotyping and linkage mapping errors, and identifies markers linked to QTL in the absence of a genetic map.
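
The combination named in the abstract (ridge regression, recursive feature elimination, and bootstrap resampling with testing on individuals left out of each resample) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes markers coded -1/+1, no intercept, a fixed ridge penalty, elimination of one marker per step, and a fixed number of retained markers. All function names, parameter values, and the synthetic data are hypothetical and chosen only for illustration.

```python
import random


def ridge_fit(X, y, lam):
    """Return w solving (X'X + lam*I) w = X'y via Gaussian elimination."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(p)]
    for c in range(p):  # forward elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        b[c], b[piv] = b[piv], b[c]
        for r in range(c + 1, p):
            f = A[r][c] / A[c][c]
            for k in range(c, p):
                A[r][k] -= f * A[c][k]
            b[r] -= f * b[c]
    w = [0.0] * p
    for c in reversed(range(p)):  # back substitution
        s = sum(A[c][k] * w[k] for k in range(c + 1, p))
        w[c] = (b[c] - s) / A[c][c]
    return w


def rfe(X, y, lam, n_keep):
    """Drop the marker with the smallest |weight| until n_keep remain."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        w = ridge_fit([[r[j] for j in active] for r in X], y, lam)
        weakest = min(range(len(active)), key=lambda i: abs(w[i]))
        del active[weakest]
    return active


def bootstrap_rfe(X, y, lam=1.0, n_keep=2, n_boot=20, seed=0):
    """Return (selection counts per marker, mean out-of-bag MSE)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts, errors = [0] * p, []
    for _ in range(n_boot):
        train = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        held_out = sorted(set(range(n)) - set(train))  # unseen individuals
        if not held_out:
            continue
        Xtr, ytr = [X[i] for i in train], [y[i] for i in train]
        kept = rfe(Xtr, ytr, lam, n_keep)
        w = ridge_fit([[r[j] for j in kept] for r in Xtr], ytr, lam)
        sq = [(y[i] - sum(wk * X[i][j] for wk, j in zip(w, kept))) ** 2
              for i in held_out]
        errors.append(sum(sq) / len(sq))
        for j in kept:
            counts[j] += 1
    return counts, sum(errors) / len(errors)


def make_synthetic(n=80, p=12, seed=1):
    """Toy cross (assumed): markers coded -1/+1; markers 3 and 8 carry QTL."""
    rng = random.Random(seed)
    X = [[rng.choice((-1.0, 1.0)) for _ in range(p)] for _ in range(n)]
    y = [2.0 * r[3] - 1.5 * r[8] + rng.gauss(0.0, 0.5) for r in X]
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    counts, oob_mse = bootstrap_rfe(X, y)
    print("selection counts per marker:", counts)
    print("mean out-of-bag MSE:", round(oob_mse, 3))
```

In this sketch, a marker's selection count across resamples serves as a rough stability measure, and the error is computed only on individuals absent from the training resample, which is the kind of independent testing the abstract credits with removing optimistic bias. A real analysis would also need to tune the penalty and the number of retained markers, which the sketch fixes by hand.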


Related Articles

  • An examination of positive selection and changing effective population size in Angus and Holstein cattle populations (Bos taurus) using a high density SNP genotyping platform and the contribution of ancient polymorphism to genomic diversity in Domestic cattle. MacEachern, Sean; Hayes, Ben; McEwan, John; Goddard, Mike // BMC Genomics;2009, Vol. 10, Special section p1 

    Background: Identifying recent positive selection signatures in domesticated animals could provide information on genome response to strong directional selection from domestication and artificial selection. With the completion of the cattle genome, private companies are now providing large...

  • In brief.  // Farmers Weekly;10/13/2006, Vol. 145 Issue 15, p136 

    The article presents an update on livestock sale in England as of October 2006. The Manorside Holstein herd was dispersed for Cheshire-based Rake Lane Farmers. Breeders Malcolm and Linda Wilson consigned the most sought-after Aberdeen Angus cattle at an auction in Carlisle. A shearling ewe from...

  • Galloway leads charge for Scottish exhibitors. Long, Jonathan // Farmers Weekly;7/3/2009, p33 

    The article discusses the highlights of the Royal Highland Show held in Ingliston, Scotland in June 2009. The Galloway bull Blackcraig Kodiac took the supreme interbreed award in the beef category while the Aberdeen Angus champion Balamachie Keystone stood as reserve. The Holstein champion...

  • Genome-wide detection of copy number variations using high-density SNP genotyping platforms in Holsteins. Li Jiang; Jicai Jiang; Jie Yang; Xuan Liu; Jiying Wang; Haifei Wang; Xiangdong Ding; Jianfeng Liu; Qin Zhang // BMC Genomics;2013, Vol. 14 Issue 1, p1 

    Background: Copy number variations (CNVs) are widespread in the human or animal genome and are a significant source of genetic variation, which has been demonstrated to play an important role in phenotypic diversity. Advances in technology have allowed for identification of a large number of...

  • In brief.  // Farmers Weekly;5/19/2006, Vol. 144 Issue 20, p148 

    The article presents an update on cattle auctions in England. Moynton herd manager Mike Yeandle sold a strong entry of cattle for owner Peter Olds in Dorchester. Dairy farmer Mike Slatter of Tewkesbury dispersed his Bushley Holstein herd and a calved heifer at 1100 guinea francs. The cattle...

  • Dymock, north Gloucester: Cattle eating our winter hay. Westaway, Paul // Farmers Weekly;8/21/2015, Issue 1041, p1 

    The article offers information on the Gloucester County Council farm, stocked with Angus and Holstein cattle, that farmers Paul Westaway and his wife Kirsty run in partnership. Topics discussed include the start of an online steak and wine shop, Paul praising his friends for volunteering in milk price cuts...

  • Whole genome resequencing of Black Angus and Holstein cattle for SNP and CNV discovery.  // BMC Genomics;2011 Supplement 2, Vol. 12 Issue Suppl 2, p559 

    Background: One of the goals of livestock genomics research is to identify the genetic differences responsible for variation in phenotypic traits, particularly those of economic importance. Characterizing the genetic variation in livestock species is an important step towards linking genes or...

  • Two-stage genome-wide association study identifies integrin beta 5 as having potential role in bull fertility. Feugang, Jean M.; Kaya, Abdullah; Page, Grier P.; Lang Chen; Mehta, Tapan; Hirani, Kashif; Nazareth, Lynne; Topper, Einko; Gibbs, Richard; Memili, Erdogan // BMC Genomics;2009, Vol. 10, Special section p1 

    Background: Fertility is one of the most critical factors controlling biological and financial performance of animal production systems and genetic improvement of lines. The objective of this study was to identify molecular defects in the sperm that are responsible for uncompensable fertility in...

  • Genome-Wide Estimates of Coancestry, Inbreeding and Effective Population Size in the Spanish Holstein Population. Rodríguez-Ramilo, Silvia Teresa; Fernández, Jesús; Toro, Miguel Angel; Hernández, Delfino; Villanueva, Beatriz // PLoS ONE;Apr2015, Vol. 10 Issue 4, p1 

    Estimates of effective population size in the Holstein cattle breed have usually been low despite the large number of animals that constitute this breed. Effective population size is inversely related to the rates at which coancestry and inbreeding increase and these rates have been high as a...

