Mining Skeletal Phenotype Descriptions from Scientific Literature

Groza, Tudor; Hunter, Jane; Zankl, Andreas
February 2013
PLoS ONE;Feb2013, Vol. 8 Issue 2, p1
Academic Journal
Phenotype descriptions are important for our understanding of genetics, as they enable the computation and analysis of a varied range of issues related to the genetic and developmental bases of correlated characters. The literature contains a wealth of such phenotype descriptions, usually reported as free-text entries, similar to typical clinical summaries. In this paper, we focus on creating and making available an annotated corpus of skeletal phenotype descriptions. In addition, we present and evaluate a hybrid Machine Learning approach for mining phenotype descriptions from free text. Our hybrid approach uses an ensemble of four classifiers and experiments with several aggregation techniques. The best scoring technique achieves an F-1 score of 71.52%, which is close to the state-of-the-art in other domains, where training data exists in abundance. Finally, we discuss the influence of the features chosen for the model on the overall performance of the method.
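The abstract does not say which four classifiers or which aggregation techniques were used. As an illustration only, the sketch below assumes majority voting over per-token labels, one common way to aggregate an ensemble, and computes a token-level F1 score for a single positive label. The function names, the "PHENOTYPE" label, and the tie-breaking rule are hypothetical and do not come from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Aggregate per-token labels from several classifiers by majority vote.

    predictions: list of label sequences, one per classifier, all of equal length.
    Ties are broken in favour of the earliest classifier in the list
    (an assumption made for this sketch, not a rule taken from the paper).
    """
    aggregated = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        best = max(counts.values())
        # Pick the first label, in classifier order, that reaches the top count.
        aggregated.append(next(label for label in labels if counts[label] == best))
    return aggregated


def f1_score(gold, predicted, positive="PHENOTYPE"):
    """Token-level F1 for one positive label: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Published evaluations of this kind are often scored at the span or mention level rather than per token, so the numbers from this sketch would not be directly comparable to the 71.52% reported above.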


Related Articles

  • Muscle Tissue Changes with Aging. PEREIRA, Ana Fátima; SILVA, António José; COSTA, Aldo MATOS; MONTEIRO, António Miguel; BASTOS, Estela Maria; MARQUES, Mário CARDOSO // Acta Medica Portuguesa;Jan/Feb2013, Vol. 26 Issue 1, p51

    Sarcopenia is characterized by a progressive, generalized decrease of skeletal muscle mass, strength and function with aging. Recently, genetic determination has been associated with muscle mass and muscle strength in the elderly. These two phenotypes of risk are the most commonly recognized and...

  • Supervised segmentation of phenotype descriptions for the human skeletal phenome using hybrid methods. Groza, Tudor; Hunter, Jane; Zankl, Andreas // BMC Bioinformatics;2012, Vol. 13 Issue 1, p265 

    Background: Over the course of the last few years there has been a significant amount of research performed on ontology-based formalization of phenotype descriptions. In order to fully capture the intrinsic value and knowledge expressed within them, we need to take advantage of their inner...

  • Functions of miR-1 and miR-133a during the postnatal development of masseter and gastrocnemius muscles. Nariyama, Megumi; Mori, Manami; Shimazaki, Emi; Asada, Yoshinobu; Ando, Hitoshi; Abo, Tokuhisa; Yamane, Akira; Ohnuki, Yoshiki // Molecular & Cellular Biochemistry;Sep2015, Vol. 407 Issue 1/2, p17 

    The present study investigated the function of miR-1 and miR-133a during the postnatal development of mouse skeletal muscles. The amounts of miR-1 and miR-133a were measured in mouse masseter and gastrocnemius muscles between 1 and 12 weeks after birth with real-time polymerase chain reaction...

  • Stac3 Is a Novel Regulator of Skeletal Muscle Development in Mice. Reinholt, Brad M.; Ge, Xiaomei; Cong, Xiaofei; Gerrard, David E.; Jiang, Honglin // PLoS ONE;Apr2013, Vol. 8 Issue 4, p1 

    The goal of this study was to identify novel factors that mediate skeletal muscle development or function. We began the study by searching the gene expression databases for genes that have no known functions but are preferentially expressed in skeletal muscle. This search led to the...

  • Correction: Differences in Muscle Transcriptome among Pigs Phenotypically Extreme for Fatty Acid Composition.  // PLoS ONE;Jul2014, Vol. 9 Issue 7, p1 

    No abstract available.

  • Identification of Common Regulators of Genes in Co-Expression Networks Affecting Muscle and Meat Properties. Ponsuksili, Siriluck; Siengdee, Puntita; Du, Yang; Trakooljul, Nares; Murani, Eduard; Schwerin, Manfred; Wimmers, Klaus // PLoS ONE;Apr2015, Vol. 10 Issue 4, p1 

    Understanding the genetic contributions behind skeletal muscle composition and metabolism is of great interest in medicine and agriculture. Attempts to dissect these complex traits combine genome-wide genotyping, expression data analyses and network analyses. Weighted gene co-expression network...

  • Combined association and aggregation analysis of data from case-control family studies. ZHAO, LUE PING; HSU, LI; HOLTE, SARAH; CHEN, YAN; QUIAOIT, FILEMON; PRENTICE, ROSS L. // Biometrika;1998, Vol. 85 Issue 2, p299 

    Genetic epidemiologists are increasingly interested in family studies using the case-control family study design, and this has motivated several recent developments in related statistical methodology. By summarising and extending some of these developments, this paper proposes an...

  • Integrating Evolution and Development: The Need for Bioinformatics in Evo-Devo. Mabee, Paula M. // BioScience;Apr2006, Vol. 56 Issue 4, p301 

    This article is an overview of concepts relating to the integration of the genotype and phenotype. One of the major goals of evolutionary developmental biology, or evo-devo, is to understand the transformation of morphology in evolution. This goal can be accomplished by synthesizing the data...

  • The problem of variation. Stern, David L. // Nature;11/30/2000, Vol. 408 Issue 6812, p529 

    Reports on research that identifies a key genetic regulator of sex-specific differences in abdominal pigmentation in the fruitfly Drosophila melanogaster. Mention of a study presented in the November 30, 2000 issue of 'Nature' magazine; Statement that the study illustrates how developmental...

