Effect sizes can be misleading: is it time to change the way we measure change?

Jeremy C Hobart
September 2010
Journal of Neurology, Neurosurgery & Psychiatry;Sep2010, Vol. 81 Issue 9, p1044
Academic Journal
OBJECTIVES: Previous comparisons of the ability of the Barthel Index (BI) and the Functional Independence Measure motor scale (FIMm) to detect change have implied that the two scales are equally responsive when examined using traditional effect size statistics. Clinically, this is counterintuitive, as the FIMm has greater potential to detect change than the BI, and it raises concerns about the validity of effect size statistics as indicators of rating scale responsiveness. To examine these concerns, this study applied a sophisticated psychometric analysis, Rasch measurement, to BI and FIMm data.

METHODS: BI and FIMm data were examined from 976 people at a single neurorehabilitation unit. Rasch analysis was used to compare the responsiveness of the BI and FIMm at the group comparison level (effect sizes, relative efficiency, relative precision) and for each individual person in the sample by computing the significance of their change.

RESULTS: Group level analyses from both interval measurements and ordinal scores implied the BI and FIMm had equivalent responsiveness (BI and FIMm effect size ranges –0.82 to –1.12 and –0.77 to –1.05, respectively). However, individual person level analyses indicated that the FIMm detected significant improvement in almost twice as many people as the BI (50%, n=496 vs 31%, n=298), and recorded fewer people as unchanged on discharge (FIMm=4%, n=38; BI=12%, n=115). This difference was found to be statistically significant (χ²=273.81; p<0.000).

CONCLUSIONS: These findings demonstrate that effect size calculations are limited and potentially misleading indicators of rating scale responsiveness at the group comparison level. Rasch analysis at the individual person level showed the superior responsiveness of the FIMm, supporting clinical expectation, and demonstrated its own added value as a method for examining and comparing rating scale responsiveness.
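
The contrast the abstract draws between group level and individual person level responsiveness can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' analysis: the abstract does not give its formulas, so the effect size definition (mean change divided by the baseline standard deviation), the person level statistic (change divided by the pooled standard error of the two measurements), the 1.96 cut-off, the sign convention (follow-up minus baseline), and the function names are all assumptions.

```python
import math
from statistics import mean, stdev


def effect_size(baseline, followup):
    """Group level: mean change divided by the SD of baseline scores.

    Assumed definition; sign convention is follow-up minus baseline.
    """
    changes = [f - b for b, f in zip(baseline, followup)]
    return mean(changes) / stdev(baseline)


def significance_of_change(baseline, followup, se_baseline, se_followup,
                           critical=1.96):
    """Person level: classify one person's change on an interval scale.

    The change is divided by the pooled standard error of the two
    measurements (assumed form) and compared with a critical value.
    """
    pooled_se = math.sqrt(se_baseline ** 2 + se_followup ** 2)
    z = (followup - baseline) / pooled_se
    if z >= critical:
        return "significant improvement"
    if z <= -critical:
        return "significant worsening"
    return "no significant change"


# Made-up numbers: the same 1.0 unit change, measured with different precision.
precise = significance_of_change(0.0, 1.0, se_baseline=0.3, se_followup=0.3)
imprecise = significance_of_change(0.0, 1.0, se_baseline=0.5, se_followup=0.5)
```

Under these assumptions, a group level effect size depends only on the distribution of scores, while the person level classification also depends on how precisely each person is measured. Two scales can therefore share a similar effect size yet differ in how many individuals they classify as significantly changed.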


Related Articles

  • Future of Psychometrics: Ask What Psychometrics Can Do for Psychology. Sijtsma, Klaas // Psychometrika;Jan2012, Vol. 77 Issue 1, p4 

    I address two issues that were inspired by my work on the Dutch Committee on Tests and Testing (COTAN). The first issue is the understanding of problems test constructors and researchers using tests have of psychometric knowledge. I argue that this understanding is important for a field, like...

  • Self-Report Instruments for Fatigue Assessment: A Systematic Review. Mota, Dálate D. C. F.; Pimenta, Cibele A. M. // Research & Theory for Nursing Practice;Spring2006, Vol. 20 Issue 1, p49 

    This systematic review analyzed 18 self-reported fatigue instruments for adults. Five databases were searched combining fatigue with instrument, questionnaire, inventory, scale, or assessment. Eighteen fatigue instruments and six definitions of fatigue were found. Six instruments apply to...

  • The Dimensions of Dangerousness. Menzies, Robert J.; Webster, Christopher D.; Sepejak, Diana S. // Law & Human Behavior (Springer Science & Business Media B.V.);Mar1985, Vol. 9 Issue 1, p49 

    Treatment of dangerousness in both sociolegal research and clinical practice has neglected to consider the multidimensional nature of the construct. An attempt was made to develop a psychometric instrument sensitive to several facets of dangerous behavior among forensic patients. Two trained...

  • PMK: Medidas Válidas para a Predição do Desempenho no Trabalho? Vasconcelos, Alina Gomide; dos Reis Sampaio, Jáder; Nascimento, Elizabeth // Psicologia: Reflexão e Critica;2013, Vol. 26 Issue 2, p251 

    Myokinetic Psychodiagnosis (MKP) has been considered a useful instrument by the Brazilian professional community, although there are few validity investigations about its psychometric properties. The aim of this study was to investigate the predictive validity of MKP measures in relation to job...

  • Development and Validation of a Measure of Workplace Climate for Healthy Weight Maintenance. Sliter, Katherine A. // Journal of Occupational Health Psychology;Jul2013, Vol. 18 Issue 3, p350 

    The article discusses a study on the development and validation of a measure of workplace climate for healthy weight maintenance. The authors aimed to develop and validate a concise and psychometrically sound measure of climate for healthy weight. The study found that scores on the measure had a...

  • Constructing and approbation of methodology "Subjective quality of life" (SQL). Eksakusto, T. V.; Zaichenko, A. A. // Vestnik Sankt-Peterburgskogo universiteta, Seriia 7: Geologia, G;2012, Issue 4, p64 

    This paper presents a study of the subjective quality of life of subjects, which resulted in the development, construction and testing of psychodiagnostic techniques "Subjective quality of life" (SQL) of the individual. This article describes a theoretical model of quality of life of the...

  • Post-modern career assessment for traditionally disadvantaged South African learners: Moving away from the 'expert opinion'. BISCHOF, DAVID; ALEXANDER, DINAH // Perspectives in Education;Sep2008, Vol. 26 Issue 3, p7 

    The article examines the implications and applicability of traditional psychometric and narrative post-modern career assessment of learners from disadvantaged communities in the South African context. One study conducted suggested that South African career assessment practitioners who use...

  • A VALIDATIONAL STUDY OF AWARITEFE PSYCHOLOGICAL INDEX (API) -- FORUM X ON NIGERIAN ADOLESCENTS. Olutope, Akinnawo E.; Caroline, Ofovwe // IFE PsychologIA;2012, Vol. 20 Issue 2, p41 

    The article discusses a validational study of Awaritefe Psychological Index (API), one of the most widely used indigenous psychological assessment tools in Nigeria developed by Alfred Awaritefe. The index was developed on the basis of the African peculiarities, particularly the socio-cultural...

  • PRELIMINARY REPORT: PSYCHOLOGICAL ASSESSMENT OF GREEK WOMEN WITH DIABETES DURING PREGNANCY. Ilias, Ioannis; Papageorgiou, Charalambos; Katsadoros, Kyriakos; Zapanti, Evangelia; Anastasiou, Eleni // Perceptual & Motor Skills;Oct2005, Vol. 101 Issue 2, p628 

    23 Greek pregnant women with type 1 diabetes had a higher mean score on the Maudsley Obsessive-Compulsive Inventory in the second and third trimesters of pregnancy than 13 women with gestational diabetes. Long-term changes in the lifestyle of the former may apparently lead to this higher mean....

