Inference for marginal linear models for clustered longitudinal data with potentially informative cluster sizes

Wang, Ming; Kong, Maiying; Datta, Somnath
August 2011
Statistical Methods in Medical Research;Aug2011, Vol. 20 Issue 4, p347
Academic Journal
Clustered longitudinal data are often collected as repeated measures on subjects arising in clusters. One example is a periodontal disease study, in which measurements of the disease status of each tooth are collected over time for each patient, so that each patient can be considered a cluster. In such applications, the number of teeth a patient has may be related to the individual's overall oral health and hence may influence the distribution of the outcome measure of interest, leading to an informative cluster size. In such situations, generalised estimating equations (GEE) may lead to invalid inferences. In this article, we investigate the performance of three competing proposals for fitting marginal linear models to clustered longitudinal data, namely GEE, within-cluster resampling (WCR) and cluster-weighted generalised estimating equations (CWGEE). We show by simulations and theoretical calculations that, when the cluster size is informative, GEE provides biased estimators, while both WCR and CWGEE achieve unbiasedness under a variety of ‘working’ correlation structures for temporal measurements within each subject. Statistical properties of confidence intervals are investigated using probability-probability plots. Overall, CWGEE appears to be the recommended choice for marginal parametric inference with clustered longitudinal data: it achieves parameter estimates and test statistics similar to those of WCR while avoiding Monte Carlo computation. The corresponding Wald tests have desirable power properties as well. We illustrate our analysis using a temporal data set on periodontal disease, which clearly demonstrates the need for CWGEE over GEE.
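
The contrast between the three approaches can be illustrated with a small simulation. The sketch below is not the authors' implementation: it assumes a deliberately simplified setting (one measurement per subject, an intercept-only marginal model, an independence working correlation), and the function names and simulation parameters are hypothetical. Under these assumptions the GEE estimate reduces to the pooled mean, the cluster-weighted estimate weights each observation by the inverse of its cluster size, and WCR repeatedly draws one observation per cluster and averages the resulting estimates.

```python
import numpy as np

def simulate(n_clusters=2000, seed=0):
    """Simulate clusters whose size depends on a cluster-level effect (assumed design)."""
    rng = np.random.default_rng(seed)
    b = rng.normal(0.0, 1.0, n_clusters)       # cluster-level random effect
    sizes = np.where(b > 0, 2, 10)             # informative size: high b -> small cluster
    cluster = np.repeat(np.arange(n_clusters), sizes)
    # True marginal mean for a typical member of a typical cluster is 1.0.
    y = 1.0 + b[cluster] + rng.normal(0.0, 1.0, cluster.size)
    return cluster, sizes, y

def pooled_mean(y):
    """Intercept estimate from GEE with independence working correlation."""
    return y.mean()

def cluster_weighted_mean(cluster, sizes, y):
    """Weight each observation by 1 / (size of its cluster)."""
    w = 1.0 / sizes[cluster]
    return np.sum(w * y) / np.sum(w)

def wcr_mean(sizes, y, n_resamples=200, seed=1):
    """Within-cluster resampling: one random observation per cluster, repeated."""
    rng = np.random.default_rng(seed)
    starts = np.cumsum(sizes) - sizes          # index of first observation in each cluster
    estimates = []
    for _ in range(n_resamples):
        pick = starts + rng.integers(0, sizes)  # one index per cluster
        estimates.append(y[pick].mean())
    return float(np.mean(estimates))

cluster, sizes, y = simulate()
print("pooled (GEE, independence):", pooled_mean(y))
print("cluster-weighted:", cluster_weighted_mean(cluster, sizes, y))
print("within-cluster resampling:", wcr_mean(sizes, y))
```

In this assumed design, large clusters have systematically lower outcomes, so the pooled mean is pulled below the target value of 1.0, whereas the cluster-weighted and resampling estimates should both land close to it. The weighted estimate needs a single pass over the data, which mirrors the abstract's point about avoiding Monte Carlo computation.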


Related Articles

  • Statistical selection of perimeter-area models for patch mosaics in multiscale landscape analysis. Grossi, L.; Patil, G. P.; Taillie, C. // Environmental & Ecological Statistics;Jun2004, Vol. 11 Issue 2, p165 

    This paper presents a statistical method for detecting distinct scales of pattern for mosaics of irregular patches, by means of perimeter–area relationships. Krummel et al. (1987) were the first to develop a method for detecting different scaling domains in a landscape of irregular...

  • Forecasting for quantile self-exciting threshold autoregressive time series models. Cai, Yuzhi // Biometrika;Mar2010, Vol. 97 Issue 1, p199 

    Self-exciting threshold autoregressive time series models have been used extensively, and the conditional mean obtained from these models can be used to predict the future value of a random variable. In this paper we consider quantile forecasts of a time series based on the quantile...

  • An infinite swapping approach to the rare-event sampling problem. Plattner, Nuria; Doll, J. D.; Dupuis, Paul; Wang, Hui; Liu, Yufei; Gubernatis, J. E. // Journal of Chemical Physics;10/7/2011, Vol. 135 Issue 13, p134111 

    We describe a new approach to the rare-event Monte Carlo sampling problem. This technique utilizes a symmetrization strategy to create probability distributions that are more highly connected and, thus, more easily sampled than their original, potentially sparse counterparts. After discussing...

  • Validation tests of an improved kernel density estimation method for identifying disease clusters. Cai, Qiang; Rushton, Gerard; Bhaduri, Budhendra // Journal of Geographical Systems;Jul2012, Vol. 14 Issue 3, p243 

    The spatial filter method, which belongs to the class of kernel density estimation methods, has been used to make morbidity and mortality maps in several recent studies. We propose improvements in the method to include spatially adaptive filters to achieve constant standard error of the relative...

  • Robust estimation by expectation maximization algorithm. Koch, Karl // Journal of Geodesy;Feb2013, Vol. 87 Issue 2, p107 

    A mixture of normal distributions is assumed for the observations of a linear model. The first component of the mixture represents the measurements without gross errors, while each of the remaining components gives the distribution for an outlier. Missing data are introduced to deliver the...

  • Partially linear models with autoregressive scale-mixtures of normal errors: A Bayesian approach. Ferreira, Guillermo; Castro, Mauricio; Lachos, Victor H. // AIP Conference Proceedings;Oct2012, Vol. 1490 Issue 1, p116 

    Normality and independence of error terms are typical assumptions for partial linear models. However, such assumptions may be unrealistic in many fields such as economics, finance and biostatistics. In this paper, we develop a Bayesian analysis for partial linear model with first-order...

  • An analysis of error structure in modeling the stock–recruitment data of gadoid stocks using generalized linear models. Yan Jiao; Schneider, David; Chen, Yong; Wroblewski, Joe // Canadian Journal of Fisheries & Aquatic Sciences;Jan2004, Vol. 61 Issue 1, p134 

    When modeling the stock–recruitment (S–R) relationship, the Cushing, Ricker, and other S–R models are fitted to the observed S–R data by estimating parameters with assumptions made concerning the model error structure. Using a generalized linear model approach, we...

  • Inferring synaptic inputs given a noisy voltage trace via sequential Monte Carlo methods. Paninski, Liam; Vidne, Michael; DePasquale, Brian; Ferreira, Daniel // Journal of Computational Neuroscience;Aug2012, Vol. 33 Issue 1, p1 

    We discuss methods for optimally inferring the synaptic inputs to an electrotonically compact neuron, given intracellular voltage-clamp or current-clamp recordings from the postsynaptic cell. These methods are based on sequential Monte Carlo techniques ('particle filtering'). We demonstrate, on...

  • Outcome-Driven Cluster Analysis with Application to Microarray Data. Hsu, Jessie J.; Finkelstein, Dianne M.; Schoenfeld, David A. // PLoS ONE;11/12/2015, Vol. 10 Issue 10, p1 

    One goal of cluster analysis is to sort characteristics into groups (clusters) so that those in the same group are more highly correlated to each other than they are to those in other groups. An example is the search for groups of genes whose expression of RNA is correlated in a population of...

