Genetic Algorithm Based Outlier Detection Using Bayesian Information Criterion in Multiple Regression Models Having Multicollinearity Problems

Alma, Özlem Gürünlü; Kurt, Serdar; Uğur, Aybars
July 2009
Gazi University Journal of Science;Jul2009, Vol. 22 Issue 3, p141
Academic Journal
Multiple linear regression (MLR) models are among the most widely used applied statistical techniques and are useful tools for extracting and understanding the essential features of datasets. However, problems arise in these models when the data contain serious outliers or multicollinearity. Outliers complicate regression because some outlying points influence the fit more than others, and they can strongly distort the estimated model, especially under the least squares method. Nevertheless, outlying observations are often the points of special interest in many practical situations. The second problem, multicollinearity, is defined as linear dependencies among the independent variables. The purpose of this study is to define multicollinearity and to present an outlier detection method that uses a Genetic Algorithm (GA) and the Bayesian Information Criterion (BIC) in multiple regression models. The GA with BIC is illustrated on real and simulated data for outlier detection in MLR models having multicollinearity problems.
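
The abstract does not give the exact BIC formula or the GA settings used in the study, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a mean-shift outlier model (each flagged observation gets its own dummy variable), the textbook penalty BIC = n log(RSS/n) + k log(n), and a very plain GA (binary tournament selection, uniform crossover, bit-flip mutation, elitism). The function names, the cap on the number of flagged rows, and the synthetic data are all hypothetical and are not taken from the article.

```python
import numpy as np

def bic_mean_shift(X, y, mask):
    """BIC of an OLS fit where every flagged row gets a mean-shift dummy (assumed model)."""
    n, p = X.shape
    m = int(mask.sum())
    if m > n // 4:                       # assumed cap: never flag more than a quarter of the rows
        return np.inf
    keep = ~mask
    # A dummy per flagged row fits that row exactly, so the RSS equals the RSS of
    # OLS on the remaining rows. lstsq copes with nearly collinear columns.
    beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    rss = float(np.sum((y[keep] - X[keep] @ beta) ** 2))
    return n * np.log(rss / n) + (p + m) * np.log(n)

def ga_outlier_search(X, y, pop_size=40, generations=60, seed=0):
    """Search boolean outlier masks for a low BIC with a deliberately simple GA."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    pop = rng.random((pop_size, n)) < 0.05          # sparse random masks
    pop[0] = False                                  # always include "no outliers"
    for _ in range(generations):
        fit = np.array([bic_mean_shift(X, y, ind) for ind in pop])
        elite = pop[np.argmin(fit)].copy()
        # Binary tournament selection: the lower BIC of two random individuals wins.
        a, b = rng.integers(0, pop_size, (2, pop_size))
        parents = np.where((fit[a] < fit[b])[:, None], pop[a], pop[b])
        # Uniform crossover with a randomly permuted partner.
        partners = parents[rng.permutation(pop_size)]
        cross = rng.random((pop_size, n)) < 0.5
        children = np.where(cross, parents, partners)
        # Bit-flip mutation with rate 1/n per position.
        children ^= rng.random((pop_size, n)) < 1.0 / n
        children[0] = elite                         # elitism keeps the best mask so far
        pop = children
    fit = np.array([bic_mean_shift(X, y, ind) for ind in pop])
    best = int(np.argmin(fit))
    return pop[best], float(fit[best])

def make_demo_data(seed=1, n=30):
    """Synthetic data (not from the article): two nearly collinear predictors, one planted outlier."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    x2 = x1 + 0.01 * rng.normal(size=n)             # near-linear dependency on x1
    X = np.column_stack([np.ones(n), x1, x2])
    y = 1.0 + 2.0 * x1 + rng.normal(scale=0.1, size=n)
    y[5] += 10.0                                    # planted outlier at row 5
    return X, y

if __name__ == "__main__":
    X, y = make_demo_data()
    mask, score = ga_outlier_search(X, y)
    print("flagged rows:", np.flatnonzero(mask), "BIC:", round(score, 2))
```

Because the empty mask is seeded into the initial population and the best individual is carried forward each generation, the returned BIC can never be worse than that of the "no outliers" model. Whether the search recovers the planted row depends on the seed and settings.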


Related Articles

  • A partial least squares solution to the problem of multicollinearity when predicting the high temperature properties of 1Cr-1Mo-0.25V steel using parametric models. Evans, Mark // Journal of Materials Science;Mar2012, Vol. 47 Issue 6, p2712 

    Recently there has been renewed interest in assessing the predictive accuracy of existing parametric models of creep properties, with the recently developed Wilshire methodology being largely responsible for this revival. Without exception, these studies have used multiple linear regression...

  • The Comparison Between Several Robust Ridge Regression Estimators in the Presence of Multicollinearity and Multiple Outliers. Zahari, Siti Meriam; Ramli, Norazan Mohamed; Moktar, Balkiah; Zainol, Mohammad Said // AIP Conference Proceedings;2014, Vol. 1613, p388 

    In the presence of multicollinearity and multiple outliers, statistical inference of linear regression model using ordinary least squares (OLS) estimators would be severely affected and produces misleading results. To overcome this, many approaches have been investigated. These include robust...

  • A Survey of Ridge Regression for Improvement Over Ordinary Least Squares. Singh, Rajeshwar // IUP Journal of Computational Mathematics;Dec2010, Vol. 3 Issue 4, p54 

    Multicollinearity may be a possible cause in case of study with two or more explanatory variables. In the presence of multicollinearity, the design matrix becomes nearly singular and hence X and the corresponding X′X are not of full rank. In this situation, the Ordinary Least Square (OLS)...

  • Inequality Constraints, Multicollinearity and Models of Police Expenditure. Buck, Andrew J.; Hakim, Simon // Southern Economic Journal;Oct81, Vol. 48 Issue 2, p449 

    The present paper generalizes the Lovell-Prescott constraint to a linear combination of the location parameters restricted to an interval. We will derive the mean and variance for the constrained estimator, and demonstrate the behavior of squared error risk for a particular example.
    In this...

  • Global richness patterns of venomous snakes reveal contrasting influences of ecology and history in two different clades. Terribile, Levi Carina; Olalla-Tárraga, Miguel Ángel; Morales-Castilla, Ignacio; Rueda, Marta; Vidanes, Rosa M.; Rodríguez, Miguel Ángel; Felizola Diniz-Filho, José Alexandre // Oecologia;Mar2009, Vol. 159 Issue 3, p617 

    Recent studies addressing broad-scale species richness gradients have proposed two main primary drivers: contemporary climate and evolutionary processes (differential balance between speciation and extinction). Here, we analyze the global richness patterns of two venomous snake clades, Viperidae...

  • Classical Least Squares, Part III: Spectroscopic Theory. Mark, Howard; Workman, Jerome // Spectroscopy;Oct2010, Vol. 25 Issue 10, p22 

    The article focuses on the application of chemometrics in spectrum analysis. The mathematics behind the classical least squares (CLS) approach to analysis is examined. It explores how the same least squares calculations which are used for multiple linear regression (MLR) calibration can be...

  • Analyzing 2D gel images using a two-component empirical bayes model.  // BMC Bioinformatics;2011 Supplement 10, Vol. 12 Issue Suppl 10, p433 

    The article reports on a study that analyzes two dimensional (2D) gel images via a two-component empirical Bayes (EB) model. It states EB models have been widely discussed for large-scale hypothesis testing and applied in the context of genomic data. It proposes to estimate the mixture and null...

  • A fast and efficient Gibbs sampler for BayesB in whole-genome analyses. Hao Cheng; Long Qu; Garrick, Dorian J.; Fernando, Rohan L. // Genetics Selection Evolution;10/14/2015, Vol. 47, p1 

    Background: In whole-genome analyses, the number p of marker covariates is often much larger than the number n of observations. Bayesian multiple regression models are widely used in genomic selection to address this problem of p ≫ n. The primary difference between these models is the...

  • Multiple Regression and Quadrant Analysis. Bacon, Donald R. // Marketing Research;Spring2004, Vol. 16 Issue 1, p47 

    The article comments on how causal linkages among various aspects of a business can be determined through multiple regression and how quadrant charts can then be used to determine which aspects of a business are most important for improvement. The fundamental problem with using multiple...

