IUP Publications Online
The IUP Journal of Computational Mathematics
A Survey of Ridge Regression for Improvement Over Ordinary Least Squares
Multicollinearity can arise in any study with two or more explanatory variables. In the presence of multicollinearity, the design matrix becomes nearly singular, and hence X and the corresponding X'X are not of full rank. In this situation, the Ordinary Least Squares (OLS) estimate cannot be obtained, so adequate attention must be given to the presence of multicollinearity in the data. This survey discusses only ridge regression as a solution to the problem of multicollinearity. Hoerl and Kennard proposed the technique of ridge regression, which has become a popular tool for analyzing data afflicted by a high degree of multicollinearity.

 
 
 

When X is not of full rank, the determinant of X'X is zero and one or more of its eigenvalues are zero. In this situation, the Ordinary Least Squares (OLS) estimate of β and its variance, theoretically, explode. On the contrary, when all columns of X are orthogonal, X'X = I and the determinant of X'X is unity. The situation of perfect multicollinearity is almost as rare as that of perfect orthogonality. The departure of |X'X| from unity is called non-orthogonality, while its proximity to zero gives rise to multicollinearity. This distinction, however, has not been maintained in the literature: for convenience, the ridge regression literature often ignores the distinction among multicollinearity, non-orthogonality and ill-conditioning.
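The contrast between an orthogonal design and a nearly collinear one can be illustrated numerically. The sketch below (a minimal illustration with made-up data, not from the survey) computes |X'X| for an orthogonal design and the eigenvalues of X'X for a nearly collinear design, showing one eigenvalue collapsing toward zero:

```python
import numpy as np

# Orthogonal design: X'X = I, so its determinant is unity.
X_orth = np.eye(3)
det_orth = np.linalg.det(X_orth.T @ X_orth)  # equals 1.0

# Nearly collinear design: the second column is almost a copy of the
# first (a tiny, fixed perturbation stands in for measurement noise).
x1 = np.array([1.0, 2.0, 3.0, 4.0])
eps = np.array([1e-6, -1e-6, 1e-6, -1e-6])
X_coll = np.column_stack([x1, x1 + eps])

XtX = X_coll.T @ X_coll
eigvals = np.linalg.eigvalsh(XtX)  # sorted ascending
# The smallest eigenvalue is near zero; its reciprocal, which enters
# the OLS variance, is therefore enormous.
```

Running this shows the smallest eigenvalue of X'X is many orders of magnitude below the largest, which is exactly the ill-conditioning the text describes.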

Multicollinearity occurs when variables are highly correlated (0.90 and above but less than 1), and singularity occurs when the variables are perfectly correlated. In the presence of multicollinearity or near multicollinearity, the design matrix becomes nearly singular, and hence X is not of full rank. When X'X is ill-conditioned, some of its eigenvalues are close to zero and their reciprocals are very large. The expected squared length of the OLS estimator vector then exceeds that of the true parameter vector. One may refer to Brook and Moore (1980) for a detailed discussion of this point.
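The claim about the expected squared length follows from the standard identity E||β̂||² = β'β + σ²·trace((X'X)⁻¹), where the trace term is the sum of reciprocal eigenvalues. A minimal sketch (with an invented ill-conditioned design; the choice of n, p, β and σ² is purely illustrative) makes the inflation visible:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3

# Ill-conditioned design: the third column nearly duplicates the first.
X = rng.standard_normal((n, p))
X[:, 2] = X[:, 0] + 1e-3 * rng.standard_normal(n)

beta = np.array([1.0, 2.0, 3.0])
sigma2 = 1.0

# E||b_OLS||^2 = ||beta||^2 + sigma^2 * trace((X'X)^-1)
XtX_inv = np.linalg.inv(X.T @ X)
trace_term = np.trace(XtX_inv)  # sum of reciprocal eigenvalues
expected_sq_len = float(beta @ beta) + sigma2 * trace_term
# The trace term blows up when X'X has near-zero eigenvalues, so the
# OLS coefficient vector is, on average, much longer than beta itself.
```

Because one eigenvalue of X'X is close to zero, the trace term dominates, and the expected squared length of the OLS vector far exceeds β'β, as the text states.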

Collecting more data or dropping one or more variables is the traditional solution. But collecting more data is often expensive or impracticable, and dropping one or more variables from the model to alleviate multicollinearity may introduce specification bias, so the cure may be worse than the disease in certain situations. The interest, therefore, has been in squeezing the maximum information out of the data at hand, and this has motivated researchers to develop some very ingenious statistical methods, e.g., ridge regression, principal component regression, and generalized inverse regression. These statistical methods can deal successfully with the problem of multicollinearity.
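Of the methods listed, ridge regression is the survey's focus. A minimal sketch of the Hoerl-Kennard ridge estimator b(k) = (X'X + kI)⁻¹X'y is given below; the simulated data, the function name `ridge_estimate`, and the choice k = 1.0 are all illustrative assumptions, not prescriptions from the survey:

```python
import numpy as np

def ridge_estimate(X, y, k):
    """Ridge estimator b(k) = (X'X + kI)^{-1} X'y; k = 0 recovers OLS."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

# Simulated nearly collinear data (illustrative only).
rng = np.random.default_rng(1)
n = 100
x1 = rng.standard_normal(n)
x2 = x1 + 1e-3 * rng.standard_normal(n)  # nearly a copy of x1
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.standard_normal(n)     # true coefficients (1, 1)

b_ols = ridge_estimate(X, y, 0.0)    # OLS: unstable on collinear data
b_ridge = ridge_estimate(X, y, 1.0)  # ridge: shrunken, stable
```

Adding kI to X'X lifts every eigenvalue by k before inversion, which is why the ridge coefficient vector is always shorter than the OLS vector and far less sensitive to the near-singularity.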

 
 
 

Keywords: Computational Mathematics Journal, Ridge Regression, Ordinary Least Squares, Multicollinearity, Non-Orthogonality, Principal Component Regression, Statistical Methods, Multiple Linear Regression Model, Linear Transformation, Normal Regression Theory, Multiple Determination.