Ridge Regression in a Nutshell
Introduction
Ridge Regression is a regularized form of linear regression, obtained by adding an L2 regularization term, α Σ θi², to the cost function. The cost function, commonly known as the loss function, measures how well a set of parameters fits the data; the parameters θ are found by minimizing it. This is the statistical basis of the Ridge Regression method. The Ridge Regression cost function is as follows:

J(θ) = MSE(θ) + α Σ (i = 1 to n) θi²

where:
• θ = (θ0, θ1, …, θn) are the model parameters (the bias term θ0 is not included in the penalty),
• MSE(θ) is the mean squared error of the model's predictions, and
• α ≥ 0 is the hyperparameter that controls the regularization strength.
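As a sketch, the cost function above can be written in code as follows (a minimal NumPy version; the function name and the convention of leaving the bias term out of the penalty are illustrative assumptions):

```python
import numpy as np

def ridge_cost(theta, X, y, alpha):
    """Ridge cost: mean squared error plus an L2 penalty on the parameters.

    The intercept theta[0] is conventionally excluded from the penalty.
    """
    residuals = X @ theta - y
    mse = np.mean(residuals ** 2)
    penalty = alpha * np.sum(theta[1:] ** 2)
    return mse + penalty

# With a perfect fit, the cost reduces to the penalty term alone.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # first column = bias
theta = np.array([1.0, 1.0])
y = X @ theta                    # exact targets, so MSE = 0
cost = ridge_cost(theta, X, y, alpha=0.5)  # 0.5 * 1.0**2 = 0.5
```

Note that with α = 0 the cost collapses to the ordinary linear regression MSE.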
Using a technique similar to the one used in Linear Regression, the Normal Equation (commonly known as the closed-form Ridge Regression solution) is as follows:

θ̂ = (Xᵀ X + α A)⁻¹ Xᵀ y

where:
• X is the design matrix of predictor values,
• y is the vector of target values,
• A is the (n + 1) × (n + 1) identity matrix, except for a 0 in the top-left cell so that the bias term θ0 is not regularized, and
• α is the regularization strength.
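The closed-form solution can be sketched directly in NumPy (the function name and the toy data are illustrative assumptions; the linear system is solved rather than explicitly inverted, which is the standard numerical practice):

```python
import numpy as np

def ridge_closed_form(X, y, alpha):
    # theta = (X^T X + alpha * A)^(-1) X^T y, where A is the identity
    # matrix with a 0 in the top-left cell so the bias is not penalized.
    n = X.shape[1]
    A = np.eye(n)
    A[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + alpha * A, X.T @ y)

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # bias column + one predictor
y = np.array([2.0, 3.0, 4.0])                        # y = 1 + x exactly

theta_ols = ridge_closed_form(X, y, alpha=0.0)   # alpha = 0 recovers OLS
theta_reg = ridge_closed_form(X, y, alpha=10.0)  # larger alpha shrinks the slope
```

With α = 0 this recovers the ordinary least squares solution; increasing α pulls the regularized coefficients toward zero.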
Disadvantages
• Cannot shrink coefficients exactly to zero
• Therefore, it cannot perform variable selection
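To illustrate the first point: even a very large α shrinks the coefficients toward zero without ever making any of them exactly zero (a small self-contained sketch on synthetic data; the sizes, seed, and penalty value are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = X @ np.array([3.0, -1.5, 0.0, 0.5]) + rng.normal(scale=0.1, size=50)

# Closed-form ridge fit with a deliberately huge penalty.
alpha = 1e6
theta = np.linalg.solve(X.T @ X + alpha * np.eye(4), X.T @ y)
# All coefficients become tiny, but none is exactly zero, so ridge
# cannot be used to select a subset of variables.
```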
Advantages
• Can reduce variance at the cost of a small increase in bias
• Can improve prediction performance
• Simple and computationally cheap to fit
Required data conditions
• Suitable for data that has more predictor variables than observations
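A brief illustration of this condition: with more predictors than observations, XᵀX is singular and the ordinary least squares normal equation has no unique solution, but the ridge system stays solvable for any α > 0 (a sketch on synthetic data; the dimensions are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 10))   # 5 observations, 10 predictors (p > n)
y = rng.normal(size=5)

gram = X.T @ X                       # 10x10, but rank at most 5
rank = np.linalg.matrix_rank(gram)   # singular, so OLS cannot invert it

# Adding alpha * I makes the matrix full rank, so ridge still has
# a unique closed-form solution.
theta = np.linalg.solve(gram + 1.0 * np.eye(10), X.T @ y)
```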
Impact on business
• Similar to linear regression, the Ridge Regression model can be used to fit data and predict numeric outcomes
Potential for misuse
• The value of alpha must be chosen carefully: if it is too small, the model can still overfit, and if it is too large, the model will underfit.
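One common safeguard is to choose α on held-out data rather than guessing it. A minimal sketch on synthetic data (the α grid, seed, and train/validation split are illustrative assumptions, not from the original):

```python
import numpy as np

def ridge_fit(X, y, alpha):
    # Closed-form ridge solution for data without a bias column.
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n), X.T @ y)

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=40)

# Hold out the last 10 rows for validation.
X_tr, y_tr, X_val, y_val = X[:30], y[:30], X[30:], y[30:]

best_alpha, best_err = None, np.inf
for alpha in [0.01, 0.1, 1.0, 10.0, 100.0]:
    theta = ridge_fit(X_tr, y_tr, alpha)
    err = np.mean((X_val @ theta - y_val) ** 2)
    if err < best_err:
        best_alpha, best_err = alpha, err
```

In practice, k-fold cross-validation over an α grid is the usual refinement of this hold-out scheme.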