What is stepAIC in R?

In R, stepAIC is one of the most commonly used search methods for feature selection. It keeps adding or removing features to minimize the AIC value until it arrives at the final set of features. stepAIC does not necessarily improve model performance; rather, it is used to simplify the model without much impact on performance. AIC quantifies the amount of information lost due to this simplification. AIC stands for Akaike Information Criterion.

If we are given two models, we prefer the one with the lower AIC value. Hence AIC provides a means for model selection. AIC is only a relative measure among multiple models.
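As a minimal sketch of this comparison, the snippet below fits two candidate linear models on the built-in mtcars data and compares them with AIC() from the stats package. Using mpg as the response and wt and hp as predictors is an assumption made purely for illustration; the article does not state which response it models.

```r
# Two candidate models on the built-in mtcars data.
# Assumption: mpg is the response; the article does not name one.
small_model <- lm(mpg ~ wt, data = mtcars)
large_model <- lm(mpg ~ wt + hp, data = mtcars)

# AIC() accepts several fitted models and returns one row per model
aic_table <- AIC(small_model, large_model)
print(aic_table)

# The preferred model is the one with the lowest AIC
best <- rownames(aic_table)[which.min(aic_table$AIC)]
print(best)
```

Only the ordering of the two values matters here; neither number says anything about model quality on its own.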

AIC is similar to adjusted R-squared in that it also penalizes adding more variables to the model. The absolute value of AIC has no significance on its own. We only look at whether the AIC value increases or decreases as variables are added. When comparing multiple models, the one with the lowest AIC value is preferred.
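To make the penalty concrete, here is a small sketch that rebuilds AIC by hand as -2 * log-likelihood + 2 * k, where k is the number of estimated parameters. The model formula is again an illustrative assumption, not something taken from the article.

```r
# Assumption: mpg as response, chosen only for illustration
model <- lm(mpg ~ wt + hp, data = mtcars)

ll <- logLik(model)
k  <- attr(ll, "df")   # estimated parameters: coefficients plus error variance

# AIC = -2 * logLik + 2 * k; each extra parameter adds 2 to the score
manual_aic <- -2 * as.numeric(ll) + 2 * k

print(manual_aic)
print(AIC(model))
```

For linear models, the AIC values that stepAIC prints come from extractAIC(), which differs from AIC() by an additive constant on the same data. The ranking of models is unchanged, which is another reason the absolute value should not be interpreted.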

So let's see how stepAIC works in R. We will use the mtcars data set. First, remove the feature “x” by setting it to NULL, as it contains only the car model names, which do not carry much meaning in this case. Then remove the rows that contain missing values (NA) in any column using the na.omit function. Missing values must be handled, otherwise stepAIC will give an error. Then build the model and run stepAIC. For this we need the MASS package, which provides stepAIC, and the car package.
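The steps above can be sketched roughly as follows. Two things here are assumptions: the built-in mtcars keeps car names as row names, so the snippet recreates an “x” column to mimic a copy read from a file, and mpg is used as the response because the article does not name one. With a different response, the selected features will differ.

```r
library(MASS)  # provides stepAIC()

# Assumption: mimic a file-based copy of mtcars with car names in column "x"
cars <- data.frame(x = rownames(mtcars), mtcars, row.names = NULL)

cars$x <- NULL          # drop the car-name column
cars <- na.omit(cars)   # drop rows with missing values in any column

# Assumption: mpg as the response, all remaining columns as candidate features
full_model <- lm(mpg ~ ., data = cars)

# Stepwise search; trace = FALSE suppresses the per-step log
step_model <- stepAIC(full_model, direction = "both", trace = FALSE)
print(formula(step_model))
```

Setting trace = TRUE (the default behaviour prints progress) shows the AIC at every step, which is useful for following how features are dropped or re-added.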

The first parameter of stepAIC is the fitted model object, and the second parameter, direction, specifies which feature selection technique to use. It can take the following values:

  • “both” (for stepwise regression, both forward and backward selection);
  • “backward” (for backward selection) and
  • “forward” (for forward selection).
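A short sketch of the three directions is shown below, again assuming mpg as the response on the built-in mtcars data. One detail worth noting: forward selection needs a small starting model plus a scope argument describing which features may be added, otherwise there is nothing for it to add.

```r
library(MASS)

# Assumption: mpg as the response, chosen only for illustration
full_model <- lm(mpg ~ ., data = mtcars)   # all features
null_model <- lm(mpg ~ 1, data = mtcars)   # intercept only

# Backward: start from the full model and drop features
backward_fit <- stepAIC(full_model, direction = "backward", trace = FALSE)

# Both: start from the full model, allow drops and re-additions
both_fit <- stepAIC(full_model, direction = "both", trace = FALSE)

# Forward: start from the null model and add features up to the full formula
forward_fit <- stepAIC(
  null_model,
  direction = "forward",
  scope = list(lower = formula(null_model), upper = formula(full_model)),
  trace = FALSE
)

print(formula(backward_fit))
print(formula(both_fit))
print(formula(forward_fit))
```

The three searches are greedy and can end at different models, so it is reasonable to compare their final AIC values before settling on one.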

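At the very last step, stepAIC has produced the selected set of features {drat, wt, gear, carb}. Because the search is greedy, this set is the best one it found rather than a guaranteed optimum. By dropping features that add little information, stepAIC can also reduce multicollinearity in the model, if it exists, which I will explain in the next article.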
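As a hedged sketch of how the final feature set can be read off the result, the snippet below inspects the object returned by stepAIC. It assumes mpg as the response, so the features it prints need not match the set reported above.

```r
library(MASS)

# Assumption: mpg as the response; the article's response is not stated
full_model <- lm(mpg ~ ., data = mtcars)
step_model <- stepAIC(full_model, direction = "both", trace = FALSE)

# The "anova" component records each step taken and the AIC after it
print(step_model$anova)

# Names of the features kept in the final model
selected <- attr(terms(step_model), "term.labels")
print(selected)
```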

If you are an aspiring data scientist or an experienced professional trying to build a career in Data Science, then you must visit E-network, where we focus on high-quality interactive mock interview sessions and help you quick-start your Data Science and Machine Learning journey by preparing a learning roadmap, providing study material, suggesting the best training institutes, providing practice problems with their solutions, and much more.

Feel free to contact us for more details and discussions.
