Tutorial on Multiple Regression using R Programming on mtcars Dataset

Multiple regression is an extension of the linear regression technique. In simple linear regression, the value of one unknown variable is predicted from one known variable (Read: Tutorial on Linear Regression using R Programming). Extending this capability, if the value of the unknown variable is predicted from the values of two or more known variables, the technique is called multiple regression analysis.

Multiple Regression – Mathematical Formula

In multiple regression there is more than one predictor variable and a single response variable. The relation between the variables is shown below:

Y = a + b1x1 + b2x2 + ... + bnxn, where Y is the response, a is the intercept, b1, ..., bn are the coefficients and x1, ..., xn are the predictor variables.
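
To make the formula concrete, here is a minimal sketch on simulated data. The names x1, x2, y, sim and fit, and the "true" coefficient values, are illustrative assumptions and are not part of the mtcars case studies. The sketch builds a response from known values of a, b1 and b2 plus a little noise, then checks that lm() recovers them approximately.

```r
# Simulated example: the "true" coefficients below are made up for illustration
set.seed(42)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)

a  <- 2      # intercept
b1 <- 3      # coefficient of x1
b2 <- -1.5   # coefficient of x2

# Y = a + b1*x1 + b2*x2 + small random noise
y <- a + b1 * x1 + b2 * x2 + rnorm(n, sd = 0.1)

sim <- data.frame(y = y, x1 = x1, x2 = x2)
fit <- lm(y ~ x1 + x2, data = sim)

# The estimated coefficients should be close to 2, 3 and -1.5
print(coef(fit))
```

Because the noise is small, the estimates land near the values used to generate the data. With real data the true coefficients are unknown and lm() gives the least-squares estimates.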

Multiple Regression using R Programming

For this tutorial on Multiple Regression Analysis using R Programming, I am going to use the mtcars dataset, and we will see how the model is built for two and three predictor variables.

Case Study 1: Establishing a relationship between “mpg” as the response variable and “disp” and “hp” as predictor variables.

Step 1: Load the required data

# Create a new data frame with all rows and only the required columns
data <- mtcars[, c("mpg", "disp", "hp")]
head(data)

[Image: output of head(data)]
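
Before fitting a model it can help to take a quick look at the selected columns. The following is an optional sketch, not one of the original steps. It uses only base R functions to show the size of the data, the column types, and how strongly the variables are correlated with one another.

```r
# Same subset as in Step 1
data <- mtcars[, c("mpg", "disp", "hp")]

dim(data)            # number of rows and columns
str(data)            # column types
summary(data)        # range and quartiles of each variable
round(cor(data), 2)  # pairwise correlations between mpg, disp and hp
```

A strong correlation between two predictors is worth noting, because it makes their individual coefficients harder to interpret.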

Step 2: Build the model using the lm() function

model <- lm(mpg~disp+hp, data=data)
summary(model)

[Image: output of summary(model)]

As the summary output shows, the intercept, which is the value of ‘a’ in the equation, is 30.735904, and the coefficients of “disp” and “hp” are -0.030346 and -0.024840 respectively. Therefore the regression equation is:

mpg = 30.735904 + (-0.030346)disp + (-0.024840)hp

Using the above equation we can predict the value of mpg based on disp and hp.
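
The numbers in the equation do not have to be copied by hand from the printed summary. As a minimal sketch, the standard accessors coef(), summary() and confint() return them as R objects. The variable names coefs and s are arbitrary choices for this example.

```r
data  <- mtcars[, c("mpg", "disp", "hp")]
model <- lm(mpg ~ disp + hp, data = data)

# Named vector with entries "(Intercept)", "disp" and "hp"
coefs <- coef(model)
print(round(coefs, 6))

# R-squared and adjusted R-squared from the summary object
s <- summary(model)
print(s$r.squared)
print(s$adj.r.squared)

# 95% confidence intervals for the coefficients
print(confint(model))
```

Working with coef(model) keeps the full precision of the estimates, whereas the printed summary shows rounded values.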

Step 3: Predicting the output

predict(model, newdata = data.frame(disp=140, hp=80))

The predicted mileage is 24.50022 mpg.

If you enter these values of disp and hp into the equation derived above, you will get the same output, up to rounding of the coefficients.
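
The check described above can be sketched in a few lines. The code evaluates the regression equation by hand with the full-precision coefficients and compares the result with predict(). The names b, manual and auto are chosen for this example only.

```r
data  <- mtcars[, c("mpg", "disp", "hp")]
model <- lm(mpg ~ disp + hp, data = data)

b <- coef(model)

# mpg = a + b1 * disp + b2 * hp, evaluated at disp = 140 and hp = 80
manual <- b[["(Intercept)"]] + b[["disp"]] * 140 + b[["hp"]] * 80

# Same prediction through predict()
auto <- predict(model, newdata = data.frame(disp = 140, hp = 80))

print(manual)
print(unname(auto))

# all.equal() compares the two numbers with a small numerical tolerance
print(isTRUE(all.equal(manual, unname(auto))))
```

If the rounded coefficients from the printed equation are used instead, the result can differ from predict() in the last decimal places.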

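Plotting the Regression (calling plot() on a fitted lm model draws diagnostic plots, such as residuals against fitted values):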

plot(model)

Output:
[Image: diagnostic plots produced by plot(model)]
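
In an interactive session, plot(model) shows its plots one after another. As a small sketch, assuming you would rather see them together, par(mfrow = c(2, 2)) arranges the four default plots in a single 2 x 2 grid. The names old_par and res are illustrative.

```r
data  <- mtcars[, c("mpg", "disp", "hp")]
model <- lm(mpg ~ disp + hp, data = data)

old_par <- par(mfrow = c(2, 2))  # 2 x 2 grid of panels
plot(model)                      # the four default diagnostic plots
par(old_par)                     # restore the previous layout

# Residuals can also be inspected numerically
res <- residuals(model)
print(summary(res))
```

The residuals-against-fitted panel is a useful first check: a clear curve or funnel shape there suggests that a straight-line model in disp and hp may not describe the data well.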

Case Study 2: Establishing a relationship between “mpg” as the response variable and “disp”, “hp” and “wt” as predictor variables.

model1 <- lm(mpg~disp+hp+wt, data=mtcars)
summary(model1)

[Image: output of summary(model1)]

The equation will be:
mpg = 37.105505 + (-0.000937)disp + (-0.031157)hp + (-3.800891)wt

predict(model1, newdata = data.frame(disp=160, hp=100, wt=2.5))

The predicted mileage is 24.3377 mpg.
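
A single predicted number says nothing about how uncertain it is. As a hedged extension of the step above, predict() for lm models accepts an interval argument. The sketch below asks for a 95% prediction interval around the same point. The names new_car and pred are chosen for illustration.

```r
model1  <- lm(mpg ~ disp + hp + wt, data = mtcars)
new_car <- data.frame(disp = 160, hp = 100, wt = 2.5)

# Point prediction with a 95% prediction interval (columns: fit, lwr, upr)
pred <- predict(model1, newdata = new_car,
                interval = "prediction", level = 0.95)
print(pred)
```

The fit column is the same value that predict() returns without the interval argument. The lwr and upr columns give the range in which a new car with these characteristics would be expected to fall, under the usual linear model assumptions.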

How to Compare Different Models in Regression Analysis

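To compare nested models, use the anova() function. Here the first model is nested in the second because it uses a subset of the second model's predictors.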

anova(model, model1)

[Image: output of anova(model, model1)]
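
The anova() table can also be read programmatically and set beside other common comparison measures. The following is a minimal sketch that is not part of the original steps. It pulls the p-value of the F-test out of the table and prints the adjusted R-squared and AIC of both models. The names cmp, p_value and adj_r2 are illustrative.

```r
model  <- lm(mpg ~ disp + hp, data = mtcars)
model1 <- lm(mpg ~ disp + hp + wt, data = mtcars)

cmp <- anova(model, model1)
p_value <- cmp[["Pr(>F)"]][2]   # p-value of the F-test for adding wt
print(p_value)

# Adjusted R-squared penalises extra predictors
adj_r2 <- c(model  = summary(model)$adj.r.squared,
            model1 = summary(model1)$adj.r.squared)
print(adj_r2)

# AIC: lower values indicate a better trade-off between fit and complexity
print(AIC(model, model1))
```

A small p-value suggests that adding wt improves the fit by more than chance alone would explain. Adjusted R-squared and AIC give a second opinion that also takes the number of predictors into account.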

A complete tutorial on the anova() function will be published in the coming tutorials on R Programming for Data Science.

Try the code on the dataset and post a comment below if you have any questions.
