Multiple regression is an extension of the linear regression technique. In simple linear regression, the value of one unknown variable is predicted from one known variable (Read: Tutorial on Linear Regression using R Programming). When the value of the unknown variable is instead predicted from the values of two or more known variables, the technique is called multiple regression analysis.
Multiple Regression – Mathematical Formula
In multiple regression there is one response variable and more than one predictor variable. The relation between the variables is shown below:
Y = a + b1x1 + b2x2 + ... + bnxn, where Y is the response, a is the intercept, b1, ..., bn are the coefficients, and x1, ..., xn are the predictor variables.
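To make the formula concrete, here is a minimal sketch that simulates data from a known equation and checks that lm() recovers the coefficients. The values 2, 3 and -1.5, the sample size, and the names sim, fit and est are illustrative assumptions, not part of the mtcars examples in this tutorial.

```r
# Simulate data from a known equation: Y = 2 + 3*x1 - 1.5*x2 + noise
# (the coefficients 2, 3 and -1.5 are arbitrary illustrative values)
set.seed(42)
n  <- 200
x1 <- runif(n, 0, 10)
x2 <- runif(n, 0, 10)
y  <- 2 + 3 * x1 - 1.5 * x2 + rnorm(n, sd = 0.5)

sim <- data.frame(y = y, x1 = x1, x2 = x2)

# Fit the multiple regression; lm() estimates a, b1 and b2
fit <- lm(y ~ x1 + x2, data = sim)

# Named vector of estimates: (Intercept), x1, x2
est <- coef(fit)
print(round(est, 2))
```

The estimates should land close to the values used to generate the data, which is what "a" and "b1 ... bn" in the formula stand for.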
Multiple Regression using R Programming
For this tutorial on Multiple Regression Analysis using R Programming, I am going to use the mtcars dataset, and we will see how the model is built for two and three predictor variables.
Case Study 1: Establishing a relationship between “mpg” as the response variable and “disp” and “hp” as predictor variables.
Step1: Load the required data
# Create a new data frame with all rows and only the required columns
data <- mtcars[, c("mpg", "disp", "hp")]
head(data)
Step2: Build the model using the lm() function
model <- lm(mpg ~ disp + hp, data = data)
summary(model)
As the summary output shows, the intercept (the value of ‘a’ in the equation) is 30.735904, and the coefficients of “disp” and “hp” are -0.030346 and -0.024840 respectively. Therefore the regression equation will be:
mpg = 30.735904 + (-0.030346)disp + (-0.024840)hp
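The numbers in this equation can also be pulled out of the fitted model programmatically instead of being read off the printed summary. The sketch below refits the same model so that it runs on its own; the names s, coef_table, r2 and adj_r2 are illustrative choices.

```r
# Refit the Case Study 1 model so this snippet is self-contained
data  <- mtcars[, c("mpg", "disp", "hp")]
model <- lm(mpg ~ disp + hp, data = data)

s <- summary(model)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- coef(s)
print(round(coef_table, 6))

# Goodness-of-fit values stored in the summary object
r2     <- s$r.squared
adj_r2 <- s$adj.r.squared
print(c(r.squared = r2, adj.r.squared = adj_r2))
```

The "Estimate" column holds the intercept and the coefficients of disp and hp used in the equation.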
Using the above equation we can predict the value of mpg based on disp and hp.
Step3: Predicting the output.
predict(model, newdata = data.frame(disp=140, hp=80))
The predicted mileage is 24.50022.
If you enter these values of disp and hp into the equation derived above, you will get the same output, up to rounding of the coefficients.
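This check can be sketched in a few lines: compute the prediction by hand from the fitted coefficients and compare it with predict(). The variable names b, manual, auto and rounded are illustrative; the model is refit here so the snippet runs on its own.

```r
# Refit the Case Study 1 model so this snippet is self-contained
data  <- mtcars[, c("mpg", "disp", "hp")]
model <- lm(mpg ~ disp + hp, data = data)

b <- coef(model)

# Plug disp = 140 and hp = 80 into mpg = a + b1*disp + b2*hp
manual <- unname(b["(Intercept)"] + b["disp"] * 140 + b["hp"] * 80)

# Same prediction via predict()
auto <- unname(predict(model, newdata = data.frame(disp = 140, hp = 80)))

# Same calculation with the rounded coefficients from the printed equation
rounded <- 30.735904 + (-0.030346) * 140 + (-0.024840) * 80

print(c(manual = manual, predict = auto, rounded = rounded))
```

The first two values agree exactly; the third differs only slightly because the printed coefficients are rounded to six decimal places.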
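Plotting the Regression Diagnostics:
plot(model)
Output: calling plot() on an lm model draws four diagnostic plots (Residuals vs Fitted, a normal Q-Q plot of the residuals, Scale-Location, and Residuals vs Leverage).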
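By default the diagnostic plots are drawn one after another. A small sketch, assuming you would rather see them together, is to arrange them in a 2 x 2 grid with par(); the name old_par is an illustrative choice.

```r
# Refit the Case Study 1 model so this snippet is self-contained
model <- lm(mpg ~ disp + hp, data = mtcars)

# Arrange the plotting area as a 2 x 2 grid and remember the old settings
old_par <- par(mfrow = c(2, 2))

# Draw the default diagnostic plots for the fitted model
plot(model)

# Restore the previous plotting layout
par(old_par)
```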
Case Study 2: Establishing a relationship between “mpg” as the response variable and “disp”, “hp” and “wt” as predictor variables.
model1 <- lm(mpg ~ disp + hp + wt, data = mtcars)
summary(model1)
The equation will be:
mpg = 37.105505 + (-0.000937)disp + (-0.031157)hp + (-3.800891)wt
predict(model1, newdata = data.frame(disp=160, hp=100, wt=2.5))
The predicted mileage is 24.3377.
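predict() also accepts several rows of new data at once and can return an interval around each prediction. The sketch below is one way to do that; the second car (disp = 300, hp = 180, wt = 3.5) is a made-up illustrative input, and new_cars and pred are hypothetical names.

```r
# Refit the Case Study 2 model so this snippet is self-contained
model1 <- lm(mpg ~ disp + hp + wt, data = mtcars)

# Two cars to predict for; the second row is an invented example
new_cars <- data.frame(
  disp = c(160, 300),
  hp   = c(100, 180),
  wt   = c(2.5, 3.5)
)

# interval = "prediction" adds lower and upper bounds to each fitted value
pred <- predict(model1, newdata = new_cars, interval = "prediction")
print(pred)
```

The result is a matrix with columns fit, lwr and upr; the first row's fit is the same prediction as the single-car example.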
How to Compare Different Models in Regression Analysis
To compare models, use the anova() function.
anova(model, model1)
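As a rough sketch of how the comparison can be read: for two nested models, anova() reports an F test of whether the extra predictor (here wt) improves the fit. The snippet below refits both models on mtcars so it runs on its own; cmp, p_value and adj_r2 are illustrative names, and comparing adjusted R-squared is an additional check, not something the anova table itself reports.

```r
# Refit both models so this snippet is self-contained
model  <- lm(mpg ~ disp + hp, data = mtcars)
model1 <- lm(mpg ~ disp + hp + wt, data = mtcars)

# F test comparing the smaller model against the larger one
cmp <- anova(model, model1)
print(cmp)

# p-value for adding wt (second row of the table)
p_value <- cmp[["Pr(>F)"]][2]

# Adjusted R-squared of each model, for a side-by-side look
adj_r2 <- c(
  model  = summary(model)$adj.r.squared,
  model1 = summary(model1)$adj.r.squared
)
print(p_value)
print(round(adj_r2, 4))
```

A small p-value suggests that the model with wt fits the data better than the model without it.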
A complete tutorial on the anova() function will be published in the coming tutorials on R Programming for Data Science.
Try the code on the dataset and post a comment below if you have any questions.