# Multiple Linear Regression Analysis For Market Price Of Homes

December 28, 2023

## Scatter Plot

This paper examines how the Market price ($Y$) of houses responds to changes in the Price index ($X_1$), Annual % change ($X_2$), Total number of square meters ($X_3$), and Age of house in years ($X_4$) by building a regression model. $Y$ is the dependent variable and the other four variables are the predictor variables. The multiple linear regression model is fitted using the Data Analysis ToolPak of MS Excel (Slezák et al., 2014). There are 15 observations, one for each financial year from 2002-03 to 2016-17. The regression output is discussed in detail below, covering model estimation, confidence intervals, and interpretation of the results.


Analysis for Sydney City

• Graphical representations

The scatter plot is shown below:

The dependent variable, Market price, is plotted on the vertical axis and each predictor variable is plotted on the horizontal axis. The scatter plot shows a positive relationship between the Market price and both the Sydney price index and the Total number of square meters, although the relationship is weaker for the latter. The Annual % change and the Age of house show a negative association with the Market price (Tang & Zhang, 2013).

• Description of the regression model:

Let the multiple linear regression model be defined as,


$$Y = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + b_4 X_4 + e$$

$b_0$ = the y-intercept

$b_1$ = the partial regression coefficient of $X_1$

$b_2$ = the partial regression coefficient of $X_2$

$b_3$ = the partial regression coefficient of $X_3$

$b_4$ = the partial regression coefficient of $X_4$

$e$ = the residual of estimation
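The paper fits this model with Excel, and the underlying observations are not reproduced here. As a rough illustration of what the fitting step does, the sketch below estimates the coefficients by ordinary least squares through the normal equations. The data rows, the coefficients in `true_b`, and the helper names `solve_linear_system` and `fit_ols` are all hypothetical and are not taken from the study.

```python
# Minimal ordinary least squares via the normal equations (X'X) b = X'y.
# All data below are hypothetical and exist only to exercise the code.

def solve_linear_system(a, rhs):
    """Solve a square system by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [row[:] + [rhs[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def fit_ols(predictors, y):
    """Return [b0, b1, ..., bk] given rows of predictor values and responses y."""
    design = [[1.0] + list(row) for row in predictors]  # leading 1 = intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical observations, one (X1, X2, X3, X4) tuple per row.
rows = [
    (90, 2.0, 250, 5),
    (100, 8.0, 520, 30),
    (110, 3.5, 300, 12),
    (120, 6.5, 600, 2),
    (130, 1.0, 450, 25),
    (140, 9.0, 200, 18),
    (150, 5.0, 380, 8),
]
true_b = [50.0, 2.0, -5.0, 0.5, -2.5]  # made-up coefficients, no noise added
y = [true_b[0] + sum(b * x for b, x in zip(true_b[1:], row)) for row in rows]

coefficients = fit_ols(rows, y)
print([round(b, 3) for b in coefficients])
```

Because the hypothetical responses are generated without noise, the fit should recover the made-up coefficients up to rounding error. With real data the residual term $e$ would be non-zero.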

The output of the regression analysis is shown in the table below:

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.88916481 |
| R Square | 0.79061406 |
| Adjusted R Square | 0.70685968 |
| Standard Error | 43.8878261 |
| Observations | 15 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 4 | 72728.5872 | 18182.14679 | 9.43967451 | 0.001993481 |
| Residual | 10 | 19261.4128 | 1926.141284 | | |
| Total | 14 | 91990 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 548.978108 | 81.13153739 | 6.766519231 | 4.94032E-05 | 368.2057774 | 729.7504386 |
| Sydney price Index | 1.963493894 | 0.583205471 | 3.366727492 | 0.007160758 | 0.664031125 | 3.262956664 |
| Annual % change | -5.622204236 | 3.240109357 | -1.735189655 | 0.113361729 | -12.84161778 | 1.597209306 |
| Total number of square meters | 0.519145629 | 0.3239088 | 1.60275247 | 0.140071458 | -0.202568152 | 1.240859409 |
| Age of house (years) | -2.48786597 | 1.129750872 | -2.2021368 | 0.052251738 | -5.005107781 | 0.029375841 |

Table 1

(Source: As created by the Author)
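The "t Stat" column in Table 1 is each coefficient divided by its standard error. The short sketch below recomputes that column from the reported coefficients and standard errors as a consistency check. The dictionary name `sydney` is only an illustrative choice.

```python
# Coefficients and standard errors copied from Table 1 (Sydney).
sydney = {
    "Intercept": (548.978108, 81.13153739),
    "Sydney price index": (1.963493894, 0.583205471),
    "Annual % change": (-5.622204236, 3.240109357),
    "Total number of square meters": (0.519145629, 0.3239088),
    "Age of house (years)": (-2.48786597, 1.129750872),
}

# t statistic = coefficient / standard error
t_stats = {name: coef / se for name, (coef, se) in sydney.items()}

for name, t in t_stats.items():
    print(f"{name}: t = {t:.4f}")
```

The recomputed values should agree with the table to the precision shown, for example about 3.3667 for the Sydney price index.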


Model

The estimated regression equation, obtained from Table 1, is

$$\hat{Y} = 548.978 + 1.963 X_1 - 5.622 X_2 + 0.519 X_3 - 2.488 X_4$$

The equation describes a linear relationship between the Market price and the four predictor variables (Cameron & Trivedi, 2013). The Annual % change and the Age of house have a negative association with the Market price, whereas the Sydney price index and the Total number of square meters have a positive association.

• Interpretation of the Coefficients

Here, $b_0 = 548.978$, $b_1 = 1.963$, $b_2 = -5.622$, $b_3 = 0.519$, and $b_4 = -2.488$.

The y-intercept indicates that the estimated Market price is 548.978 when all the independent variables are zero. The slope coefficient $b_1$ states that a one-unit increase in the Sydney price index increases the Market price by 1.963 units, keeping the other variables constant. Similarly, $b_2$ means that a one-unit increase in the Annual % change decreases the Market price by 5.622 units. The value of $b_3$ indicates a 0.519-unit increase in the Market price for each one-unit increase in $X_3$, keeping $X_1$, $X_2$, and $X_4$ constant. A one-unit increase in $X_4$ decreases $Y$ by 2.488 units if the other variables are held constant (Hinton, 2014).
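To make the "keeping other variables constant" reading concrete, the sketch below evaluates the estimated Sydney equation (coefficients rounded to three decimals) at one set of predictor values and then raises only the price index by one unit. The input values and the function name `predict_sydney` are hypothetical and chosen purely for illustration.

```python
def predict_sydney(x1, x2, x3, x4):
    """Estimated Market price from the rounded Sydney coefficients."""
    return 548.978 + 1.963 * x1 - 5.622 * x2 + 0.519 * x3 - 2.488 * x4


# Hypothetical predictor values (not observations from the study).
base = predict_sydney(x1=100, x2=5, x3=400, x4=10)
bumped = predict_sydney(x1=101, x2=5, x3=400, x4=10)  # only X1 changes

print(round(base, 3))           # estimate at the hypothetical inputs
print(round(bumped - base, 3))  # change caused by +1 in X1 alone
```

The difference between the two estimates equals the coefficient of $X_1$, which is exactly what the partial regression coefficient measures.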

• R-square

The coefficient of determination, $R^2$, is 0.790614058. A higher $R^2$ indicates that the regression model fits the data more closely, so this value suggests a good fit.

Here, 79.06% of the variability in the Market price is explained by the four independent variables. The adjusted $R^2$ reported in Table 1, which accounts for the number of predictors, is 0.7069.
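The $R^2$ and F values can be recovered from the ANOVA rows of Table 1. The sketch below recomputes them from the reported sums of squares, assuming the usual definitions $R^2 = SS_{regression} / SS_{total}$ and $F = MS_{regression} / MS_{residual}$. The variable names are illustrative.

```python
# Sums of squares copied from the ANOVA section of Table 1 (Sydney).
ss_regression = 72728.5872
ss_residual = 19261.4128
n_observations = 15
n_predictors = 4

ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total

# Mean squares divide each sum of squares by its degrees of freedom.
ms_regression = ss_regression / n_predictors
ms_residual = ss_residual / (n_observations - n_predictors - 1)
f_statistic = ms_regression / ms_residual

print(f"R^2 = {r_squared:.6f}")
print(f"F   = {f_statistic:.4f}")
```

Both results should match the Excel output (about 0.7906 and 9.4397).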

• Confidence intervals (CI)

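The 95% CI for the regression coefficient of each independent variable can be evaluated using the formula $b_i \pm t_{\alpha/2,\, n-k-1} \times SE(b_i)$, $i$ = 1 to 4, where $SE(b_i)$ is the standard error of $b_i$ and $t_{\alpha/2,\, n-k-1}$ is the critical value of the t-statistic at the 5% significance level with $n - k - 1 = 10$ degrees of freedom ($n = 15$ observations, $k = 4$ predictors) (Zou, 2013). The values are already reported in the Excel output.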

## Regression Model

95% CI for the coefficient of $X_1$ = (0.664031125, 3.262956664): the researcher is 95% confident that the true coefficient of the Sydney price index lies between 0.664031125 and 3.262956664. Because the interval excludes zero, this coefficient is statistically significant at the 5% level.

95% CI for the coefficient of $X_2$ = (-12.84161778, 1.597209306): with 95% confidence, the coefficient of the Annual % change lies between the lower bound -12.84161778 and the upper bound 1.597209306.

95% CI for the coefficient of $X_3$ = (-0.202568152, 1.240859409): the coefficient of $X_3$ lies within this interval with 95% confidence, with lower bound -0.202568 and upper bound 1.240859.

95% CI for the coefficient of $X_4$ = (-5.005107781, 0.029375841): as with the other three, the coefficient of $X_4$ lies within the calculated interval with 95% confidence. The intervals for $X_2$, $X_3$, and $X_4$ all contain zero, so these coefficients are not statistically significant at the 5% level.
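The Sydney intervals can be reproduced from the coefficients and standard errors in Table 1. The sketch below applies coefficient ± critical value × standard error. The critical value $t_{0.975,\,10} \approx 2.228139$ is hard-coded as an assumption rather than computed, to keep the example free of external libraries, and the names `T_CRIT` and `confidence_interval` are illustrative.

```python
# Two-sided 95% critical value of Student's t with 10 degrees of freedom.
# Hard-coded (assumed) rather than computed from a statistics library.
T_CRIT = 2.228139


def confidence_interval(coef, se, t_crit=T_CRIT):
    """Return (lower, upper) bounds: coef -/+ t_crit * se."""
    margin = t_crit * se
    return coef - margin, coef + margin


# (coefficient, standard error) pairs copied from Table 1 (Sydney).
sydney_slopes = {
    "Sydney price index": (1.963493894, 0.583205471),
    "Annual % change": (-5.622204236, 3.240109357),
    "Total number of square meters": (0.519145629, 0.3239088),
    "Age of house (years)": (-2.48786597, 1.129750872),
}

intervals = {
    name: confidence_interval(coef, se)
    for name, (coef, se) in sydney_slopes.items()
}

for name, (lower, upper) in intervals.items():
    print(f"{name}: ({lower:.6f}, {upper:.6f})")
```

The bounds should agree with the "Lower 95%" and "Upper 95%" columns of Table 1 to about six decimal places.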

The Market price is then re-estimated with a simple linear regression model that uses only the Total number of square meters ($X_3$) as the predictor.

This regression has $R^2$ = 0.098, which implies that the model is not a good fit. The estimated regression equation is

$$\hat{Y} = 659.143 + 0.5636 X_3$$

The $R^2$ of the original multiple regression model (0.790614058) is higher than that of the re-estimated model (0.098125419). Thus, the multiple linear regression model is a better fit for the Market price than the simple linear regression model: the four predictor variables together explain a greater percentage (79.06%) of the variability in the dependent variable than the "Total number of square meters" alone (9.81%).

If Square meters = 400, then the estimated Market price = 659.143 + (0.5636 × 400) = 884.583, that is, \$884,583.
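The 400-square-meter prediction can be written as a small function. The sketch below assumes, as the dollar conversion in the text implies, that Market price is recorded in thousands of dollars. The function name `predict_price` is illustrative.

```python
def predict_price(square_meters, intercept=659.143, slope=0.5636):
    """Simple-regression estimate of the Sydney Market price."""
    return intercept + slope * square_meters


price_thousands = predict_price(400)
# Assumption: the Market price variable is measured in thousands of dollars.
price_dollars = round(price_thousands * 1000)

print(round(price_thousands, 3))
print(price_dollars)
```

Since this model explains under 10% of the variability in Market price, the point estimate should be read with considerable caution.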

The output of the regression analysis for the Market price of Brisbane is shown in the table below:

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.847362 |
| R Square | 0.7180224 |
| Adjusted R Square | 0.6052313 |
| Standard Error | 12.573407 |
| Observations | 15 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 4 | 4025.588 | 1006.397 | 6.365952 | 0.008182698 |
| Residual | 10 | 1580.906 | 158.0906 | | |
| Total | 14 | 5606.493 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 89.871245 | 16.91237 | 5.313936 | 0.000341 | 52.18814276 | 127.5543 |
| Brisbane Price Index | -0.5082788 | 0.47905 | -1.06101 | 0.313636 | -1.575668349 | 0.559111 |
| Annual % change | 1.4620069 | 1.016153 | 1.438766 | 0.180771 | -0.802123664 | 3.726137 |
| Total number of square meters | 0.056465 | 0.094729 | 0.59607 | 0.564376 | -0.154603966 | 0.267534 |
| Age of house (years) | -0.7797538 | 0.348462 | -2.2377 | 0.049196 | -1.556176484 | -0.00333 |

Table 2

(Source: As created by the Author)

The estimated regression equation, obtained from the above table, is

$$\hat{Y} = 89.871 - 0.508 X_1 + 1.462 X_2 + 0.056 X_3 - 0.780 X_4$$

The equation shows a negative linear relationship between the Market price and both the Brisbane price index and the Age of house. The relationship is positive for the Total number of square meters and the Annual % change (Cohen, West & Aiken, 2014).

The y-intercept indicates that the estimated Market price is 89.871 when all the independent variables are zero. The Market price increases by 1.462 units for a one-unit increase in $X_2$, keeping the other variables constant. Likewise, it increases by 0.056 units for a one-unit increase in $X_3$. A one-unit increase in the Brisbane price index decreases the Market price by 0.508 units. Because the regression coefficient of $X_4$ is negative, a one-unit increase in $X_4$ corresponds to a 0.78-unit decrease in $Y$.

The coefficient of determination, $R^2$, is 0.718, which suggests that the regression model is a good fit for the given predictor variables. Approximately 71.8% of the variation in the Market price is explained by the four independent variables.

95% CI for the coefficient of $X_1$ = (-1.575668349, 0.559110736): with 95% confidence, the true coefficient of the Brisbane price index lies between -1.575668349 and 0.559110736.

95% CI for the coefficient of $X_2$ = (-0.802123664, 3.726137411): with 95% confidence, the coefficient of the Annual % change lies between the lower bound -0.802123664 and the upper bound 3.726137411.

95% CI for the coefficient of $X_3$ = (-0.154603966, 0.267533947): the coefficient of $X_3$ lies within this interval with 95% confidence.

## Estimated Regression Equation

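95% CI for the coefficient of $X_4$ = (-1.556176484, -0.003331143): these are the lower and upper bounds within which the coefficient of $X_4$ lies with 95% confidence. Because the interval lies entirely below zero, the Age of house coefficient is statistically significant at the 5% level.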
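A quick way to read the Brisbane intervals is to check which of them exclude zero, since a 95% interval that excludes zero corresponds to a coefficient that is significant at the 5% level. The sketch below applies that check to the reported bounds. The names `brisbane_ci` and `significant` are illustrative.

```python
# 95% confidence bounds for the Brisbane slope coefficients, as reported.
brisbane_ci = {
    "Brisbane price index": (-1.575668349, 0.559110736),
    "Annual % change": (-0.802123664, 3.726137411),
    "Total number of square meters": (-0.154603966, 0.267533947),
    "Age of house (years)": (-1.556176484, -0.003331143),
}

# An interval excludes zero when both bounds share the same sign.
significant = [
    name for name, (lower, upper) in brisbane_ci.items()
    if lower > 0 or upper < 0
]

print(significant)
```

Only the Age of house interval excludes zero, which agrees with its P-value of 0.049196 in Table 2.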

The simple linear regression model that uses only the Total number of square meters ($X_3$) as the predictor has an $R^2$ of 0.097, which is small and indicates a very poor fit. This model explains only 9.73% of the variation in the Market price.

The regression model is defined as

$$\hat{Y} = 65.988 + 0.139 X_3$$

The $R^2$ of the original multiple regression model (0.718022376) is higher than that of the re-estimated model (0.097334455). Thus, the multiple linear regression model is a better fit for the Market price than the simple linear regression model.

If Square meters = 400, then the estimated Market price = 65.988 + (0.139 × 400) = 121.588, or \$121,588.

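The output of the regression analysis for the Market price of Melbourne is shown in the table below:

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.801295 |
| R Square | 0.642074 |
| Adjusted R Square | 0.498904 |
| Standard Error | 19.79055 |
| Observations | 15 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 4 | 7025.999 | 1756.5 | 4.484691 | 0.0247334 |
| Residual | 10 | 3916.658 | 391.6658 | | |
| Total | 14 | 10942.66 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 82.70826 | 26.59056 | 3.110437 | 0.011052 | 23.460804 | 141.9557 |
| Melbourne Price Index | -0.67601 | 0.927016 | -0.72923 | 0.482589 | -2.741533 | 1.389509 |
| Annual % change | 1.851263 | 1.501679 | 1.232796 | 0.245847 | -1.494686 | 5.197213 |
| Total number of square meters | 0.116315 | 0.145828 | 0.79762 | 0.443618 | -0.20861 | 0.44124 |
| Age of house (years) | -1.24947 | 0.451735 | -2.76594 | 0.019926 | -2.255999 | -0.24294 |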

Table 3

(Source: As created by the Author)

The estimated regression equation, obtained from the above table, is

$$\hat{Y} = 82.708 - 0.676 X_1 + 1.851 X_2 + 0.116 X_3 - 1.249 X_4$$

The equation shows a negative linear relationship between the Market price and both the Melbourne price index and the Age of house. The relationship is positive for the Total number of square meters and the Annual % change.

Here, $b_0 = 82.708$, $b_1 = -0.676$, $b_2 = 1.851$, $b_3 = 0.116$, and $b_4 = -1.249$.

The y-intercept indicates that the estimated Market price is 82.708 when all the $X_i$ are zero. The Market price decreases by 0.676 units for a one-unit increase in $X_1$, keeping the other variables constant. A one-unit increase in $X_4$ decreases the Market price by 1.249 units. A one-unit increase in $X_2$ increases the Market price by 1.851 units, and a one-unit increase in $X_3$ increases it by 0.116 units (Bates et al., 2014).

The coefficient of determination, $R^2$, is 0.642074, which suggests a moderately good fit of the regression model for the given predictor variables. Here, 64.21% of the variation in the Market price is explained by the four independent variables (Nimon & Oswald, 2013).
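With only 15 observations and four predictors, the adjusted $R^2$ is a useful companion to $R^2$ because it penalizes the number of predictors. The sketch below recomputes it for the three cities from the reported $R^2$ values, using the standard formula $1 - (1 - R^2)(n - 1)/(n - k - 1)$. The function name `adjusted_r_squared` is illustrative.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    penalty = (n_observations - 1) / (n_observations - n_predictors - 1)
    return 1 - (1 - r_squared) * penalty


# R Square values as reported in the three regression outputs.
reported_r_squared = {
    "Sydney": 0.79061406,
    "Brisbane": 0.7180224,
    "Melbourne": 0.642074,
}

adjusted = {
    city: adjusted_r_squared(r2, n_observations=15, n_predictors=4)
    for city, r2 in reported_r_squared.items()
}

for city, value in adjusted.items():
    print(f"{city}: adjusted R^2 = {value:.4f}")
```

The results should match the "Adjusted R Square" rows of the Excel outputs. For Melbourne the adjusted value is just under 0.5, which is worth keeping in mind when judging how good the fit is.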

95% CI for the coefficient of $X_1$ = (-2.741533069, 1.389508577): with 95% confidence, the true coefficient of the Melbourne price index lies between -2.741533069 and 1.389508577.

95% CI for the coefficient of $X_2$ = (-1.494686069, 5.19721276): with 95% confidence, the coefficient of the Annual % change lies between the lower bound -1.494686069 and the upper bound 5.19721276.

95% CI for the coefficient of $X_3$ = (-0.208609648, 0.441240343): the coefficient of $X_3$ lies within this interval with 95% confidence.

95% CI for the coefficient of $X_4$ = (-2.255998646, -0.242941926): these are the lower and upper bounds within which the coefficient of $X_4$ lies with 95% confidence. Because the interval lies entirely below zero, the Age of house coefficient is statistically significant at the 5% level.

The simple linear regression model that uses only the Total number of square meters ($X_3$) as the predictor has an $R^2$ of 0.108.

The regression model is defined as

$$\hat{Y} = 50.092 + 0.204 X_3$$

The $R^2$ of the original multiple regression model (0.642074) is higher than that of the re-estimated model (0.108085). Thus, the multiple linear regression model is a better fit for the Market price than the simple linear regression model.

If Square meters = 400, then the estimated Market price = 50.09162 + (0.204012 × 400) = 131.69642, or \$131,696.

Conclusion

From the multiple linear regression analyses of the Market price of houses in three Australian cities (Sydney, Brisbane, and Melbourne), it can be concluded that the four chosen independent variables provide a good fit for the dependent variable Market price in all three cities, as every coefficient of determination exceeds 0.5. In contrast, the re-estimated simple linear regression model, whose only predictor is the Total number of square meters, explains little of the variability in the Market price. Therefore, the multiple linear regression model is the recommended model for predicting the Market price.
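The comparison behind this conclusion can be summarized in a few lines. The sketch below collects the reported $R^2$ values for the multiple and simple models in each city and prints the gap between them. The dictionary name `r_squared_by_city` is illustrative.

```python
# (multiple-regression R^2, simple-regression R^2) as reported for each city.
r_squared_by_city = {
    "Sydney": (0.790614058, 0.098125419),
    "Brisbane": (0.718022376, 0.097334455),
    "Melbourne": (0.642074, 0.108085),
}

gains = {}
for city, (multiple, simple) in r_squared_by_city.items():
    gains[city] = multiple - simple
    print(f"{city}: multiple = {multiple:.3f}, simple = {simple:.3f}, "
          f"gain = {gains[city]:.3f}")
```

In every city the four-predictor model explains well over half of the variability in Market price, while the square-meters-only model explains roughly a tenth.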

References

Bates, D., Maechler, M., Bolker, B., & Walker, S. (2014). lme4: Linear mixed-effects models using Eigen and S4. R package version, 1(7), 1-23.

Cameron, A. C., & Trivedi, P. K. (2013). Regression analysis of count data (Vol. 53). Cambridge University Press.

Cohen, P., West, S. G., & Aiken, L. S. (2014). Applied multiple regression/correlation analysis for the behavioral sciences. Psychology Press.

Hinton, P. R. (2014). Statistics explained. Routledge.

Nimon, K. F., & Oswald, F. L. (2013). Understanding the results of multiple linear regression: Beyond standardized regression coefficients. Organizational Research Methods, 16(4), 650-674.

Slezák, P., Bokes, P., Pavol, N., & Waczulíková, I. (2014). Microsoft Excel add-in for the statistical analysis of contingency tables. International Journal for Innovation Education and Research, 2(5), 90-100.

Tang, Q. Y., & Zhang, C. X. (2013). Data Processing System (DPS) software with experimental design, statistical analysis and data mining developed for use in entomological research. Insect Science, 20(2), 254-260.

Zou, G. Y. (2013). Confidence interval estimation for the Bland–Altman limits of agreement with multiple observations per individual. Statistical Methods in Medical Research, 22(6), 630-642.