Regression Analysis For Market Value Of House

Scatter Plot

The aim of this report is to frame an appropriate regression model for the variables provided. The dataset contains five variables, all of them quantitative, which allows regression analysis to be performed. Because the data cover each of the past 15 years, the sample size is 15. The primary objective is to develop a multiple regression model in which market price is the dependent variable and the remaining four variables serve as independent variables. Each variable is measured on a ratio or interval scale, so it can be represented in a multiple regression model. The variables provided appear suitable for estimating the market value of a house. Once the multiple regression model has been developed, it is revised to obtain a more suitable model and to weed out the independent variables that do not have a significant relationship with the dependent variable.

In this section, a scatter plot is drawn between each independent variable and the dependent variable.

The best fit line shown in the plot above has a positive slope, so the underlying linear relationship between the two variables is positive. The scatter points deviate only minimally from the line of best fit, which indicates that the magnitude of the correlation between the variables is high. It is therefore fair to conclude that the two variables (Sydney Price Index and Market price) have a strong positive relationship (Flick, 2015).

The best fit line shown in the plot above has a positive slope, so the underlying linear relationship between the two variables is positive. The scatter points deviate considerably from the line of best fit, which indicates that the magnitude of the correlation between the variables is low to moderate. It is therefore fair to conclude that the two variables (Annual % change and Market price) have a positive but weak to moderate relationship (Eriksson and Kovalainen, 2015).

The best fit line shown in the plot above has a negative slope, so the underlying linear relationship between the two variables is negative. The scatter points do not deviate very far from the line of best fit, which indicates that the magnitude of the correlation between the variables is moderately high. It is therefore fair to conclude that the two variables (Age of house and Market price) have a moderately strong negative relationship (Hair et al., 2015).

The best fit line shown in the plot above has a positive slope, so the underlying linear relationship between the two variables is positive. The scatter points deviate moderately from the line of best fit, which indicates that the magnitude of the correlation between the variables is only moderate. It is therefore fair to conclude that the two variables (Area of house and Market price) have a positive but moderate relationship (Hillier, 2016).
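The direction and strength read off each scatter plot can also be checked numerically. The following is a minimal sketch in Python that computes the least-squares slope and the Pearson correlation for one pair of variables. The data values are hypothetical and are used only for illustration, since the report's dataset is not reproduced here; the function name is likewise an assumption.

```python
from math import sqrt


def slope_and_correlation(x, y):
    """Return the least-squares slope of y on x and the Pearson correlation."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx, sxy / sqrt(sxx * syy)


# Hypothetical illustration data, NOT the report's dataset
age = [2, 5, 8, 12, 15, 20]
price = [480, 470, 455, 450, 430, 410]

slope, r = slope_and_correlation(age, price)
print(f"slope = {slope:.2f}, r = {r:.2f}")
```

The sign of the slope gives the direction of the relationship, while the absolute value of r (between 0 and 1) reflects how closely the points cluster around the line of best fit.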

Based on the regression equation indicated above, the intercept value is 548.98. The respective coefficients of the independent variables are the slope coefficients while the standard error for the model is 43.8878.

Coefficient interpretation and significance testing

The coefficients indicated in the multiple regression model can be interpreted as highlighted below.

Intercept – This coefficient indicates the house market value when all the independent variables assume a value of zero, which is of course not a practical scenario.

Slope coefficient (Sydney Price Index) – The given independent variable has a slope coefficient of 1.96. The interpretation is that when this variable increases by 1 unit, the house market price is expected to increase by $1,960, holding the other variables constant. Given the positive value of the coefficient, the two variables move in the same direction (Fehr and Grossman, 2013).

Slope coefficient (Annual % change) – The given independent variable has a slope coefficient of -5.62. The interpretation is that when this variable increases by 1 unit, the house market price is expected to decrease by $5,620, holding the other variables constant. Given the negative value of the coefficient, the two variables move in opposite directions (Hastie, Tibshirani and Friedman, 2014).

Slope coefficient (House Area) – The given independent variable has a slope coefficient of 0.52. The interpretation is that when this variable increases by 1 unit, the house market price is expected to increase by $520, holding the other variables constant. Given the positive value of the coefficient, the two variables move in the same direction.

Slope coefficient (House Age) – The given independent variable has a slope coefficient of -2.49. The interpretation is that when this variable increases by 1 unit, the house market price is expected to decrease by $2,490, holding the other variables constant. Given the negative value of the coefficient, the two variables move in opposite directions (Fehr and Grossman, 2013).
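The interpretation of the coefficients can be illustrated with a short Python sketch that evaluates the fitted equation using the intercept and slope values reported in the text. It assumes the market price is expressed in thousands of dollars (which is what the dollar figures in the interpretations imply), and the input values passed to the function are hypothetical.

```python
# Coefficients as reported in the text; price assumed to be in $000s
COEFFICIENTS = {
    "intercept": 548.98,
    "sydney_price_index": 1.96,
    "annual_pct_change": -5.62,
    "house_area": 0.52,
    "house_age": -2.49,
}


def predict_price(sydney_price_index, annual_pct_change, house_area, house_age):
    """Evaluate the fitted multiple regression equation (result in $000s)."""
    c = COEFFICIENTS
    return (
        c["intercept"]
        + c["sydney_price_index"] * sydney_price_index
        + c["annual_pct_change"] * annual_pct_change
        + c["house_area"] * house_area
        + c["house_age"] * house_age
    )


# Hypothetical inputs: the same house, one year apart in age
base = predict_price(100, 5, 200, 10)
one_year_older = predict_price(100, 5, 200, 11)

# The difference equals the age slope coefficient (about -2.49, i.e. -$2,490)
print(f"effect of one extra year of age: {one_year_older - base:.2f}")
```

Changing one input by a single unit while leaving the others fixed reproduces the corresponding slope coefficient, which is exactly the "holding the other variables constant" reading of a multiple regression coefficient.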

The statistical significance of the slope coefficients has been tested below.

Sydney Price Index

H0: βSydney Price Index = 0, i.e. the slope coefficient of the given variable is not significant and can therefore be taken as zero.

H1: βSydney Price Index ≠ 0, i.e. the slope coefficient of the given variable is significant and therefore cannot be taken as zero.

For the purpose of this hypothesis testing, the significance level is taken as 5%.

The hypothesis is tested using the t statistic. From the multiple regression output, the t statistic is 3.37 and the corresponding p value is 0.01. Since the p value is lower than the significance level (0.01 < 0.05), H0 is rejected on the given evidence and H1 is accepted (Flick, 2015). The implication is that the slope coefficient of the Sydney Price Index is statistically significant.

Annual % Change

H0: βAnnual%change = 0, i.e. the slope coefficient of the given variable is not significant and can therefore be taken as zero.

H1: βAnnual%change ≠ 0, i.e. the slope coefficient of the given variable is significant and therefore cannot be taken as zero.

For the purpose of this hypothesis testing, the significance level is taken as 5%.

The hypothesis is tested using the t statistic. From the multiple regression output, the t statistic is -1.74 and the corresponding p value is 0.11. Since the p value exceeds the significance level (0.11 > 0.05), the evidence does not warrant rejection of H0, and H1 is not accepted (Medhi, 2016). The implication is that the slope coefficient of annual % change is not statistically significant.

Total Area

H0: βTotalArea = 0, i.e. the slope coefficient of the given variable is not significant and can therefore be taken as zero.

H1: βTotalArea ≠ 0, i.e. the slope coefficient of the given variable is significant and therefore cannot be taken as zero.

For the purpose of this hypothesis testing, the significance level is taken as 5%.

The hypothesis is tested using the t statistic. From the multiple regression output, the t statistic is 1.64 and the corresponding p value is 0.14. Since the p value exceeds the significance level (0.14 > 0.05), the evidence does not warrant rejection of H0, and H1 is not accepted (Hillier, 2016). The implication is that the slope coefficient of total area is not statistically significant.

Age of House

H0: βAgeofhouse = 0, i.e. the slope coefficient of the given variable is not significant and can therefore be taken as zero.

H1: βAgeofhouse ≠ 0, i.e. the slope coefficient of the given variable is significant and therefore cannot be taken as zero.

For the purpose of this hypothesis testing, the significance level is taken as 5%.

The hypothesis is tested using the t statistic. From the multiple regression output, the t statistic is -2.20 and the corresponding p value is 0.052. Since the p value is marginally above the significance level (0.052 > 0.05), the evidence does not warrant rejection of H0, and H1 is not accepted (Hastie, Tibshirani and Friedman, 2014). The implication is that the slope coefficient of house age is not statistically significant at the 5% level.
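The four tests above all apply the same decision rule, which can be sketched in Python as follows. The t statistics and p values are those reported in the text. The critical value of 2.228 is the standard two-tailed 5% t value for 10 degrees of freedom, which assumes the degrees of freedom are n - k - 1 = 15 - 4 - 1; the variable names are illustrative only.

```python
ALPHA = 0.05
# Two-tailed 5% critical t value for 10 degrees of freedom
# (assumption: df = n - k - 1 = 15 - 4 - 1)
T_CRIT = 2.228

# (t statistic, p value) as reported in the text
reported = {
    "Sydney Price Index": (3.37, 0.01),
    "Annual % change": (-1.74, 0.11),
    "Total area": (1.64, 0.14),
    "Age of house": (-2.20, 0.052),
}

decisions = {}
for name, (t_stat, p_value) in reported.items():
    reject_by_p = p_value < ALPHA          # p value approach
    reject_by_t = abs(t_stat) > T_CRIT     # critical value approach
    decisions[name] = reject_by_p
    verdict = "reject H0" if reject_by_p else "do not reject H0"
    print(f"{name}: t = {t_stat}, p = {p_value} -> {verdict} "
          f"(critical value approach agrees: {reject_by_p == reject_by_t})")
```

Both approaches lead to the same conclusion for every variable: only the Sydney Price Index has a slope coefficient that is significant at the 5% level.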

For the multiple regression model that has been developed, the R² value is 0.7906. This means that the independent variables jointly account for 79.06% of the variation observed in the dependent variable, i.e. house market price. Consequently, about 21% of the variation in the dependent variable is not accounted for by the regression model. On this basis, it would be fair to conclude that the regression model is a good fit (Medhi, 2016).
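As a supplementary check on goodness of fit, the adjusted R² and the overall F statistic can be derived from the reported R² using the standard formulas. The sketch below assumes n = 15 observations and k = 4 independent variables, as stated earlier in the report. The resulting figures are computed here from the rounded R² and are not taken from the Excel output.

```python
R_SQUARED = 0.7906  # reported in the text
N = 15              # sample size stated in the report
K = 4               # number of independent variables

# Adjusted R-squared penalises R-squared for the number of predictors
adjusted_r_squared = 1 - (1 - R_SQUARED) * (N - 1) / (N - K - 1)

# Overall F statistic for the joint significance of all slope coefficients
f_statistic = (R_SQUARED / K) / ((1 - R_SQUARED) / (N - K - 1))

print(f"adjusted R-squared = {adjusted_r_squared:.4f}")
print(f"F statistic = {f_statistic:.2f} on ({K}, {N - K - 1}) degrees of freedom")
```

With only 15 observations and four predictors, the adjusted R² is noticeably lower than the unadjusted value, which is worth keeping in mind when judging the fit.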

Based on the output of the multiple regression in Excel, the 95% confidence intervals for the respective parameters have been identified as follows.

The confidence intervals above indicate, with 95% confidence, that the population slope coefficient of each variable lies within the computed bounds. For example, the confidence interval for age implies 95% confidence that the population slope coefficient of age lies between -5.01 and 0.03 (Flick, 2015). Because this interval contains zero, it is consistent with the finding that the age coefficient is not significant at the 5% level.
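The interval quoted for age can be reproduced approximately from figures already reported in the text. The following Python sketch backs out the standard error from the coefficient and its t statistic, then applies the usual formula: coefficient ± critical t × standard error. It assumes a two-tailed 5% critical value of 2.228 for 10 degrees of freedom (n - k - 1 = 15 - 4 - 1); the function name is illustrative.

```python
# Two-tailed 5% critical t value for 10 degrees of freedom (assumed df)
T_CRIT = 2.228


def confidence_interval(coefficient, t_statistic, t_crit=T_CRIT):
    """Rebuild a 95% confidence interval from a coefficient and its t statistic."""
    # Standard error implied by t = coefficient / standard error
    standard_error = abs(coefficient / t_statistic)
    margin = t_crit * standard_error
    return coefficient - margin, coefficient + margin


# Age of house: coefficient and t statistic as reported in the text
lower, upper = confidence_interval(-2.49, -2.20)
print(f"95% CI for age: ({lower:.2f}, {upper:.2f})")
```

Because the inputs are rounded to two decimal places, the reconstructed bounds can differ slightly from the Excel output in later decimals.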

The revised regression model has been formed with house area as the only independent variable and house price being the dependent variable.

For the multiple regression model, the coefficient of determination is 0.7906, while it is only 0.0981 for the revised simple regression model. It would therefore be fair to conclude that the revised simple regression model is not a good fit, since it accounts for only 9.81% of the variation seen in house market prices (Hillier, 2016). Moreover, for the revised regression model, the t statistic of the slope coefficient and the corresponding p value do not establish the significance of the slope coefficient (Eriksson and Kovalainen, 2015). The original multiple regression model is therefore superior in terms of predictive capacity, fit, and the significance of the model and of at least one slope coefficient.

Since only the area of the building has been provided and no other information is given, the price estimate has to be based on the revised simple regression model, whose underlying equation is as follows.

References

Eriksson, P. and Kovalainen, A. (2015) Quantitative methods in business research. 3rd ed. London: Sage Publications.

Fehr, F. H. and Grossman, G. (2013) An introduction to sets, probability and hypothesis testing. 3rd ed. Ohio: Heath.

Flick, U. (2015) Introducing research methodology: A beginner’s guide to doing a research project. 4th ed. New York: Sage Publications.

Hair, J. F., Wolfinbarger, M., Money, A. H., Samouel, P., and Page, M. J. (2015) Essentials of business research methods. 2nd ed. New York: Routledge.

Hastie, T., Tibshirani, R. and Friedman, J. (2014) The Elements of Statistical Learning. 4th ed. New York: Springer Publications.

Hillier, F. (2016) Introduction to Operations Research. 6th ed. New York: McGraw Hill Publications.

Medhi, J. (2016) Statistical Methods: An Introductory Text. 4th ed. Sydney: New Age International.