Analysis Of Variance (ANOVA) Table, Hypothesis Tests, And Prediction For Variance Inflation Factor

December 21, 2023

Description of Data

The data were collected over 382 days between 2014 and 2016. The data set has twelve variables: the first is the date, the next ten are used as independent variables, and the twelfth is the dependent variable. The variables record the changes in the prices of financial assets on the days on which the data were collected. Some of the variables and their meanings are listed below:

Date: Indicating the year in which the data was collected
Aluminum_Vel1: Indicating change in the price of aluminum backdated by a day
Copper_Vel1: Indicating change in the price of copper backdated by a day
US_Gasoline_Vel1: Indicating change in the price of gasoline backdated by a day
West_Texas_Vel1: Indicating change in the price of West Texas Intermediate oil backdated by a day
SPDR_XL1: Indicating the U.S. industrial confidence for industrial-oriented firms
CA-Dollar_Vel1: The exchange rate between the U.S. and Canadian dollars backdated by a day
SP 500: The Standard & Poor's 500 index of stock prices

The data also contains interaction variables, each of which is the product of two variables. These variables are:

Year x WERN
30year x Copper_Vel1
Aluminum_Vel1 x Aluminum_Vel1
Aluminum_Vel1 x West_Texas_Vel1
Baltic_Vel1 x Copper_Vel1
SPDR_XL1 x West_Texas_Vel1
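As a minimal sketch of how such product terms can be built, the Python below multiplies two columns element by element. The helper name `interaction` is hypothetical and the column values are made-up numbers for illustration, not values from the data set.

```python
def interaction(col_a, col_b):
    """Return the element-wise product of two equally long columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same length")
    return [a * b for a, b in zip(col_a, col_b)]


# Illustrative values only; these are not taken from the data set.
aluminum_vel1 = [0.25, -0.5, 0.0]
west_texas_vel1 = [2.0, 4.0, -1.5]

# Product of two different variables, and of a variable with itself.
aluminum_x_west_texas = interaction(aluminum_vel1, west_texas_vel1)
aluminum_squared = interaction(aluminum_vel1, aluminum_vel1)
```

A term such as Aluminum_Vel1 x Aluminum_Vel1 is simply the square of the original variable, which lets the linear model capture some curvature.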

The variable in the last column of the data is the dependent variable that needs to be predicted. The initial percentage changes in price were sorted and ranked, and each rank was divided by the total number of rows so that the values range from 0 to 1. Zero is the maximum decrease in price, the median of 0.5 indicates no change in price, and 1 indicates the maximum increase in price.
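The exact ranking scheme is not given, so the following is only a sketch under an assumed convention: each value is replaced by its average zero-based rank divided by n - 1, which puts the largest decrease at 0 and the largest increase at 1. The function name `rank_scale` and the sample changes are hypothetical.

```python
def rank_scale(changes):
    """Map each value to its average 0-based rank divided by (n - 1)."""
    n = len(changes)
    if n < 2:
        raise ValueError("need at least two observations")
    scaled = []
    for value in changes:
        below = sum(1 for v in changes if v < value)
        ties = sum(1 for v in changes if v == value)
        average_rank = below + (ties - 1) / 2  # tied values share one rank
        scaled.append(average_rank / (n - 1))
    return scaled


# Illustrative percentage changes only; not values from the data set.
pct_changes = [1.2, -3.0, 0.0, -1.0, 0.0, 2.5]
ranked = rank_scale(pct_changes)
```

In this sample the two unchanged days land on 0.5 because they sit in the middle of the ordering; in general a zero change maps to 0.5 only when the median change is zero.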

The variance inflation factor (VIF) quantifies how much the variance of an estimated regression coefficient is inflated by correlation among the independent variables. It is used to test for multicollinearity in the data set. Multicollinearity is defined as the existence of a high correlation between two or more independent variables. Its presence makes it difficult to fit a reliable regression model to the data set (Hinton, 2014).

The VIFs were determined for each of the independent variables using the PHStat Excel add-in and are shown in the appendix. Since all of these values are less than five, there is little evidence of problematic correlation among the independent variables, and none of them needs to be dropped when building the prediction model.
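The analysis itself used PHStat, so the following is not the author's procedure. It is a small pure-Python sketch of the standard definition VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors. The helper names and the sample columns are hypothetical.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def r_squared(y, predictors):
    """R^2 of a least squares fit of y on the predictors plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    p = len(X[0])
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / sst


def vif(predictors):
    """VIF_j = 1 / (1 - R_j^2), regressing predictor j on all the others."""
    return [
        1.0 / (1.0 - r_squared(col, predictors[:j] + predictors[j + 1:]))
        for j, col in enumerate(predictors)
    ]


# Illustrative columns only; these are not values from the data set.
uncorrelated = vif([[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0]])
correlated = vif([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0]])
```

The uncorrelated pair gives VIFs of 1, while the nearly collinear pair gives values far above the threshold of five used in the text.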

Normal probability plots are used in statistics to identify observable departures from normality in a set of values, including kurtosis, skewness, and outliers. Here the plot is used to check for outliers in the residuals.

The plot is slightly non-linear at the 50th percentile because of the few trading days on which the share price was unchanged, all of which receive a rank equivalent to 0.5. Apart from this, the plot can be treated as linear, indicating that there are no observable outliers. The residuals can therefore be taken as approximately normally distributed, which supports the reliability of the hypothesis tests that follow.
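A normal probability plot pairs each sorted residual with the standard normal quantile for its rank. The sketch below builds those pairs and measures how close they are to a straight line with a correlation coefficient. The plotting positions (i + 0.5) / n are one common convention, assumed here rather than taken from the source, and the residuals are made-up values.

```python
from math import sqrt
from statistics import NormalDist


def normal_plot_points(residuals):
    """Return standard normal quantiles and the sorted residuals they pair with."""
    n = len(residuals)
    ordered = sorted(residuals)
    standard_normal = NormalDist()
    # Assumed plotting positions: (i + 0.5) / n for i = 0 .. n - 1.
    quantiles = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return quantiles, ordered


def correlation(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Illustrative residuals only; not values from the regression output.
residuals = [0.3, -1.0, 0.0, 1.0, -0.3]
quantiles, ordered = normal_plot_points(residuals)
straightness = correlation(quantiles, ordered)
```

A correlation close to 1 corresponds to a nearly straight plot; outliers or heavy skewness pull the points off the line and lower the value.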

Regression analysis assumes that the standardized residuals are normally distributed. A histogram of the standardized residuals is used to visualize their distribution.

The histogram indicates an approximately normal distribution with slight skewness to the right. The skewness is assumed to be negligible, so the residuals are treated as normally distributed. This means that the hypothesis tests and predictions that follow can be regarded as reliable.
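The visual impression from the histogram can be backed by a number. As a sketch only, the code below standardizes residuals by their sample standard deviation (a simplification assumed here; the source does not say how PHStat scales them) and computes the moment-based sample skewness. The residual values are illustrative.

```python
from math import sqrt


def standardize(residuals):
    """Center the residuals and divide by their sample standard deviation."""
    n = len(residuals)
    mean = sum(residuals) / n
    sd = sqrt(sum((r - mean) ** 2 for r in residuals) / (n - 1))
    return [(r - mean) / sd for r in residuals]


def skewness(values):
    """Moment-based skewness: third central moment over second to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Illustrative residuals only; not values from the regression output.
symmetric = standardize([-2.0, -1.0, 0.0, 1.0, 2.0])
right_skewed = standardize([-1.0, -1.0, -1.0, 3.0])

skew_symmetric = skewness(symmetric)
skew_right = skewness(right_skewed)
```

A value near zero indicates a symmetric histogram, while a positive value corresponds to a longer right tail like the one described above.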

An Analysis of Variance (ANOVA) table is used in statistics to test whether a relationship exists between the dependent variable and the independent variables. The null hypothesis is that there is no linear relationship between the variables, while the alternative hypothesis is that there is a linear relationship between them (Foster, Barkus & Yavorsky, 2006). The null hypothesis is rejected when the value of significance F is less than the overall significance level; otherwise, we fail to reject it. In this case, the value of significance F is less than the overall significance level of 0.05, so there is sufficient evidence in favor of the alternative hypothesis and the null hypothesis can be rejected. It is therefore reasonable to conclude that a relationship exists between the dependent variable and at least one of the independent variables.
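The quantities in an ANOVA table follow from the observed and fitted values. The sketch below computes the sums of squares and the F statistic for a least squares fit with an intercept and k predictors. The function name and the five data points are hypothetical, and significance F (the upper-tail probability of the F distribution with k and n - k - 1 degrees of freedom) is left to a statistics package rather than computed here.

```python
def anova_table(y, fitted, k):
    """Sums of squares and F statistic for a least squares fit with k predictors."""
    n = len(y)
    y_bar = sum(y) / n
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation
    ssr = sst - sse  # explained variation; valid for a fit that includes an intercept
    msr = ssr / k
    mse = sse / (n - k - 1)
    return {"SSR": ssr, "SSE": sse, "SST": sst, "F": msr / mse}


# Illustrative data: y regressed on x = 1..5 gives fitted = 0.6 + 0.8 * x.
y = [1.0, 3.0, 2.0, 5.0, 4.0]
fitted = [1.4, 2.2, 3.0, 3.8, 4.6]
table = anova_table(y, fitted, k=1)
```

A large F means the explained variation per predictor is large relative to the residual variation, which is what drives significance F below the 0.05 level.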

The coefficient of determination, or r-squared value, indicates the proportion of the variation in the dependent variable that is explained by the regression model. The table below shows the regression table with the coefficient of determination.

The coefficient of determination in this case is 14.76%, indicating that only 14.76% of the variation of the dependent variable (future change in price) around its mean is explained by the independent variables. This value is low, so the regression model describes only a small part of the future change in the share price of Werner Enterprises.
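The reported figures can be related to one another with a little arithmetic. The sketch below takes the 382 days and the r-squared of 14.76% from the text and assumes ten predictors, as stated in the data description; if the fitted model used a different number of terms, the results change accordingly. It derives the overall F statistic implied by r-squared and the adjusted r-squared.

```python
n_days = 382          # number of observations, from the data description
k_predictors = 10     # assumption: ten independent variables in the model
r_squared = 0.1476    # coefficient of determination reported in the text

residual_df = n_days - k_predictors - 1

# Overall F statistic written in terms of r-squared.
f_statistic = (r_squared / k_predictors) / ((1 - r_squared) / residual_df)

# Adjusted r-squared penalizes the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n_days - 1) / residual_df
```

Under these assumptions the F statistic is about 6.4 and the adjusted r-squared about 0.125, which shows how a model can be statistically significant overall while explaining little of the variation.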

If the future share price of the enterprise were completely unpredictable from these variables, the coefficient of determination would be close to zero. In this case the value is clearly above zero, so the stock price is not perfectly random and some prediction is still possible.

We look at the p-value of each independent variable and compare it against the overall significance level of the model. If the p-value of an independent variable is less than the overall significance level, we conclude that the variable is statistically significant and can be used in the model; otherwise, the variable is statistically insignificant and is dropped from the model.

From the table, every explanatory variable has a p-value less than the overall significance level of the model. They are therefore all statistically significant, and none of them should be dropped from the regression model.
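The decision rule described here can be written in a few lines. The p-values below are hypothetical placeholders, since the regression table is not reproduced in the text, and the function name is illustrative.

```python
def split_by_significance(p_values, alpha=0.05):
    """Separate variables into those kept (p < alpha) and those dropped."""
    keep = [name for name, p in p_values.items() if p < alpha]
    drop = [name for name, p in p_values.items() if p >= alpha]
    return keep, drop


# Hypothetical p-values for illustration; not the values from the regression table.
example_p_values = {
    "Aluminum_Vel1": 0.012,
    "Copper_Vel1": 0.003,
    "Illustrative_Noise_Variable": 0.410,
}
keep, drop = split_by_significance(example_p_values)
```

With these placeholder values the first two variables are kept and the third is dropped; in the analysis above, every variable falls on the kept side.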

The coefficients of the independent variables above indicate how each of them affects the dependent variable when all other independent variables are held constant.

The largest positive coefficient belongs to the interaction variable Aluminum_Vel1 x West_Texas_Vel1. The coefficient is 0.5711, meaning that the predicted future share price measure increases by 0.5711 for a unit increase in this interaction variable. On the other hand, the largest negative coefficient belongs to the interaction variable SPDR_XL1 x West_Texas_Vel1. The coefficient is -0.487, meaning that the predicted future share price measure decreases by 0.487 for a unit increase in this interaction variable.
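As a short worked example of this interpretation, the change in the prediction equals the coefficient times the change in the variable, with everything else held fixed. The two coefficients come from the text; the step of 0.1 and the function name are illustrative assumptions.

```python
def effect_on_prediction(coefficient, change_in_variable):
    """Change in the predicted value when one variable moves and the rest stay fixed."""
    return coefficient * change_in_variable


coef_aluminum_x_west_texas = 0.5711  # reported in the text
coef_spdr_x_west_texas = -0.487      # reported in the text

# A full unit increase in the first interaction variable.
unit_increase = effect_on_prediction(coef_aluminum_x_west_texas, 1.0)

# An illustrative increase of 0.1 in the second interaction variable.
small_increase = effect_on_prediction(coef_spdr_x_west_texas, 0.1)
```

Because the dependent variable is scaled to lie between 0 and 1, realistic movements in the interaction variables are likely to be much smaller than a full unit.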

It is evident that none of the coefficients is equal to or very close to zero. Every independent variable is therefore related to the dependent variable to some extent, and none should be removed from the regression model.

To predict the future change in share price, the ANOVA table and the confidence interval estimate and prediction interval are used. Here, the historical data is used to derive a prediction of the change in the share price, which is then compared with the change actually observed.

It is evident that the predicted values of the share price are not close to the actual values. The last two rows give the limits within which the actual share price is expected to fall. The actual values do fall within these limits, which indicates that the prediction intervals are consistent with the observed data.

On all of the sampled days, the difference between the actual and the predicted share price is within a maximum deviation of ±0.3. With the coefficient of determination being so low, the prediction interval is wide and does not guarantee a consistent prediction of whether the share price will go up or down.
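The two checks described above, whether each actual value lies inside its prediction limits and how far it sits from the prediction, can be sketched as follows. All numbers are hypothetical placeholders because the prediction table is not reproduced in the text.

```python
def check_intervals(actual, predicted, lower, upper):
    """Report interval coverage and the largest absolute prediction error."""
    inside = [lo <= a <= hi for a, lo, hi in zip(actual, lower, upper)]
    max_abs_error = max(abs(a - p) for a, p in zip(actual, predicted))
    return inside, max_abs_error


# Hypothetical values for illustration; not the figures from the prediction table.
actual = [0.62, 0.35, 0.50]
predicted = [0.48, 0.55, 0.52]
lower = [0.10, 0.17, 0.14]
upper = [0.86, 0.93, 0.90]

inside, max_abs_error = check_intervals(actual, predicted, lower, upper)
```

An interval that spans most of the 0 to 1 range will contain nearly every actual value, so full coverage on its own says little about how precise the predictions are.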

Conclusion

Multiple linear regression has been applied to historical data to predict the future change in the share price of Werner Enterprises, Inc. The VIFs indicated that there was no problematic correlation between the independent variables. Residual analysis indicated that the residuals were approximately normally distributed and that the hypothesis tests were therefore valid. The coefficient of determination indicated that only 14.76% of the variation of the dependent variable around its mean was explained by the independent variables. The analysis of variance indicated that there was a relationship between the dependent and the independent variables. The prediction showed a clear margin between the actual change in share price and the predicted value. This gap can be attributed to the low coefficient of determination. If the coefficient of determination were large enough, more accurate and reliable predictions would be achievable.

References

Foster, J. J., Barkus, E., & Yavorsky, C. (2006). Understanding and using advanced statistics (2nd ed.). London: SAGE.

Hinton, P. R. (2014). Statistics explained (3rd ed.). London: Routledge, Taylor & Francis Group.
