Statistics Exam Scores, Regression Analysis, And ANOVA: Solutions And Interpretations

Frequency Distribution of Examination Scores

1. The frequency distribution tables were constructed in MS Excel and have been provided below.

Table 1: Frequency Distribution of Examination Scores

| CLASS INTERVAL | LOWER BOUND | UPPER BOUND | CLASS MID-POINT | FREQUENCY |
|---|---|---|---|---|
| 50-60 | 50 | 59 | 54.5 | 3 |
| 60-70 | 60 | 69 | 64.5 | 2 |
| 70-80 | 70 | 79 | 74.5 | 5 |
| 80-90 | 80 | 89 | 84.5 | 4 |
| 90-100 | 90 | 99 | 94.5 | 6 |

Table 2: Cumulative Frequency Distribution of Examination Scores

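| CLASS INTERVAL | LOWER BOUND | UPPER BOUND | CLASS MID-POINT | FREQUENCY | CUMULATIVE FREQUENCY |
|---|---|---|---|---|---|
| 50-60 | 50 | 59 | 54.5 | 3 | 3 |
| 60-70 | 60 | 69 | 64.5 | 2 | 5 |
| 70-80 | 70 | 79 | 74.5 | 5 | 10 |
| 80-90 | 80 | 89 | 84.5 | 4 | 14 |
| 90-100 | 90 | 99 | 94.5 | 6 | 20 |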

Table 3: Relative Frequency Distribution of Examination Scores

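| CLASS INTERVAL | LOWER BOUND | UPPER BOUND | CLASS MID-POINT | FREQUENCY | RELATIVE FREQUENCY |
|---|---|---|---|---|---|
| 50-60 | 50 | 59 | 54.5 | 3 | 0.15 |
| 60-70 | 60 | 69 | 64.5 | 2 | 0.10 |
| 70-80 | 70 | 79 | 74.5 | 5 | 0.25 |
| 80-90 | 80 | 89 | 84.5 | 4 | 0.20 |
| 90-100 | 90 | 99 | 94.5 | 6 | 0.30 |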

Table 4: Cumulative Relative Frequency Distribution of Examination Scores

| CLASS INTERVAL | LOWER BOUND | UPPER BOUND | CLASS MID-POINT | FREQUENCY | RELATIVE FREQUENCY | CUMULATIVE RELATIVE FREQUENCY |
|---|---|---|---|---|---|---|
| 50-60 | 50 | 59 | 54.5 | 3 | 0.15 | 0.15 |
| 60-70 | 60 | 69 | 64.5 | 2 | 0.10 | 0.25 |
| 70-80 | 70 | 79 | 74.5 | 5 | 0.25 | 0.50 |
| 80-90 | 80 | 89 | 84.5 | 4 | 0.20 | 0.70 |
| 90-100 | 90 | 99 | 94.5 | 6 | 0.30 | 1.00 |

Table 5: Percent Frequency Distribution of Examination Scores

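| CLASS INTERVAL | LOWER BOUND | UPPER BOUND | CLASS MID-POINT | FREQUENCY | PERCENT FREQUENCY |
|---|---|---|---|---|---|
| 50-60 | 50 | 59 | 54.5 | 3 | 15.00% |
| 60-70 | 60 | 69 | 64.5 | 2 | 10.00% |
| 70-80 | 70 | 79 | 74.5 | 5 | 25.00% |
| 80-90 | 80 | 89 | 84.5 | 4 | 20.00% |
| 90-100 | 90 | 99 | 94.5 | 6 | 30.00% |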

The percent frequency distribution of the examination scores determines the shape of the histogram (the percentage distribution graph). Most of the students (75%) scored 70 or above, so the bulk of the distribution lies in the upper classes. Left skewness was evident from the shape of the histogram (Black, 2009). It can be inferred that most students obtained good marks in the examination.
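The cumulative, relative, and percent columns in Tables 2 to 5 all follow from the frequency column of Table 1. A minimal sketch of that arithmetic is given below; the variable names are illustrative and only the class labels and frequencies are taken from the tables.

```python
# Class labels and frequencies copied from Table 1.
classes = ["50-60", "60-70", "70-80", "80-90", "90-100"]
frequencies = [3, 2, 5, 4, 6]

total = sum(frequencies)  # number of students in the sample

# Running total of the frequencies (Table 2).
cumulative = []
running = 0
for f in frequencies:
    running += f
    cumulative.append(running)

# Share of all students in each class (Table 3) and its running total (Table 4).
relative = [f / total for f in frequencies]
cumulative_relative = [c / total for c in cumulative]

# Relative frequency expressed as a percentage (Table 5).
percent = [100 * r for r in relative]

for row in zip(classes, frequencies, cumulative, relative, cumulative_relative, percent):
    print("{:>7} {:>3} {:>3} {:.2f} {:.2f} {:.2f}%".format(*row))
```

Summing the last three entries of `percent` gives the share of students who scored 70 or above.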

1. The residual degrees of freedom in a regression model are (n-p-1), where p is the degrees of freedom of the regression model. Solving this relation with the values from the regression table gives a sample size of n = 41.
2. The margin of error at the 5% level of significance is 0.041, so the confidence interval of the slope is [0.029 - 0.041, 0.029 + 0.041] = [-0.012, 0.070]. The standard error of the slope indicates that the values of supply are closely clustered around the regression line on X, the unit price (in thousands of dollars).
3. The given regression table is incomplete; only the SS values for the regression model and the residual are provided. The coefficient of determination is defined as $R^2 = \mathrm{SSM}/\mathrm{SST}$, where SST = SSE + SSM = 7389.951. Hence, the coefficient of determination is calculated to be $R^2 \approx 0.952$. This implies that 95.2% of the variation in supply (Y) is explained by the unit price (X).
4. The coefficient of correlation implies that supply (Y) and unit price (X) are positively correlated. The value of the correlation indicates a strong positive linear relationship between supply and unit price.
5. The regression equation is

For a unit price of \$50,000 (X = 50, in thousands of dollars), the dependent variable is calculated as approximately 55.53 (in thousands of units). Hence, the predicted supply is approximately 55,530 units (Black, 2011).
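The degrees-of-freedom relation and the slope interval used above can be checked with a few lines of Python. This is only a sketch: the helper names are hypothetical, and p = 1 is an assumption (unit price as the only predictor), not a value quoted in the text.

```python
def residual_df(n, p):
    """Residual degrees of freedom for a regression with p predictors."""
    return n - p - 1


def confidence_interval(estimate, margin):
    """Interval of the form estimate +/- margin of error."""
    return estimate - margin, estimate + margin


# n = 41, slope = 0.029 and margin = 0.041 are the values quoted in the text.
# p = 1 is an assumption: unit price is taken as the only predictor.
df_resid = residual_df(n=41, p=1)
lower, upper = confidence_interval(0.029, 0.041)

print(df_resid, round(lower, 3), round(upper, 3))
```

Because the interval is built as estimate plus or minus margin, it is symmetric around the slope estimate.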

1. The ANOVA results are given in Table 6 (constructed in MS Excel).

Table 6: ANOVA for programs in Allied Corporation

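ANOVA: Single Factor

| Groups | Count | Sum | Average | Variance |
|---|---|---|---|---|
| Program A | 5 | 725 | 145 | 525 |
| Program B | 5 | 675 | 135 | 425 |
| Program C | 5 | 950 | 190 | 312.5 |
| Program D | 5 | 750 | 150 | 637.5 |

ANOVA

| Source of Variation | SS | df | MS | F | P-value | F crit |
|---|---|---|---|---|---|---|
| Between Groups | 8750 | 3 | 2916.666667 | 6.1403 | 0.0055 | 3.2388 |
| Within Groups | 7600 | 16 | 475 | | | |
| Total | 16350 | 19 | | | | |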

The suggestion to Allied Corporation was based on the ANOVA results. The exploratory analysis shows that the average output of a day's work is highest for program C, with an average value of 190. The ANOVA established that the mean outputs of the four programs differ significantly (F = 6.14, p < 0.05), with program C producing the highest mean output. Allied Corporation was therefore advised to assign all of its employees to program C (Triola, 2013).
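The sums of squares and the F statistic in Table 6 can be reproduced from the group counts, averages, and variances alone. The sketch below assumes the variances are sample variances (as Excel reports them) and takes the critical value directly from the table rather than computing it.

```python
# Group summaries copied from Table 6: (count, mean, sample variance).
groups = {
    "Program A": (5, 145.0, 525.0),
    "Program B": (5, 135.0, 425.0),
    "Program C": (5, 190.0, 312.5),
    "Program D": (5, 150.0, 637.5),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total

# Between-groups SS: weighted squared distance of each group mean from the grand mean.
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
# Within-groups SS: each sample variance multiplied back by its degrees of freedom.
ss_within = sum((n - 1) * var for n, _, var in groups.values())

df_between = len(groups) - 1
df_within = n_total - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

F_CRIT = 3.2388  # critical value as reported in Table 6, not recomputed here
reject_null = f_stat > F_CRIT

print(ss_between, ss_within, round(f_stat, 2), reject_null)
```

A significant F statistic only indicates that at least one program mean differs from the others; it does not by itself identify which pairs of programs differ.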

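1. Table 7 shows the regression output at the 5% level of significance, calculated in Excel. The regression equation from the model is $\hat{Y} = 3.5976 + 41.3200X_1 + 0.0132X_2$, where $X_1$ is the unit price of the competitor's product and $X_2$ is the advertising expenditure.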

Table 7: Regression Model for Weekly Sales and Price of Competitor’s Product at 5% Level of significance

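SUMMARY OUTPUT (Regression, 95% confidence level)

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.8778 |
| R Square | 0.7706 |
| Adjusted R Square | 0.6558 |
| Standard Error | 1.8374 |
| Observations | 7 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 2.0000 | 45.3528 | 22.6764 | 6.7168 | 0.0526 |
| Residual | 4.0000 | 13.5043 | 3.3761 | | |
| Total | 6.0000 | 58.8571 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 3.5976 | 4.0522 | 0.8878 | 0.4248 | -7.6532 | 14.8484 |
| Price | 41.3200 | 13.3374 | 3.0981 | 0.0363 | 4.2896 | 78.3505 |
| Advertising | 0.0132 | 0.3276 | 0.0404 | 0.9697 | -0.8963 | 0.9228 |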

Table 8 shows the regression output at the 10% level of significance. The overall model is significant at the 10% level (F = 6.72, p = 0.0526), although it falls just short of significance at the 5% level (Montgomery, Peck, and Vining, 2012).

Table 8: Regression Model for Weekly Sales and Price of Competitor’s Product at 10% Level of significance

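SUMMARY OUTPUT (Regression, 90% confidence level)

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.8778 |
| R Square | 0.7706 |
| Adjusted R Square | 0.6558 |
| Standard Error | 1.8374 |
| Observations | 7 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 2.0000 | 45.3528 | 22.6764 | 6.7168 | 0.0526 |
| Residual | 4.0000 | 13.5043 | 3.3761 | | |
| Total | 6.0000 | 58.8571 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 90.0% | Upper 90.0% |
|---|---|---|---|---|---|---|
| Intercept | 3.5976 | 4.0522 | 0.8878 | 0.4248 | -5.0411 | 12.2364 |
| Price | 41.3200 | 13.3374 | 3.0981 | 0.0363 | 12.8868 | 69.7532 |
| Advertising | 0.0132 | 0.3276 | 0.0404 | 0.9697 | -0.6851 | 0.7116 |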

The individual relationships of the independent variables reveal that advertising expenditure (X2) is not significantly related to sales of the product (Y) (t = 0.04, p = 0.97), whereas the unit price of the competitor's product (X1) is significantly related to sales of the product (Y) (t = 3.10, p = 0.036 < 0.1) (Cox and Wermuth, 2014).
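The t statistics and 90% limits in Table 8 follow from the coefficients and their standard errors. A minimal sketch is shown below; the critical value 2.13185 (two-sided 90% level, 4 residual degrees of freedom) is an assumption taken from standard t tables and does not appear in the Excel output, and the helper name is hypothetical.

```python
# Two-sided 90% critical value of Student's t with 4 degrees of freedom.
# Assumption: taken from a standard t table, not from the Excel output.
T_CRIT_90_DF4 = 2.13185

# (coefficient, standard error) pairs copied from Table 8.
coefficients = {
    "Intercept": (3.5976, 4.0522),
    "Price": (41.3200, 13.3374),
    "Advertising": (0.0132, 0.3276),
}


def t_and_interval(coef, se, t_crit):
    """Return the t statistic and the lower/upper confidence limits."""
    t_stat = coef / se
    half_width = t_crit * se
    return t_stat, coef - half_width, coef + half_width


results = {
    name: t_and_interval(coef, se, T_CRIT_90_DF4)
    for name, (coef, se) in coefficients.items()
}

for name, (t_stat, lower, upper) in results.items():
    print("{:<12} t = {:7.4f}  90% CI = [{:8.4f}, {:8.4f}]".format(name, t_stat, lower, upper))
```

An interval that contains zero, as the one for advertising does, is consistent with a coefficient that is not significant at the corresponding level.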

1. Table 9 shows the regression output at the 10% level of significance with the unit price of the competitor's product (X1) as the only independent variable (Vittinghoff et al., 2011). The new regression equation is $\hat{Y} = 3.5818 + 41.6031X_1$.

Table 9: Regression Model for Weekly Sales and Price of Competitor’s Product at 10% Level of significance with Single Factor

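SUMMARY OUTPUT (Regression, 90% confidence level)

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.8778 |
| R Square | 0.7705 |
| Adjusted R Square | 0.7246 |
| Standard Error | 1.6438 |
| Observations | 7 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1.0000 | 45.3473 | 45.3473 | 16.7831 | 0.0094 |
| Residual | 5.0000 | 13.5098 | 2.7020 | | |
| Total | 6.0000 | 58.8571 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 90.0% | Upper 90.0% |
|---|---|---|---|---|---|---|
| Intercept | 3.5818 | 3.6082 | 0.9927 | 0.3664 | -3.6889 | 10.8525 |
| Price | 41.6031 | 10.1552 | 4.0967 | 0.0094 | 21.1398 | 62.0663 |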

The slope of the independent variable, the unit price of the competitor's product (X1), is 41.60 (t = 4.10, p = 0.0094 < 0.05), which indicates a significant positive relationship with the sales of the product (Y). Hence, from the regression equation, the sales of the product (Y) are predicted to increase by 41.60 units for each one-unit increase in the price of the competitor's product (Burns, Bush, and Sinha, 2014).
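The effect of dropping advertising from the model can be seen by recomputing R Square, Adjusted R Square, and F from the sums of squares in Tables 8 and 9. The sketch below uses a hypothetical helper, `model_summary`, and only the SS values, the number of observations, and the number of predictors from the tables.

```python
def model_summary(ss_regression, ss_residual, n, p):
    """Return (R squared, adjusted R squared, F) for a model with p predictors."""
    ss_total = ss_regression + ss_residual
    df_residual = n - p - 1
    r_squared = ss_regression / ss_total
    # Adjusted R squared compares mean squares, so it penalises extra predictors.
    adjusted = 1 - (ss_residual / df_residual) / (ss_total / (n - 1))
    f_stat = (ss_regression / p) / (ss_residual / df_residual)
    return r_squared, adjusted, f_stat


# Sums of squares copied from Table 8 (price and advertising) and Table 9 (price only).
two_predictors = model_summary(45.3528, 13.5043, n=7, p=2)
one_predictor = model_summary(45.3473, 13.5098, n=7, p=1)

for label, (r2, adj, f_stat) in [("Price + Advertising", two_predictors),
                                 ("Price only", one_predictor)]:
    print("{:<20} R2 = {:.4f}  adj R2 = {:.4f}  F = {:.4f}".format(label, r2, adj, f_stat))
```

R Square barely changes between the two models, while Adjusted R Square and F both rise once the non-significant advertising term is removed.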

References

Black, K., 2009. Business statistics: Contemporary decision making. John Wiley & Sons.

Black, K., 2011. Business statistics: for contemporary decision making. John Wiley & Sons.

Burns, A.C., Bush, R.F. and Sinha, N., 2014. Marketing research (Vol. 7). Harlow: Pearson.

Cox, D.R. and Wermuth, N., 2014. Multivariate dependencies: Models, analysis and interpretation. Chapman and Hall/CRC.

Montgomery, D.C., Peck, E.A. and Vining, G.G., 2012. Introduction to linear regression analysis (Vol. 821). John Wiley & Sons.

Triola, M.F., 2013. Elementary statistics using Excel. Pearson.

Vittinghoff, E., Glidden, D.V., Shiboski, S.C. and McCulloch, C.E., 2011. Regression methods in biostatistics: linear, logistic, survival, and repeated measures models. Springer Science & Business Media.