Management Report On Data Analysis For Business Decision Making

  1. Inferential Statistics
  2. i.  Normal distribution:

Sample data are normally distributed when their distribution forms a bell-shaped curve that is symmetrical and centered about its mean. The spread of the data is described by the standard deviation (D’Agostino, 2017). Notable properties of the normal distribution are that the mean, median, and mode are equal and that the total area under the curve equals 1. Data with a mean of 0 and a standard deviation of 1 are said to have a standard normal distribution. Given a set of normally distributed data, we can calculate the mean and standard deviation and then standardize a value by converting it into a z-score using the formula $z = \frac{x - \mu}{\sigma}$, where $x$ is the observed value, µ is the mean, and σ is the standard deviation (D’Agostino, 2017). The z-score shows how many standard deviations the value lies above or below the mean.
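
The standardization step can be illustrated with a short Python sketch. The data values below are hypothetical and are not taken from the survey; the function name `z_scores` is also an assumption made for illustration.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize each observation: z = (x - mean) / standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(x - m) / s for x in values]

# Hypothetical stay durations in hours, for illustration only.
hours = [5.0, 6.0, 7.0, 8.0, 9.0]
standardized = z_scores(hours)
```

After standardization the values have a mean of 0 and a standard deviation of 1, which is what allows them to be compared against the standard normal table.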

  1. Sampling Distributions:

A sampling distribution is the probability distribution of a statistic derived from a random sample of data. The purpose of sampling distributions is to support statistical inference from sample data. The sampling distribution of any statistic depends on the distribution of the population from which the sample was drawn, the sampling technique, and the sample size used (Lowry, 2014). For instance, samples of size n drawn at random from a normal population with mean µ and standard deviation σ yield a normal sampling distribution of the sample mean, centered on µ with standard deviation σ/√n. In inferential statistics, sampling distributions are important because they describe how sample statistics behave when they are used to estimate population parameters, which is what makes inferences about the entire population possible.
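
A small simulation can make the idea of a sampling distribution concrete. This is a minimal sketch under stated assumptions: the population parameters (mean 6.85, standard deviation 2.0), the number of repetitions, and the function name `simulate_sample_means` are hypothetical choices for illustration, not values from the survey.

```python
import random
from math import sqrt
from statistics import mean, stdev

def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples random samples of size n from a normal population
    and return the mean of each sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        mean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]

# Hypothetical population: mean 6.85 hours, standard deviation 2.0 hours.
sample_means = simulate_sample_means(mu=6.85, sigma=2.0, n=50, num_samples=2000)

centre = mean(sample_means)         # should be close to the population mean
spread = stdev(sample_means)        # should be close to sigma / sqrt(n)
theoretical_spread = 2.0 / sqrt(50)
```

The simulated sample means cluster around the population mean, and their spread is close to σ/√n, which is the standard error used later in the test statistic.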

  1. Inferential Statistics:

Researchers use samples to estimate the characteristics of a population. Inferential statistics use the data from a random sample taken from a population to describe and make predictions about that population (Lowry, 2014). Inferential statistics are important because they help researchers reach a conclusion about a population based on a random sample of data. The characteristics of a population can be estimated from a sample because inferential statistics allow sample data to be summarized in a useful and informative manner.

  1. i. The Continuous Random Variable:

For this report, the random variable under consideration is “Average time spent by guest in accommodation”, which was measured in hours. The variable was obtained from sample data of 50 accommodation providers who responded to a survey on the guests staying on their premises on an October mid-week night.

  1. Hypothesis Testing and Confidence Intervals:

A hypothesis is an assumption about a certain parameter of a random variable, which is tested to determine the relationship and behavior of the dataset (Montgomery, Runger, and Hubele, 2009). Hypothesis testing is the statistical procedure used to determine whether the sample data support the assumed hypothesis about the entire population. The hypothesis is based on the results of a survey and is intended to provide meaningful inferences from the results. Hypothesis testing is vital in inferential statistics because the results obtained are the basis of the inferences made about the overall population from the sample data.

The common tests used to find the test statistic for a given set of data are the t-test and the z-test. The t-test is based on Student's t-distribution and is preferred for samples of size below 30 whose population standard deviation is unknown (Schlaifer, 1961). The z-test is based on the standard normal distribution and is used for data with a large sample size (n > 30) whose population standard deviation is known.

From the sample data, a confidence interval is calculated to provide the range of values that is likely to contain the unknown value of the population parameter (Schlaifer, 1961). For instance, as a measure of central tendency we obtained the sample mean of the average time spent in accommodation from our sample data.

We can obtain a confidence interval from the sample mean, which gives the range of values in which the population mean is likely to fall. Different samples can be drawn from the same population, and these samples will generally produce different intervals for the mean (Lehmann and Romano, 2006). The confidence level gives the proportion of such intervals that contain the real population mean. For instance, if we draw 40 random samples from the same population and build a 95% confidence interval from each, we would expect about 38 of the 40 intervals to contain the value of the population mean.
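
The interval calculation can be sketched in Python as follows. This is a minimal sketch that assumes the large-sample, normal-based interval (sample mean plus or minus the critical value times the standard error). The data are simulated stand-ins, not the survey data, and the name `mean_confidence_interval` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(values, confidence=0.95):
    """Large-sample (normal-based) confidence interval for the population mean."""
    n = len(values)
    centre = mean(values)
    standard_error = stdev(values) / sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return centre - z * standard_error, centre + z * standard_error

# Hypothetical data: 50 simulated stay durations in hours (not the survey data).
rng = random.Random(1)
hours = [rng.gauss(6.85, 2.0) for _ in range(50)]
low, high = mean_confidence_interval(hours)

# Expected number of 95% intervals, out of 40, that cover the true mean.
expected_covering = round(40 * 0.95)
```

The interval is symmetric about the sample mean, and the last line reproduces the arithmetic behind the figure of 38 intervals out of 40.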

  1. ii. Hypothesis Test:

For our continuous random variable, we assume that the average time spent by a guest in the accommodation is 6.85 hours for a mid-week night of any month. To test this claim, we set up the null and alternative hypotheses as:

H0: µ = 6.85 hours

H1: µ ≠ 6.85 hours

For this two-tailed test, we set the significance level at α = 5%. Based on this significance level, the rejection region is bounded by the critical values:

 $Z_{\alpha/2} = Z_{0.025} = \pm 1.96$ (obtained from the standard normal distribution table)
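
Table lookups of this kind can be cross-checked numerically. The sketch below uses Python's standard library to compute standard normal critical values; the function names are hypothetical. For α = 0.05 it shows the difference between the two-tailed value (the 97.5th percentile) and the one-tailed value (the 95th percentile).

```python
from statistics import NormalDist

def two_tailed_critical_value(alpha):
    """Critical value z such that P(|Z| > z) = alpha for a standard normal Z."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def one_tailed_critical_value(alpha):
    """Critical value z such that P(Z > z) = alpha for a standard normal Z."""
    return NormalDist().inv_cdf(1 - alpha)

z_two_tailed = two_tailed_critical_value(0.05)  # approximately 1.96
z_one_tailed = one_tailed_critical_value(0.05)  # approximately 1.645
```

A two-tailed test splits α equally between both tails, so its critical value is larger than the one-tailed value at the same significance level.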

The test statistic is calculated using the sample data, under the assumption that the null hypothesis is true. Our sample size is n = 50, which is greater than 30 and is therefore considered large. We therefore use the Z-test to calculate the value of the test statistic. The formula used is (Lehmann and Romano, 2006):

$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = 1.5326$$

Where x̄ is the sample mean calculated from the sample data in Excel,

µ is the population mean stated in the null hypothesis,

σ is the sample standard deviation calculated from the sample data in Excel, and

n is the size of the sample.
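
The same calculation can be reproduced outside Excel. The following is a minimal sketch of the test statistic, assuming the standard large-sample form (sample mean minus hypothesized mean, divided by the standard error). The data are hypothetical placeholders rather than the survey responses, so the result will not match 1.5326; the name `z_statistic` is also an assumption.

```python
from math import sqrt
from statistics import mean, stdev

def z_statistic(values, mu0):
    """Z = (sample mean - hypothesized mean) / (sample std dev / sqrt(n))."""
    n = len(values)
    standard_error = stdev(values) / sqrt(n)
    return (mean(values) - mu0) / standard_error

# Hypothetical sample of 50 stay durations in hours (not the survey data).
hours = [5.0, 6.0, 7.0, 8.0, 9.0] * 10
z = z_statistic(hours, mu0=6.85)
```

Substituting the 50 surveyed values for `hours` would give the statistic reported in this report, provided the same sample standard deviation (n − 1 denominator) is used.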

The decision rule is to reject the null hypothesis if the absolute value of the test statistic exceeds the critical value, and to fail to reject it otherwise. For a two-tailed test at α = 5% the critical value is $Z_{0.025} = 1.96$, and here Z = 1.5326 < 1.96. Therefore, the test statistic falls in the non-rejection region.

In a nutshell, since the test statistic of 1.5326 is smaller than the critical value of 1.96 at the 5% significance level, there is not enough statistical evidence to reject the null hypothesis. The sample data are consistent with the claim that the average time spent in accommodation is 6.85 hours for a mid-week night of any month.
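
The decision step can also be expressed as a short Python sketch that takes the reported test statistic as input. The function name `two_tailed_z_decision` is hypothetical. The p-value is added here as a complementary view of the same decision and is not part of the original calculation.

```python
from statistics import NormalDist

def two_tailed_z_decision(z, alpha=0.05):
    """Return (reject_null, p_value) for a two-tailed Z-test."""
    standard_normal = NormalDist()
    critical = standard_normal.inv_cdf(1 - alpha / 2)
    # Probability of a statistic at least this extreme in either tail.
    p_value = 2 * (1 - standard_normal.cdf(abs(z)))
    return abs(z) > critical, p_value

# Test statistic reported in this report.
reject, p_value = two_tailed_z_decision(1.5326)
# reject is False; p_value is roughly 0.125, which is above alpha = 0.05.
```

Comparing the p-value with α leads to the same decision as comparing the test statistic with the critical value: the null hypothesis is not rejected.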

References

D’Agostino, R.B., 2017. Tests for the normal distribution. In Goodness-of-fit-techniques (pp. 367-420). Routledge.

Lehmann, E.L. and Romano, J.P., 2006. Testing statistical hypotheses. Springer Science & Business Media.

Lowry, R., 2014. Concepts and applications of inferential statistics.

Montgomery, D.C., Runger, G.C. and Hubele, N.F., 2009. Engineering statistics. John Wiley & Sons.

Schlaifer, R., 1961. Introduction to statistics for business decisions.