Frequency Analysis Of Smartphone Entertainment Types And Statistical Analysis Of Income Distribution

December 21, 2023

Random Number Selection Process

The student ID number selected for this assignment is MIT17122. Following the provided guideline for selecting the random numbers, the last three digits of the student ID, 122, are used. Consequently, the random number selection starts from row 22 and column 1. As the random numbers are provided in blocks of six digits, each block provides two random numbers (Hamman et al., 2016): the first three and the last three digits of each block form two distinct three-digit random numbers. In the corresponding Excel sheet, the first column identifies each selected random number and the second column gives its value. For instance, the first selected random number is 937 (Chatterjee & Hadi, 2015). The third column states whether the selected random number is “Good” or “Not-Good”. “Good” means the number can be used as a sample number (Wilson, Bhatnagar & Townsend, 2017); “Not-Good” means it has to be rejected. Random numbers from 001 to 300 are accepted; any other value, including 000, is rejected.
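The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `split_block` and `classify` are hypothetical, and the example blocks are invented apart from the leading 937 mentioned in the text.

```python
def split_block(block: str) -> list[int]:
    """Split one six-digit block into its first and last three-digit numbers."""
    if len(block) != 6 or not block.isdigit():
        raise ValueError(f"expected a six-digit block, got {block!r}")
    return [int(block[:3]), int(block[3:])]


def classify(number: int, population_size: int = 300) -> str:
    """Label a number "Good" if it lies in 001..population_size, else "Not-Good"."""
    return "Good" if 1 <= number <= population_size else "Not-Good"


# Hypothetical blocks for illustration; only the leading 937 comes from the text.
blocks = ["937122", "000301", "045300"]

for block in blocks:
    for number in split_block(block):
        print(f"{number:03d}", classify(number))
```

Repeated numbers are not filtered out in this sketch, since the income tables later in the report list some CN values more than once.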

The selected samples are laid out in the file named “SampleSmartPhoneData”, which contains 50 samples drawn from the provided list of 300.

Frequency Analysis of Smartphone Entertainment Types

As requested, a frequency column chart and a relative frequency pie chart have been constructed to depict the number and the proportion of each entertainment type (Wun et al., 2016).

According to the frequency column chart, 21 of the samples list entertainment in the form of Music, Videos and Movies (Weaver et al., 2018).

It is evident from the frequency column chart that music, videos and movies are the most commonly downloaded form of entertainment.

eBooks account for a proportion of 0.18 of the entertainment types in the sample.
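The frequency and relative frequency figures quoted above can be reproduced with a short Python sketch. Only two counts are taken from the text: 21 for Music, Videos and Movies, and 9 for eBooks, the latter assuming the 0.18 proportion is taken over all 50 samples. The remaining 20 samples are lumped into a placeholder bucket because the text does not itemise them.

```python
from collections import Counter

# 21 and 9 follow from the text (0.18 * 50 = 9); the "Other (placeholder)"
# bucket is an assumption standing in for categories the text does not itemise.
sample = (
    ["Music, Videos and Movies"] * 21
    + ["eBooks"] * 9
    + ["Other (placeholder)"] * 20
)

frequency = Counter(sample)
total = sum(frequency.values())
relative_frequency = {category: count / total for category, count in frequency.items()}

for category, count in frequency.most_common():
    print(f"{category}: {count} ({relative_frequency[category]:.2f})")
```

The `frequency` values correspond to the heights of the column chart, and the `relative_frequency` values to the slices of the pie chart.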

Percentiles and Quartiles of Income Distribution

The table below lists the incomes in descending order. The corresponding CN numbers are also included for convenience.

CN	V1	Rank
60	$250,000	1.5
193	$250,000	1.5
140	$180,000	3.5
225	$180,000	3.5
72	$160,000	5
113	$155,000	6
114	$102,983	7
300	$101,262	8
137	$100,267	9
243	$100,200	10
252	$99,742	11
237	$99,398	14.5
242	$99,398	14.5
223	$99,398	14.5
248	$99,398	14.5
57	$99,398	14.5
165	$99,398	14.5
273	$99,374	18
46	$99,336	19
202	$98,955	20
241	$98,678	21
180	$98,673	22.5
205	$98,673	22.5
102	$98,645	24
249	$98,191	25
146	$97,756	26
277	$97,338	27.5
277	$97,338	27.5
134	$97,000	29
293	$96,286	30
98	$95,957	31
88	$95,931	32
49	$95,877	33
131	$95,297	34
62	$95,000	35
153	$93,250	36
153	$93,250	37
91	$90,164	38
221	$90,025	39
169	$88,887	40
123	$72,000	41
268	$70,000	43
176	$70,000	43
176	$70,000	43
2	$62,500	45.5
251	$62,500	45.5
4	$55,000	47
234	$45,000	48
234	$45,000	49
25	$40,000	50

The formula to determine the location of a percentile, that is, the position in the ranked data from which the value of the percentile is read, is as follows:

L = (n + 1) × P / 100; where n is the total number of observations and P is the desired percentile.

Here, the desired percentile is the 70th, so P = 70. Substituting P = 70 and n = 50, the location is found to be:

L = (50 + 1) × 70 / 100 = 35.7

It can be written as IR + FR = 35 + 0.7 = 35.7, where IR is the integer part and FR is the fractional part of the location.

The value with rank 35 is $95,000 and the value with rank 36 is $93,250. To determine the value corresponding to the 70th percentile, the formula used is:

0.7 × (95,000 - 93,250) + 93,250

= 0.7 × 1,750 + 93,250

= 1,225 + 93,250

= $94,475
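The two steps of the 70th percentile calculation, locating the rank and then interpolating between the two neighbouring values, can be sketched as below. The function names are hypothetical, and the interpolation simply mirrors the expression used in this report (fraction × difference + lower value) on incomes ranked from highest to lowest; it is not offered as a general-purpose percentile routine.

```python
import math


def percentile_location(n: int, p: float) -> float:
    """Location of the p-th percentile among n ranked observations: (n + 1) * p / 100."""
    return (n + 1) * p / 100


def split_location(location: float) -> tuple[int, float]:
    """Split a location into its integer part (IR) and fractional part (FR)."""
    integer_part = math.floor(location)
    return integer_part, location - integer_part


def interpolate(fraction: float, upper_value: float, lower_value: float) -> float:
    """The report's interpolation: fraction * (upper - lower) + lower."""
    return fraction * (upper_value - lower_value) + lower_value


location = percentile_location(50, 70)
ir, fr = split_location(location)
# Ranks 35 and 36 in the descending table hold $95,000 and $93,250.
value = interpolate(fr, 95000, 93250)

print(round(location, 2), ir, round(fr, 2), round(value, 2))
```

Rounding is applied only when printing, because the fractional part carries ordinary floating-point noise.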

CN	V1	Rank
60	$250,000	1
193	$250,000	1
140	$180,000	2
225	$180,000	2
72	$160,000	3
113	$155,000	4
114	$102,983	5
300	$101,262	6
137	$100,267	7
243	$100,200	8
252	$99,742	9
237	$99,398	10
242	$99,398	10
223	$99,398	10
248	$99,398	10
57	$99,398	10
165	$99,398	10
273	$99,374	11
46	$99,336	12
202	$98,955	13
241	$98,678	14
180	$98,673	15
205	$98,673	15
102	$98,645	16
249	$98,191	17
146	$97,756	18
277	$97,338	19
277	$97,338	19
134	$97,000	20
293	$96,286	21
98	$95,957	22
88	$95,931	23
49	$95,877	24
131	$95,297	25
62	$95,000	26
153	$93,250	27
153	$93,250	28
91	$90,164	29
221	$90,025	30
169	$88,887	31
123	$72,000	32
268	$70,000	33
176	$70,000	33
176	$70,000	33
2	$62,500	34
251	$62,500	34
4	$55,000	35
234	$45,000	36
234	$45,000	36
25	$40,000	37

The first and third quartiles represent the 25th and 75th percentiles. The calculations are carried out in a similar fashion. To determine the 25th percentile value, the location is:

L = (50 + 1) × 25 / 100 = 12.75

This can be expressed as IR + FR = 12 + 0.75 = 12.75

The value with rank 12 is $99,336 and the value with rank 13 is $98,955. To determine the value corresponding to the 25th percentile, the formula used is:

0.75 × (99,336 - 98,955) + 98,995

= 0.75 × 381 + 98,995

= 285.75 + 98,995

= $99,280.75

CN	V1	Rank
60	$250,000	1.5
193	$250,000	1.5
140	$180,000	3.5
225	$180,000	3.5
72	$160,000	5
113	$155,000	6
114	$102,983	7
300	$101,262	8
137	$100,267	9
243	$100,200	10
252	$99,742	11
237	$99,398	14.5
242	$99,398	14.5
223	$99,398	14.5
248	$99,398	14.5
57	$99,398	14.5
165	$99,398	14.5
273	$99,374	18
46	$99,336	19
202	$98,955	20
241	$98,678	21
180	$98,673	22.5
205	$98,673	22.5
102	$98,645	24
249	$98,191	25
146	$97,756	26
277	$97,338	27.5
277	$97,338	27.5
134	$97,000	29
293	$96,286	30
98	$95,957	31
88	$95,931	32
49	$95,877	33
131	$95,297	34
62	$95,000	35
153	$93,250	36
153	$93,250	37
91	$90,164	38
221	$90,025	39
169	$88,887	40
123	$72,000	41
268	$70,000	43
176	$70,000	43
176	$70,000	43
2	$62,500	45.5
251	$62,500	45.5
4	$55,000	47
234	$45,000	48
234	$45,000	49
25	$40,000	50

To find the 75th percentile, proceeding in a similar fashion, the location is:

L = (50 + 1) × 75 / 100 = 38.25

This can be expressed as IR + FR = 38 + 0.25 = 38.25

The value with rank 38 is $90,164 and the value with rank 39 is $90,025. To determine the value corresponding to the 75th percentile, the formula used is:

0.25 × (90,164 - 90,025) + 90,025

= 0.25 × 139 + 90,025

= 34.75 + 90,025

= $90,059.75

Before answering this specific question, it is important to clarify the idea of a percentile. Because the incomes here are ranked from highest to lowest, a percentile is read as the percentage of observations lying above a certain point. For instance, the 70th percentile is the value above which 70 percent of the observations lie. In this case, the value is found to be $94,475, which implies that 70 percent of the 50 selected samples have an annual income above $94,475.

The interquartile range is defined as the difference between the third quartile and the first quartile. Thus, the magnitude of the interquartile range in this case is:

|90,059.75 - 99,280.75| = $9,221

The interquartile range focuses on the variation within a data set. It describes the spread of the middle 50% of the values around the median. Thus, in this case the interquartile range is $9,221. This implies that the annual incomes of the middle 50% of the provided data are spread over a range of $9,221.

The following descriptive statistics table was constructed in Excel and pasted here.

Column1
Mean	101554.46
Standard Error	5815.206028
Median	97973.5
Mode	99398
Standard Deviation	41119.71617
Sample Variance	1690831058
Kurtosis	5.801565461
Skewness	2.114964151
Range	210000
Minimum	40000
Maximum	250000
Sum	5077723
Count	50

The upper and lower inner fences are calculated using the provided formulae.

Upper inner fence = 103,891.3

Lower inner fence = 85,449.25
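The two fence values above agree with the common 1.5 × IQR rule when it is applied to the quartile values as reported in this report. The formulae provided with the assignment are not reproduced in the text, so the pairing used below (third quartile plus 1.5 × IQR, first quartile minus 1.5 × IQR) is an assumption inferred from the figures.

```python
# Quartile values as reported in this report (incomes ranked from highest to lowest).
q1 = 99280.75  # value found for the 25th percentile
q3 = 90059.75  # value found for the 75th percentile

# Magnitude of the interquartile range.
iqr = abs(q3 - q1)

# Assumed 1.5 * IQR rule; it reproduces the two fence values quoted above.
upper_inner_fence = q3 + 1.5 * iqr
lower_inner_fence = q1 - 1.5 * iqr

print(iqr, upper_inner_fence, lower_inner_fence)
```

Observations outside the interval between the two fences would normally be examined as potential outliers.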

Measures of Central Tendency and Variation of Income Distribution

The measure of central tendency chosen is the mean, or average. Among the measures of central tendency, such as the median and the mode, the mean is generally regarded as the best measure, so it is chosen for this purpose. Also, since the interquartile range and the range show that the data are well spread, the median and the mode would not be the best choice. The mean is defined as:

mean = (x_1 + x_2 + ... + x_n) / n

The measure of dispersion chosen for this data set is the standard deviation (SD). The sample SD is the square root of the sum of the squared deviations from the mean divided by n - 1. Since the measure of central tendency is chosen as the mean, it is convenient from a practical perspective to use the standard deviation to measure the level of dispersion. SD is defined as:

SD = sqrt( ((x_1 - mean)^2 + (x_2 - mean)^2 + ... + (x_n - mean)^2) / (n - 1) )
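As a check on the definitions of the mean and the sample standard deviation, both can be computed directly from the 50 incomes in the ranked table. The sketch below transcribes the table as (income, count) pairs; the variable names are illustrative only.

```python
import math
import statistics

# (income, count) pairs transcribed from the ranked income table.
INCOME_COUNTS = [
    (250000, 2), (180000, 2), (160000, 1), (155000, 1), (102983, 1), (101262, 1),
    (100267, 1), (100200, 1), (99742, 1), (99398, 6), (99374, 1), (99336, 1),
    (98955, 1), (98678, 1), (98673, 2), (98645, 1), (98191, 1), (97756, 1),
    (97338, 2), (97000, 1), (96286, 1), (95957, 1), (95931, 1), (95877, 1),
    (95297, 1), (95000, 1), (93250, 2), (90164, 1), (90025, 1), (88887, 1),
    (72000, 1), (70000, 3), (62500, 2), (55000, 1), (45000, 2), (40000, 1),
]
incomes = [income for income, count in INCOME_COUNTS for _ in range(count)]

n = len(incomes)
mean = sum(incomes) / n
# Sample variance: squared deviations from the mean, divided by n - 1.
sample_variance = sum((x - mean) ** 2 for x in incomes) / (n - 1)
sd = math.sqrt(sample_variance)

print(n, round(mean, 2), round(sd, 2))
# Cross-check against the standard library's sample standard deviation.
print(math.isclose(sd, statistics.stdev(incomes)))
```

The results can be compared with the Mean and Standard Deviation rows of the descriptive statistics table.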

The V1 variable is the annual income of the samples under consideration. The mean, as mentioned above, is $101,554.46. This implies that the average annual income of the 50 samples is $101,554.46. This may not seem like a middle or central value, as the incomes range from $40,000 to $250,000.

The median, or middlemost value of the data set, is $97,973.5. This means that half of the observations have incomes above this value and the other half have incomes below it. The median also shows that the majority of the people have incomes in the vicinity of this value.

Quartiles are the values that divide the ordered data set into four groups. All of the quartile values have been calculated above.

The first quartile is the value above which 25% of the observations lie, and the third quartile is the value above which 75% of the observations lie. The median, which is the 50th percentile or second quartile, is the middle value of the data: 50% of the observations are above this value and the rest are below.

Measures of variation include the range, the sample SD and the sample variance. All the values were calculated in Excel and are reported above. The values are:

Standard Deviation	41119.71617
Sample Variance	1690831058
Range	210000

Clearly, the SD and the variance are very high: the SD is roughly 40% of the mean. The range also indicates the dispersion of the data set.

The three measures that help in recognizing whether the data follow a normal distribution are the mean, the median and the skewness. For a normal distribution, the mean, median and mode are all equal. That is not the case for this data set (Leamer, 2016): the mean is $101,554.46, the median is $97,973.5 and the mode is $99,398. The skewness, 2.11, is also high, whereas the skewness of a normal distribution is zero. Thus the data do not follow a normal distribution.
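The skewness figure can be checked against the income data with a short sketch. It assumes that the Skewness row of the Excel output uses the bias-adjusted sample skewness, n / ((n - 1)(n - 2)) × Σ((x - mean) / SD)^3, which is the formula behind Excel's SKEW function; the function name `sample_skewness` is hypothetical.

```python
import statistics

# (income, count) pairs transcribed from the ranked income table.
INCOME_COUNTS = [
    (250000, 2), (180000, 2), (160000, 1), (155000, 1), (102983, 1), (101262, 1),
    (100267, 1), (100200, 1), (99742, 1), (99398, 6), (99374, 1), (99336, 1),
    (98955, 1), (98678, 1), (98673, 2), (98645, 1), (98191, 1), (97756, 1),
    (97338, 2), (97000, 1), (96286, 1), (95957, 1), (95931, 1), (95877, 1),
    (95297, 1), (95000, 1), (93250, 2), (90164, 1), (90025, 1), (88887, 1),
    (72000, 1), (70000, 3), (62500, 2), (55000, 1), (45000, 2), (40000, 1),
]
incomes = [income for income, count in INCOME_COUNTS for _ in range(count)]


def sample_skewness(values):
    """Bias-adjusted sample skewness (assumed to match Excel's SKEW)."""
    n = len(values)
    mean = sum(values) / n
    sd = statistics.stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / sd) ** 3 for x in values)


skewness = sample_skewness(incomes)
mean = statistics.mean(incomes)
median = statistics.median(incomes)

# A mean above the median together with a large positive skewness
# points to a long right tail rather than a symmetric, normal shape.
print(round(skewness, 3), mean > median)
```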

The following table gives the z-score of each observation, from which the number of observations within the asked range is determined.

CN	V1	Z
25	$40,000	-1.49696
234	$45,000	-1.37536
234	$45,000	-1.37536
4	$55,000	-1.13217
2	$62,500	-0.94977
251	$62,500	-0.94977
268	$70,000	-0.76738
176	$70,000	-0.76738
176	$70,000	-0.76738
123	$72,000	-0.71874
169	$88,887	-0.30806
221	$90,025	-0.28039
91	$90,164	-0.27701
153	$93,250	-0.20196
153	$93,250	-0.20196
62	$95,000	-0.1594
131	$95,297	-0.15218
49	$95,877	-0.13807
88	$95,931	-0.13676
98	$95,957	-0.13613
293	$96,286	-0.12812
134	$97,000	-0.11076
277	$97,338	-0.10254
277	$97,338	-0.10254
146	$97,756	-0.09238
249	$98,191	-0.0818
102	$98,645	-0.07076
180	$98,673	-0.07007
205	$98,673	-0.07007
241	$98,678	-0.06995
202	$98,955	-0.06322
46	$99,336	-0.05395
273	$99,374	-0.05303
237	$99,398	-0.05244
242	$99,398	-0.05244
223	$99,398	-0.05244
248	$99,398	-0.05244
57	$99,398	-0.05244
165	$99,398	-0.05244
252	$99,742	-0.04408
243	$100,200	-0.03294
137	$100,267	-0.03131
300	$101,262	-0.00711
114	$102,983	0.034741
113	$155,000	1.299755
72	$160,000	1.421351
140	$180,000	1.907736
225	$180,000	1.907736
60	$250,000	3.610082
193	$250,000	3.610082

The z-scores bounding the region are -1.5 and 1.5. From the standard normal table, the area between 0 and 1.5 is 0.43319. Taking both sides together, 2 × 0.43319 = 0.86638, so 86.638% of the observations are expected to lie in this region (Wan et al., 2014). This means about 43 of the 50 observations (0.86638 × 50 ≈ 43.3) would lie within the specified region.
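The standard normal table lookup can be reproduced with Python's `statistics.NormalDist`. This is a minimal sketch of the same calculation; the variable names are illustrative.

```python
from statistics import NormalDist

z = 1.5
sample_size = 50

# Area under the standard normal curve between 0 and z (the table value).
one_sided_area = NormalDist().cdf(z) - 0.5
# By symmetry, the area between -z and z is twice the one-sided area.
two_sided_area = 2 * one_sided_area
expected_count = two_sided_area * sample_size

print(round(one_sided_area, 5), round(two_sided_area, 5), round(expected_count, 1))
```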

The following table marks each observation as TRUE if it falls within the specified region and FALSE otherwise.

CN	V1	TRUE/FALSE
25	$40,000	TRUE
234	$45,000	TRUE
234	$45,000	TRUE
4	$55,000	TRUE
2	$62,500	TRUE
251	$62,500	TRUE
268	$70,000	TRUE
176	$70,000	TRUE
176	$70,000	TRUE
123	$72,000	TRUE
169	$88,887	TRUE
221	$90,025	TRUE
91	$90,164	TRUE
153	$93,250	TRUE
153	$93,250	TRUE
62	$95,000	TRUE
131	$95,297	TRUE
49	$95,877	TRUE
88	$95,931	TRUE
98	$95,957	TRUE
293	$96,286	TRUE
134	$97,000	TRUE
277	$97,338	TRUE
277	$97,338	TRUE
146	$97,756	TRUE
249	$98,191	TRUE
102	$98,645	TRUE
180	$98,673	TRUE
205	$98,673	TRUE
241	$98,678	TRUE
202	$98,955	TRUE
46	$99,336	TRUE
273	$99,374	TRUE
237	$99,398	TRUE
242	$99,398	TRUE
223	$99,398	TRUE
248	$99,398	TRUE
57	$99,398	TRUE
165	$99,398	TRUE
252	$99,742	TRUE
243	$100,200	TRUE
137	$100,267	TRUE
300	$101,262	TRUE
114	$102,983	TRUE
113	$155,000	TRUE
72	$160,000	TRUE
140	$180,000	FALSE
225	$180,000	FALSE
60	$250,000	FALSE
193	$250,000	FALSE

It is evident that 46 of the observations fall in the given region.
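The count of observations inside the region can be verified by computing each z-score from the income data and testing whether it lies between -1.5 and 1.5. The sketch below assumes the TRUE/FALSE column was produced by exactly this test, which is consistent with the z-scores tabulated earlier.

```python
import statistics

# (income, count) pairs transcribed from the ranked income table.
INCOME_COUNTS = [
    (250000, 2), (180000, 2), (160000, 1), (155000, 1), (102983, 1), (101262, 1),
    (100267, 1), (100200, 1), (99742, 1), (99398, 6), (99374, 1), (99336, 1),
    (98955, 1), (98678, 1), (98673, 2), (98645, 1), (98191, 1), (97756, 1),
    (97338, 2), (97000, 1), (96286, 1), (95957, 1), (95931, 1), (95877, 1),
    (95297, 1), (95000, 1), (93250, 2), (90164, 1), (90025, 1), (88887, 1),
    (72000, 1), (70000, 3), (62500, 2), (55000, 1), (45000, 2), (40000, 1),
]
incomes = [income for income, count in INCOME_COUNTS for _ in range(count)]

mean = statistics.mean(incomes)
sd = statistics.stdev(incomes)  # sample standard deviation

# z-score of every observation, then the TRUE/FALSE test for |z| <= 1.5.
z_scores = [(x - mean) / sd for x in incomes]
in_region = [abs(z) <= 1.5 for z in z_scores]

print(sum(in_region), "of", len(incomes), "observations lie within 1.5 SD of the mean")
```

The observed count is higher than the roughly 43 expected under a normal distribution, which is in line with the earlier finding that the data are not normally distributed.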

The regression equation has the form:

y = β0 + β1 × x

where y is the percentage of phone usage for work purposes and x is age.

The primary purpose of this model is to test whether there is a linear relation between age and the percentage of phone usage for work purposes (Abdullah, Doucouliagos & Manning, 2015). If there is a linear relation between the two, then the slope β1 will not be equal to zero.

Since the estimated value of β1 is 0.608, the variables are related in a positive linear manner.

The intercept coefficient provides the least squares estimate of β0.

The slope coefficient is the value corresponding to age in the Coefficients column (Kolaczyk & Csárdi, 2014). It is the least squares estimate of β1. Thus, as described above, it represents a positive linear relation.

The coefficient of determination is the R-squared value in the table, which is 0.126743917. It means that about 12.7% of the variation of the dependent variable around its mean is explained by the regressor, age.
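How the intercept, the slope and the coefficient of determination arise from a least squares fit can be sketched as follows. The age and usage figures below are hypothetical values invented for illustration, since the underlying regression data are not reproduced in this report; the function name `fit_line` is likewise an assumption.

```python
def fit_line(xs, ys):
    """Simple least squares fit of y on x; returns (intercept, slope, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return intercept, slope, 1 - ss_res / ss_tot


# Hypothetical data: ages and percentage of phone usage for work purposes.
ages = [22, 30, 41, 55, 63]
work_usage = [20.0, 35.0, 30.0, 50.0, 45.0]

intercept, slope, r_squared = fit_line(ages, work_usage)
print(round(intercept, 3), round(slope, 3), round(r_squared, 3))
```

A positive `slope` corresponds to the positive linear relation discussed above, and `r_squared` is the share of the variation in the dependent variable explained by age.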

References

Hamman, R. D., Kennedy III, W. C., Rump, W. J., & Irwin, K. E. (2016). U.S. Patent Application No. 15/152,009.

Wun, T., Payne, J., Huron, S., & Carpendale, S. (2016, June). Comparing bar chart authoring with Microsoft Excel and tangible tiles. In Computer Graphics Forum (Vol. 35, No. 3, pp. 111-120).

Weaver, K. F., Morales, V., Dunn, S. L., Godde, K., & Weaver, P. F. (2018). Showing Your Data. An Introduction to Statistical Analysis in Research: With Applications in the Biological and Life Sciences, First, 61-190.

Wan, X., Wang, W., Liu, J., & Tong, T. (2014). Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range. BMC medical research methodology, 14(1), 135.

Kolaczyk, E. D., & Csárdi, G. (2014). Statistical analysis of network data with R (Vol. 65). New York: Springer.

Leamer, E. E. (2016). S-values: Conventional context-minimal measures of the sturdiness of regression coefficients. Journal of Econometrics, 193(1), 147-161.

Draper, N. R., & Smith, H. (2014). Applied regression analysis. John Wiley & Sons.

Harrell Jr, F. E. (2015). Regression modeling strategies: with applications to linear models, logistic and ordinal regression, and survival analysis. Springer.

Marcolini, G., Bellin, A., & Chiogna, G. (2017). Performance of the Standard Normal Homogeneity Test for the homogenization of mean seasonal snow depth time series. International Journal of Climatology.

Mowery, D. C., Nelson, R. R., Sampat, B. N., & Ziedonis, A. A. (2015). Ivory tower and industrial innovation: University-industry technology transfer before and after the Bayh-Dole Act. Stanford University Press.

Chatterjee, S., & Hadi, A. S. (2015). Regression analysis by example. John Wiley & Sons.

Wilson, L., Bhatnagar, P., & Townsend, N. (2017). Comparing trends in mortality from cardiovascular disease and cancer in the United Kingdom, 1983–2013: joinpoint regression analysis. Population health metrics, 15(1), 23.

Abdullah, A., Doucouliagos, H., & Manning, E. (2015). Does education reduce income inequality? A meta-regression analysis. Journal of Economic Surveys, 29(2), 301-316.
