University of The Cumberlands Descriptive Statistics Problems Paper

Q1: Using an example, discuss the role of decision trees in Big Data analytics.

Q2: Given the yearly sales data in the yearly_sales.csv file, complete the following:

Show all the descriptive statistics of sales_total, including its standard deviation and variance.

Compute the correlation between number_of_order and sales_total.

Plot the scatter graph of number_of_order to sales_total.

Perform linear regression of number_of_order to sales_total.

Draw the line of best fit (abline) over your graph.

Perform the t-test as shown below and state your conclusion.

Perform the ANOVA test as shown below and state your conclusion.
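The first five tasks above can be sketched in base R as follows. This is only an illustrative outline: the small data frame is a made-up stand-in so the code runs without the real file, and the column names number_of_order, sales_total and gender are assumed to match those in yearly_sales.csv. Treating sales_total as the response in the regression is also an assumption.

```r
# Hypothetical stand-in for yearly_sales.csv so the sketch runs on its own.
# With the real file, use instead: myData <- read.csv("yearly_sales.csv")
myData <- data.frame(
  number_of_order = c(3, 5, 2, 8, 6, 4, 10, 7, 1, 9),
  sales_total     = c(120.5, 210.0, 95.2, 340.8, 255.1,
                      160.3, 420.9, 300.4, 48.7, 388.0),
  gender          = c("F", "M", "F", "M", "F", "M", "F", "M", "F", "M")
)

# Descriptive statistics (min, quartiles, median, mean, max)
summary(myData$sales_total)

# Standard deviation and variance are not part of summary(), so compute them separately
sales_sd  <- sd(myData$sales_total)
sales_var <- var(myData$sales_total)

# Pearson correlation between the two variables
r <- cor(myData$number_of_order, myData$sales_total)

# Scatter plot with number_of_order on the x-axis
plot(myData$number_of_order, myData$sales_total,
     xlab = "number_of_order", ylab = "sales_total",
     main = "sales_total vs number_of_order")

# Linear regression (assumes sales_total is the response)
fit <- lm(sales_total ~ number_of_order, data = myData)
summary(fit)

# Line of best fit drawn over the existing scatter plot
abline(fit, col = "red")
```

Note that abline() adds to the most recent plot, so it has to be called after plot() and with a model whose predictor matches the x-axis variable.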

T test

This tests the mean of a single group; here that group is sales_total.

t.test(sales_total, mu = 249) # R command for t test

H0: mu = 249 # null hypothesis

H1: mu ≠ 249 # alternative hypothesis

Significance level (alpha) = 0.05

Reject H0 if p-value is <= 0.05

Do not reject H0 if p-value is > 0.05
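As a rough illustration of how the one-sample t-test result maps to a conclusion, the sketch below applies the standard rule: reject H0 when the p-value is at or below the significance level, otherwise do not reject it. The sales_total vector is hypothetical sample data, not values from yearly_sales.csv, and the names alpha, result and decision are chosen only for this example.

```r
# Hypothetical sample values; with the real data use myData$sales_total
sales_total <- c(120.5, 210.0, 95.2, 340.8, 255.1,
                 160.3, 420.9, 300.4, 48.7, 388.0)

alpha <- 0.05  # significance level

# Two-sided one-sample t-test of H0: mu = 249
result <- t.test(sales_total, mu = 249)
print(result)

# Standard decision rule: a small p-value is evidence against H0
if (result$p.value <= alpha) {
  decision <- "Reject H0"
} else {
  decision <- "Do not reject H0"
}
print(decision)
```

For this made-up sample the mean is close to 249, so the p-value is well above 0.05 and H0 is not rejected; the real data may of course lead to a different conclusion.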

ANOVA test

ANOVA is used to test the equality of means across two or more groups; here we have two groups, Male and Female.

anova(lm(sales_total ~ factor(gender), data = myData)) # R command for ANOVA (base R)

H0: There is no significant difference between Male and Female mean sales_total.

H1: There is a significant difference between Male and Female mean sales_total.

Significance level (alpha) = 0.05

Reject H0 if p-value is <= 0.05

Do not reject H0 if p-value is > 0.05
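A minimal sketch of reading the ANOVA p-value and turning it into a conclusion is given below, using base R's anova() on a fitted lm model. The data frame is a hypothetical stand-in for yearly_sales.csv, and the names alpha, p_value and decision are illustrative only. Under the standard rule, H0 (equal group means) is rejected when the p-value is at or below the significance level.

```r
# Hypothetical stand-in data; with the real file use read.csv("yearly_sales.csv")
myData <- data.frame(
  sales_total = c(120.5, 210.0, 95.2, 340.8, 255.1,
                  160.3, 420.9, 300.4, 48.7, 388.0),
  gender      = c("F", "M", "F", "M", "F", "M", "F", "M", "F", "M")
)

alpha <- 0.05  # significance level

# One-way ANOVA of sales_total by gender
fit <- lm(sales_total ~ factor(gender), data = myData)
tab <- anova(fit)
print(tab)

# The p-value for the gender effect is in the first row of the "Pr(>F)" column
p_value <- tab[["Pr(>F)"]][1]

# Standard decision rule: reject H0 (equal means) when the p-value is small
decision <- if (p_value <= alpha) "Reject H0" else "Do not reject H0"
print(decision)
```

With only two groups, this F-test gives the same p-value as a two-sample t-test that assumes equal variances, that is, t.test(sales_total ~ gender, data = myData, var.equal = TRUE).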