To View the Datasets
library(help="datasets")
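As a small sketch of working with these built-in data sets, the names can also be pulled into a character vector and one data set inspected directly. The choice of `mtcars` here is only for illustration; any data set from the list would do.

```r
# Names of the data sets shipped with the 'datasets' package
available <- data(package = "datasets")$results[, "Item"]
head(available)

# Built-in data sets can be used by name; show the first rows of one of them
first_rows <- head(mtcars, 3)
first_rows
```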
t-test
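A t-test is a hypothesis test that checks whether two sample means differ reliably from each other. Called with two vectors, t.test() runs Welch's two-sample test by default, which does not assume equal variances.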
x <- c(1.2,3.4,1.3,-2.1,5.6,2.3,3.2,2.4,2.1,1.8,1.7,2.2)
y <- c(2.4,5.7,2.0,-3,13,5,6.2,4.8,4.2,3.5,3.7,5.2)
t.test(x,y)
##
## Welch Two Sample t-test
##
## data: x and y
## t = -1.9667, df = 15.943, p-value = 0.06688
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -4.7799367 0.1799367
## sample estimates:
## mean of x mean of y
## 2.091667 4.391667
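The p-value of 0.06688 is above the usual 0.05 threshold, so this sample does not give enough evidence to reject equal means. To make the numbers in the output less of a black box, here is a minimal sketch that recomputes the Welch statistic, degrees of freedom and p-value by hand. The variable names (`vx`, `vy`, `t_stat`, `df`, `p_value`) are chosen for this illustration only.

```r
x <- c(1.2, 3.4, 1.3, -2.1, 5.6, 2.3, 3.2, 2.4, 2.1, 1.8, 1.7, 2.2)
y <- c(2.4, 5.7, 2.0, -3, 13, 5, 6.2, 4.8, 4.2, 3.5, 3.7, 5.2)

# Squared standard error contributed by each sample
vx <- var(x) / length(x)
vy <- var(y) / length(y)

# Welch t statistic: difference in means divided by its standard error
t_stat <- (mean(x) - mean(y)) / sqrt(vx + vy)

# Welch-Satterthwaite approximation to the degrees of freedom
df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

c(t = t_stat, df = df, p.value = p_value)
```

The three values should agree with the t, df and p-value lines printed by t.test(x, y).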
Z-test
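A z-test checks whether a sample mean differs from a hypothesized value, here 0. The statistic below is computed from the first 100 observations in the file, using the sample standard deviation in place of the population one: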
x <- read.csv("/home/ashim888/Dropbox/Madan Bhandari/Stat Tutorial/Day 2/ztest.txt", header = FALSE)
x <- x[1:100,]
z <- sqrt(100) * (mean(x) - 0)/sd(x)
z
## [1] -0.2334861
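The z value on its own is hard to interpret, so a natural next step is to turn it into a p-value. The sketch below assumes a two-sided test and that z follows a standard normal distribution under the null hypothesis. It reuses the value printed above instead of the data file, which is not available here.

```r
# z statistic taken from the output above
z <- -0.2334861

# Two-sided p-value: probability of a value at least this extreme
# in either tail of the standard normal distribution
p_value <- 2 * pnorm(-abs(z))
p_value

# Compare against the conventional 5% significance level
reject_null <- p_value < 0.05
reject_null
```

With a z this close to 0, the p-value is far above 0.05, so the data give no reason to doubt that the mean is 0.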
Correlation
For more information on the function, open its help page:
?cor
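The correlation coefficient ranges from -1 to +1: values near -1 indicate a strong negative linear relationship, values near +1 a strong positive one, and values near 0 a weak or absent one. Let's compute it for our BullRiders dataset.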
bull <- read.csv("~/Desktop/edx/Foundations of Data Analysis/BullRiders.csv")
cor(bull$YearsPro,bull$BuckOuts)
## [1] -0.1670275
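A value of about -0.17 means YearsPro and BuckOuts are only weakly, and negatively, related. To see what the ends and the middle of the -1 to +1 scale look like, here is a small sketch on made-up vectors. The vectors `a` and `b` are hypothetical and have nothing to do with the BullRiders data.

```r
a <- c(1, 2, 3, 4, 5)
b <- c(2, 1, 3, 1, 2)   # arranged so that it has no linear trend in a

# Exact increasing linear function of a: correlation at the upper limit
r_pos <- cor(a, 2 * a + 1)

# Exact decreasing linear function of a: correlation at the lower limit
r_neg <- cor(a, -3 * a)

# No linear relationship: correlation is 0 up to rounding
r_none <- cor(a, b)

c(positive = r_pos, negative = r_neg, none = r_none)
```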
Next, I want the correlations among three variables at once. That takes two steps:
- create a character vector with the variable names
- compute the correlation matrix for those columns of the dataset
myVars<-c("YearsPro","Events","BuckOuts")
cor(bull[,myVars])
##            YearsPro     Events   BuckOuts
## YearsPro  1.0000000 -0.1597916 -0.1670275
## Events   -0.1597916  1.0000000  0.9803737
## BuckOuts -0.1670275  0.9803737  1.0000000
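In this matrix, Events and BuckOuts are almost perfectly correlated (0.98), while YearsPro is only weakly related to either. Since BullRiders.csv may not be at hand, the same two steps can be tried on a built-in data set. The sketch below uses `mtcars` and the columns `mpg`, `hp` and `wt` purely as stand-ins.

```r
# Step 1: character vector with the column names of interest
vars <- c("mpg", "hp", "wt")

# Step 2: correlation matrix for those columns
cor_mat <- cor(mtcars[, vars])
round(cor_mat, 2)

# Every variable correlates perfectly with itself, so the diagonal is 1
diag(cor_mat)

# cor(a, b) equals cor(b, a), so the matrix is symmetric
isSymmetric(unname(cor_mat))
```

Because the matrix is symmetric, only the entries above (or below) the diagonal need to be read.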