## Chapter 11 R examples: Simple Linear Regression

# Reading the data into a temporary file called "my.datafile":

my.datafile <- tempfile()
cat(file=my.datafile, "
1 1
2 1
3 2
4 2
5 4
", sep=" ")

# Name the data set and give the variables (columns) names:

stimulus <- read.table(my.datafile, header=FALSE, col.names=c("drug", "time"))

# Alternatively, could type:
# stimulus <- read.table("http://people.stat.sc.edu/hitchcock/reactiontime.txt", header=TRUE)

attach(stimulus)  # attaching the data frame

# A scatterplot of the data (in R, the X-variable goes first):

plot(drug, time)

# Finding the least squares estimates for the simple linear regression:
# Make sure to put the response variable first!

SLR.stimul <- lm(time ~ drug)
SLR.stimul

# A more detailed output:

summary(SLR.stimul)

# The estimated values of the y-intercept and slope are given under "Estimate".
# We see that beta_0-hat is -0.1 and beta_1-hat is 0.7,
# so the estimated SLR model can be written as
# Y-hat = -0.1 + 0.7 X.

# Now this model can be used for predicting reaction time
# for any particular drug amount.

# Suppose we want to do a test for model usefulness by testing H_0: beta_1 = 0.
# The test statistic for this test is given under "t value", in the "drug" row.
# We see that t = 3.656 (about 3.66) for this test. The p-value (for the TWO-TAILED
# alternative) is given in the next column over, labeled "Pr(>|t|)".
# We see the p-value is 0.0354. So if our significance level alpha is .05,
# we would reject H_0 and conclude the model is useful.

# Our estimate of sigma (which we call s) is given next to "Residual standard error".
# Note that in this example, s = 0.6055 (more precisely, 0.60553).
# This matches what we found in class.
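
# A minimal sketch (not part of the class example) of using the fitted model
# for prediction and of recomputing the summary() numbers by hand.
# Assumptions: the names "stim" and "fit" and the new drug amount 3.5 are
# hypothetical, chosen only for illustration. The data are retyped here so
# that this block runs on its own.

stim <- data.frame(drug = c(1, 2, 3, 4, 5), time = c(1, 1, 2, 2, 4))
fit <- lm(time ~ drug, data = stim)

# Predicted reaction time at a drug amount of 3.5,
# which by the fitted equation is -0.1 + 0.7 * 3.5 = 2.35:
pred <- predict(fit, newdata = data.frame(drug = 3.5))
pred

# s = sqrt(SSE / (n - 2)), the residual standard error:
s <- sqrt(sum(resid(fit)^2) / df.residual(fit))

# Standard error of the slope, s / sqrt(SS_xx):
se.b1 <- s / sqrt(sum((stim$drug - mean(stim$drug))^2))

# t statistic and two-tailed p-value for H_0: beta_1 = 0:
t.stat <- unname(coef(fit)["drug"]) / se.b1
p.val <- 2 * pt(abs(t.stat), df = df.residual(fit), lower.tail = FALSE)

c(s = s, t = t.stat, p = p.val)

# These values should agree with the "Residual standard error", "t value",
# and "Pr(>|t|)" entries printed by summary().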
# Plotting the least squares line on top of the scatterplot of points:

abline(SLR.stimul)

# Finding the correlation coefficient between drug amount and reaction time:

cor(drug, time)

###################################################################################
###################################################################################
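
# A short sketch (not part of the class example) connecting the correlation
# coefficient to the regression fit. Assumptions: the names "stim2" and "fit2"
# are hypothetical; the data are retyped so that this block runs on its own.

stim2 <- data.frame(drug = c(1, 2, 3, 4, 5), time = c(1, 1, 2, 2, 4))
fit2 <- lm(time ~ drug, data = stim2)

r <- cor(stim2$drug, stim2$time)
r

# In simple linear regression, r^2 equals the "Multiple R-squared" from summary():
c(r.squared = r^2, from.summary = summary(fit2)$r.squared)

# The slope equals r times the ratio of the sample standard deviations:
c(slope = unname(coef(fit2)["drug"]), r.times.ratio = r * sd(stim2$time) / sd(stim2$drug))

# Testing H_0: rho = 0 with cor.test() gives the same t statistic and p-value
# as the test of H_0: beta_1 = 0 in the regression output:
ct <- cor.test(stim2$drug, stim2$time)
c(t = unname(ct$statistic), p = ct$p.value)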