Opt | Due: 5:30 pm, Saturday, December 15th |
1) Consider the following data set from an experiment
where the blocks are expected to behave differently.
          Treatment
Blocks    1      2      3      4
1         8.0    3.5    5.3    5.2
2         10.5   5.7    10.1   7.4
3         10.8   8.0    8.4    6.9
A) Conduct the appropriate nonparametric test of the null hypothesis that the effects of the treatments are equal. What is your conclusion?
B) Now conduct the appropriate nonparametric test as if there were no blocking variable (just 3 people assigned to each treatment). What would the conclusion be?
C) Why does it make sense that the test in A had a smaller p-value than the test in B?
D) Now imagine that the experimenter didn't realize the data was supposed to be in blocks and it got shuffled up. He decides that, since the blocked design gives a smaller p-value, he'll make up some blocks and use those.
          Treatment
Blocks    1      2      3      4
1         10.8   3.5    10.1   6.9
2         10.5   5.7    5.3    5.2
3         8.0    8.0    8.4    7.4
Conduct the same test as in A on this data set. What would the conclusion be?
E) Based on the p-values for the three tests you've done, what conclusion can you reach about whether blocking is helpful or not?
2) Construct a data set of five points that has a Spearman's rho and a Kendall's tau of 1, but would have a correlation coefficient that is considerably smaller than 1. (A bonus point to the person with the smallest value.) |
5 | Due: Tuesday, November 20 | 1) Find a possible data set for the project, say what hypothesis
you would test on it, and say whether the normality assumption seems met.
It cannot be a data set from a statistics textbook.
2) Consider problem 4 on page 365.
3) A simulation is run to compare the power of one-way ANOVA and the KW test when the data are heavy-tailed (t with df=7) and the sample size is small (n=10). Ten thousand data sets are simulated where the actual difference in the means is 1. In 4,149 cases both tests reject the null; in 289 just the ANOVA rejects; and in 436 cases just the KW test rejects. What do you estimate the power of each test to be, and are these values statistically significantly different? |
4 | Due: Tuesday, October 23 | 1) Consider problem number 4 on page 63.
a) Give the formula you would use to find this value exactly.
b) Use R to find that exact value. (Note that pbinom(x,size=n,prob=p) gives P[X<=x] for a binomial with parameters n and p.)
c) Use the central limit theorem with the continuity correction (the +/-0.5) to approximate the value to three decimal places. (There is a normal table in the back of the text, or pnorm(z) gives P[Z<=z] for a standard normal.)
d) Use the central limit theorem without the continuity correction to approximate the value to three decimal places.
2) Page 93 #7 (show all of your work).
3) Page 113 #2. For part c, find the power when p is really 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9. (Hint: The binomial distribution will give you the probabilities you need. dbinom(x,size=n,prob=p) in R will give you P[X=x] for a binomial with parameters n and p.) |
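The three R functions named above fit together in a simple way. As a rough sketch only, the Python below re-creates dbinom, pbinom, and pnorm from the standard library and compares an exact binomial probability with its two normal approximations. The values n = 20, p = 0.5, x = 8 and the rejection rule "reject when X >= 15" are made up for illustration and are not the numbers from the textbook problems; the helper power_at is likewise a hypothetical name.

```python
from math import comb, sqrt
from statistics import NormalDist


def dbinom(x, size, prob):
    """P[X = x] for a binomial with parameters size and prob."""
    return comb(size, x) * prob**x * (1 - prob) ** (size - x)


def pbinom(x, size, prob):
    """P[X <= x], found by summing the point probabilities 0..x."""
    return sum(dbinom(k, size, prob) for k in range(x + 1))


def pnorm(z):
    """P[Z <= z] for a standard normal."""
    return NormalDist().cdf(z)


# Made-up example values (assumption: NOT the textbook's numbers).
n, p, x = 20, 0.5, 8
mu = n * p
sigma = sqrt(n * p * (1 - p))

exact = pbinom(x, n, p)
with_cc = pnorm((x + 0.5 - mu) / sigma)   # continuity correction: x + 0.5
without_cc = pnorm((x - mu) / sigma)      # no correction
print(round(exact, 3), round(with_cc, 3), round(without_cc, 3))


def power_at(true_p, size=20, cutoff=15):
    """Power of a made-up test that rejects when X >= cutoff."""
    return 1 - pbinom(cutoff - 1, size, true_p)


for true_p in (0.5, 0.7, 0.9):
    print(true_p, round(power_at(true_p), 3))
```

In this toy case the corrected approximation lands much closer to the exact value than the uncorrected one, which is the comparison parts c and d ask you to make for the real problem.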
3 | Due: Thursday, September 20th | A jury pool for a trial consists of 24 women and 18 men.
Tomorrow, twelve people will be selected from this pool to be on the jury and
the defense attorney is worried that there will be too many women
selected.
1) If the jury was selected at random, is the number of women selected
a binomial random variable, or a hypergeometric random variable?
You will probably want a calculator or a computer program (e.g. Excel or R) for the calculations in this problem. Note that the function choose(n,k) in R will give you the binomial coefficients, e.g. 8!/(6!2!) is 8 choose 6, which is choose(8,6) = 28. A ^ can be used to get powers, e.g. 2³ is 2^3 = 8. A * is used for multiplication, so that 2 times 4 is 2*4 = 8. |
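If you are checking the arithmetic in a tool other than R or Excel, the same three operations exist nearly everywhere. As a minimal sketch, here they are in Python, where math.comb plays the role of choose(n,k) and ** replaces ^. The last calculation uses made-up numbers (n = 4, p = 0.5) purely to show the operators working together; it is not part of the jury problem.

```python
from math import comb, factorial

# Binomial coefficient: 8!/(6!2!) is "8 choose 6" (choose(8,6) in R).
by_factorials = factorial(8) // (factorial(6) * factorial(2))
by_comb = comb(8, 6)
print(by_factorials, by_comb)  # both print 28

# Powers and products: R's 2^3 is 2**3 in Python; 2*4 is the same in both.
power = 2**3
product = 2 * 4
print(power, product)  # both print 8

# Combining all three, with made-up numbers (assumption: illustration only):
# the chance of exactly 2 successes in 4 independent trials with p = 0.5.
prob = comb(4, 2) * 0.5**2 * (1 - 0.5) ** (4 - 2)
print(prob)  # 0.375
```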
2 | Due: Thursday, September 13th |
|
1 | Due: Thursday, September 6th |
The data set testdata.txt contains the scores of 88 students on a five-part exam: 1) Closed Book on Mechanics (Calculus-like Physics); 2) Closed Book on Vectors; 3) Open Book on Algebra; 4) Open Book on Analysis (The Theory of Calculus); and 5) Open Book on Statistics. Each exam is scored separately from 0 to 100. You may assume that the students are a simple random sample from the population of all students taking this set of courses at this university.
We want to see whether the mean score for all such students on the analysis exam differs from the mean score on the algebra exam. State the appropriate hypotheses and assumptions for conducting a paired t-test. Use both SAS and R to produce the results of the hypothesis test (at alpha=0.05) and any plots needed to check the assumptions. Be sure to include the code used and any menu options. Briefly summarize your findings. |