SCCC 312 - Homework Assignment 2
The assignment uses the data set tuna.txt posted on the web
at http://www.stat.sc.edu/~habing/courses/data/tuna.txt.
The first column is the age of the tuna (in days), estimated by counting the
rings in its otoliths. The second column is the fork length (length from the
fork in the tail to the nose) in cm. The 90 fish in this case were caught in
the Indian Ocean.
As this assignment will require use of MINITAB, we must first get the data loaded
into MINITAB. The easiest way to do this is to simply cut and paste from the
web page into the worksheet... MINITAB is smart enough to figure out which
values go in which columns. Another way is to save the text file onto your Z-drive
using your web browser. In MINITAB, under the File menu, select
Other Files and Import Special Text. In the box that comes
up, put C1-C2 in the box labeled Store data in
column(s)... and hit OK. You now need to use the
window that comes up to locate the file on your Z-drive. It
will help to change the Files of type to .TXT.
Remember to give titles to the two columns. (Of course,
you could always try typing in all the numbers, but that is probably more
annoying!)
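For anyone who wants to check the import outside MINITAB, the following is a minimal Python sketch of reading a saved copy of the file. It is not part of the assignment. The function name read_two_columns is made up for illustration, and the sketch assumes the file holds two whitespace-separated numbers per line (age, then length); any line that does not fit that pattern, such as a possible title line, is skipped.

```python
from pathlib import Path


def read_two_columns(path):
    """Read a whitespace-separated text file into (ages, lengths).

    Assumption: each data line holds exactly two numbers, age first and
    fork length second. Lines that do not match are skipped.
    """
    ages, lengths = [], []
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # blank line or unexpected layout
        try:
            age, length = float(fields[0]), float(fields[1])
        except ValueError:
            continue  # non-numeric line, such as column titles
        ages.append(age)
        lengths.append(length)
    return ages, lengths
```

Calling read_two_columns on the saved file returns two lists, which play the role of columns C1 and C2 in the worksheet.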
Making a Q-Q plot: A Q-Q plot (or normal probability plot) is
a graph of the data against "what the data would look like
if it were exactly normally distributed". In this section we will construct
a Q-Q plot for the fish lengths and see if they are approximately normally
distributed for this population. To see how it works, we will do it the long
way.
- The first step is to rank the fish according to their lengths. We
will store these ranks in column C3. To do this, select
Rank... under the Manip menu option. Put C2 in the
Rank data in: box, and C3 in the Store ranks in:
box. Then click OK.
- Next we need to change these ranks into percentiles. The
easiest way to do this is to divide each rank by (the number of
observations + 1), which is 91 for our 90 fish. To do this, select
Calculator... under the Calc menu. Put C4 in the Store results in: window,
and put C3/91 in the Expression window. Then click OK.
- Now, the point of doing this is to figure out what values from a
normal curve the lengths would have if they were really from a normal
distribution. The function that tells us what value a normal curve has
for a given percentile is called the "Inverse cumulative probability".
Under the Calc menu, choose Probability Distributions and
Normal.... Choose Inverse cumulative probability, put
C4 in the Input column and C5 in the Optional storage.
Then hit OK.
- Finally, we can make the Q-Q Plot! In the Regression menu
under Stat, choose Fitted Line Plot. Choose Length for
the Predictor and the C5 column for Response and click OK.
Remember to print out and hand in the finished Q-Q plot (along with a few
other plots you will be asked to make below).
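The "long way" above can also be sketched in a few lines of Python, as a cross-check on the MINITAB columns rather than a replacement for them. The function names average_ranks and normal_scores are made up for illustration, the sample lengths are invented (they are not from tuna.txt), and the sketch assumes the inverse cumulative probability is taken from a standard normal curve (mean 0, standard deviation 1).

```python
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1 (smallest); equal values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def normal_scores(values):
    """Rank, divide by (n + 1), then apply the inverse normal CDF."""
    n = len(values)
    percentiles = [r / (n + 1) for r in average_ranks(values)]
    standard_normal = NormalDist()  # assumption: mean 0, standard deviation 1
    return [standard_normal.inv_cdf(p) for p in percentiles]


lengths = [41.0, 38.5, 45.2, 41.0, 50.1]  # made-up values, not from tuna.txt
for length, score in zip(lengths, normal_scores(lengths)):
    print(f"{length:6.1f}  {score:7.3f}")
```

Here average_ranks corresponds to column C3, the percentiles to C4, and the returned scores to C5. Plotting the scores against the lengths gives the points of the Q-Q plot.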
Questions
- What causes some of the ranks to have a decimal part,
like 84.5?
- The method used above to find the percentile based
on the rank is one of many possibilities. Why would we want
to divide by the (# of obs. + 1) instead of (# of obs.)?
[Hint: Consider an odd number of observations, and the corresponding
percentiles for the one in the middle, and the two extreme ones.]
- Notice that the points in the Q-Q plot fit the line fairly well, except at the ends.
In particular, the points for the longer tuna don't seem to fit
it very well at all. Would the largest five tuna have to be longer or
shorter in order for the points in the Q-Q plot to fit the line better?
How can you tell from the graph?
- Try out your theory in question 3 by changing the lengths of the
last five tuna in order to make the Q-Q plot look better. Repeat the
steps above and print out the new Q-Q plot. Were you right?
- Use MINITAB to calculate the mean and standard deviation of
the lengths of the tuna. (Remember to undo the changes you made in
question 4!) Use these to check how well the empirical rule seems
to apply to this set of data.
- Construct a regression line for predicting the age of the fish
from the length of the fish. Comment on how well this method of
calculating the age of the fish seems to work. (It saves the fish
from having to be cut open for starters!)
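The calculations behind the last two questions can be sketched in Python as a way to double-check MINITAB's output. This is only an illustration: the function names empirical_rule_check and least_squares_line are made up, the numbers in the usage lines are invented rather than taken from tuna.txt, and the sample standard deviation (dividing by n - 1) is assumed.

```python
from statistics import mean, stdev


def empirical_rule_check(values):
    """Return the fraction of values within 1, 2 and 3 sample SDs of the mean."""
    m, s = mean(values), stdev(values)
    return {
        k: sum(abs(v - m) <= k * s for v in values) / len(values)
        for k in (1, 2, 3)
    }


def least_squares_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


# Invented numbers, for illustration only.
print(empirical_rule_check([1, 2, 3, 4, 5]))
print(least_squares_line([1, 2, 3], [3, 5, 7]))
```

The fractions from empirical_rule_check can be compared with the roughly 68%, 95% and 99.7% that the empirical rule predicts for normally distributed data. For the regression, pass the lengths as x and the ages as y, since age is being predicted from length.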