Introduction to Splus for Windows





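This introduction will cover the following topics:

Background Information on Splus
Preparing your diskette for Splus
Starting Splus
Getting Data into Splus
Introductory Splus graphics
Getting Help in Splus
Quitting Splus
Miscellaneous Advice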


Background Information on Splus

Splus is an object-oriented programming and statistical analysis language developed primarily at AT&T research labs during the 1980s. Data in Splus is aggregated into "objects", which are of different types. The object types used most in this introduction are vectors, matrices, and data frames.

There are many other kinds of objects, including time series objects, array objects, and factor objects.

Most work in Splus is accomplished by functions, which typically create new objects from existing ones. What makes Splus "object-oriented" is that a function will behave differently depending on what kind of object it is acting on. For example, the assignment

	lx <- log10(x)

takes the existing object x, applies the base-10 logarithm function, and stores the result as the object lx. The assignment symbol "<-" is typed as the "<" key followed by the "-" key. If x is a numeric vector with all positive values, lx is a numeric vector of the same length, with values given by the base-10 logs of the corresponding x values. If x is a numeric matrix with all positive values, lx is a numeric matrix of the same dimensions, with values given by the base-10 logs of the corresponding x values. If there is an existing object named lx in the workspace, it will be replaced as a result of this calculation.
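
As a small sketch of this behavior, the lines below apply log10() to a vector and to a matrix holding the same values. The names x.vec and x.mat are made up for this illustration and do not appear elsewhere in this introduction.

	x.vec <- c(1, 10, 100, 1000)     # a numeric vector with all positive values
	x.mat <- matrix(x.vec, 2, 2)     # the same values arranged as a 2 x 2 matrix
	lx.vec <- log10(x.vec)           # a vector again: 0 1 2 3
	lx.mat <- log10(x.mat)           # a matrix again, with the logs in the same positions
	length(lx.vec)                   # 4
	dim(lx.mat)                      # 2 2

The same function call returns a vector in one case and a matrix in the other, which is why it pays to know what kind of object you are handing to a function.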

Object-oriented programming can be very powerful, but you need to keep track of the types of objects you're working on and creating, or errors will occur which may go undetected and be very hard to track down later.

Splus is case sensitive, so be careful about what you capitalize when you create names. For example, Lx, LX, and lx are different objects. We recommend you use lower case letters for most purposes. Be careful to write Splus functions exactly as they are shown in the documentation or help files. For example, T is the logical value for true, and t is the transpose-a-matrix function. Names in Splus can be of essentially any length, but should start with a letter and not contain any blanks. Often, periods are used in place of blanks. We recommend you avoid choosing names that might also be natural choices for function names (e.g., "plot").
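
A short sketch of case sensitivity, using made-up values: the first three assignments create three separate objects, and the remaining lines contrast the function t() with the logical value T. The matrix name m is chosen only for this illustration.

	lx <- 1
	Lx <- 2
	LX <- 3
	lx + Lx + LX                 # 6, because these are three different objects
	m <- matrix(1:6, 2, 3)       # a 2 x 3 matrix
	dim(t(m))                    # 3 2, since t() transposes the matrix
	T & F                        # logical false: T and F are values, not functions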



Preparing your diskette for Splus

Your objects will be stored in a directory called _Data on your diskette. You should create this directory before you start Splus. For this introduction, create a directory called S_intro on the diskette, with a subdirectory called _Data.

In the future, you will probably want to create new _Data subdirectories under different directories for different purposes, so your objects don't get all mixed up. For example, if using Splus while taking STAT 999, you may want to create a new directory Stat999, with a new _Data subdirectory to hold Splus objects relevant to Stat999 only.



Starting Splus

Start Splus by double-clicking on the Splus icon in the Splus window on the main screen. The command window should open and you'll see the Splus prompt:

	>

There will probably be some warnings now and later about audit files, write permissions, etc., which you can ignore.

Your first job every time you start is to tell Splus where to store its objects:

	> attach("a:\\S_intro\\_Data", pos=1)

Of course, you should substitute another directory name for "S_intro", if appropriate. Be careful to type the command exactly as shown, including the quotes and double backslashes (unfortunately, the DOS single backslash is an escape character in Splus). To check the effects, enter

	> search()

This function displays the Splus search path. You should see "a:\\S_intro\\_Data" in the first position. When an object is referenced, Splus will look through each directory in the search path, in turn, to find it. New objects are written to the directory occupying position 1 (the "workspace"). If you forget to attach the _Data directory on your diskette in position 1, Splus will write the objects somewhere else, and you may have trouble finding them later! In fact, you may lose them completely, so be careful.



Getting Data into Splus

There are several ways to get data into Splus. You can hand-type vectors using the combine function c():

	> newvec <- c(1,2,3,4,5,6,7,8)

Try this now. To display the object newvec, or any other object, simply type its name:

	> newvec

It is also easy to create vectors of certain standard types; for example, we could have created newvec by

	> newvec <- 1:8

The seq() function is a more general tool for generating vectors of values in ordered sequence.
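
A few sketches of seq() follow. The by and length arguments are used here as we understand them; the values are made up, so check help(seq) for the full list of arguments.

	seq(1, 8)                  # 1 2 3 4 5 6 7 8, the same as 1:8
	seq(0, 1, by=0.25)         # 0.00 0.25 0.50 0.75 1.00
	seq(10, 100, length=4)     # 10 40 70 100, four equally spaced values
	8:1                        # 8 7 6 5 4 3 2 1, a descending sequence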

Existing vectors can be combined into a matrix by columns or rows using the cbind() or rbind() functions, or can be reshaped using the matrix() function, e.g.:

	newmat <- matrix(newvec,2,4,byrow=T)

creates a matrix with 2 rows and 4 columns, by writing one row at a time from newvec (try it now and look at it). The matrix() function can also create constant matrices of any specified size, identity matrices, etc.
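
The following sketch tries these functions on small made-up inputs. Using diag() for the identity matrix is an assumption of this sketch rather than something covered above; it is usually more convenient than spelling the entries out for matrix(). See help(diag).

	newvec <- 1:8
	newmat <- matrix(newvec, 2, 4, byrow=T)   # rows are 1 2 3 4 and 5 6 7 8
	cbind(1:3, 4:6)                           # 3 x 2 matrix: each vector becomes a column
	rbind(1:3, 4:6)                           # 2 x 3 matrix: each vector becomes a row
	matrix(0, 3, 2)                           # 3 x 2 constant matrix of zeros
	diag(3)                                   # 3 x 3 identity matrix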

A hot tip: You will often make typing errors when typing Splus commands. If you push the up-arrow key once, the previous command will be displayed on the command line. You can use the left- or right-arrow keys to navigate in the command text, and repair it by deleting and/or typing. Push the up-arrow key several times and you can recover still earlier commands.

Creating data frames from existing text files: The User's Guide goes into great detail on most data management tasks, but it skimps on this major topic. We recommend the following approach. First, create the text (ASCII) file with Minitab:

Table 1. Body weights (kg) and brain weights (g) for 15 terrestrial mammal species

species      bodywt  brainwt
afeleph     6654.00  5712.00
cow          465.00   423.00
donkey       187.00   419.00
man           62.00  1320.00
graywolf      36.33   119.50
redfox         4.24    50.40
narmadillo     3.50    10.80
echidna        3.00    25.00
phalanger      1.62    11.40
guineapig      1.04     5.50
eurhedghog      .79     3.50
chinchilla      .43    64.00
ghamster        .12     1.00
snmole          .06     1.00
lbbat           .01      .25

Now that the text file is prepared, quit Minitab and restore the Splus command window to its working size. To create a data frame called brainbod, enter

	> brainbod <- read.table("a:\\examples\\brainbod.dat", header=T, na.strings="*")

The first argument is the text file specification in quotes, using double backslashes in place of single ones. The other arguments tell Splus that the first row of the data set contains the variable names, and what symbol denotes a missing value. After executing the above statement, check the data frame by typing its name at the Splus prompt. Notice that, using the header=T option, Splus has taken the first character variable with unique values in every row (the species variable) as the row names.

Now check the variable bodywt by typing its name at the prompt:

	> bodywt

Splus will tell you that it can't find this object. Now type

	> brainbod$bodywt

and this variable's values will be displayed. When working with a data frame, you will have to reference variables by awkward two-level names like "brainbod$bodywt", unless you attach the data frame:

	> attach(brainbod)

After doing this, if you check the search path again, you will see "brainbod" in the second position. Now, Splus will be able to find the variable "bodywt". Try it now:

	> bodywt

Be careful, though, because if you create a new variable, it will not automatically be incorporated into the data frame. For example,

	lbody <- log10(bodywt)

creates a vector lbody in the workspace, but it is separate from the data frame brainbod. If you want to add it to the data frame, you should enter

	brainbod$lbody <- log10(bodywt)
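
The difference can be checked on a small stand-in data frame. The object bb below is hypothetical, built by hand from the first three rows of Table 1 purely for illustration.

	bb <- data.frame(bodywt=c(6654, 465, 187), brainwt=c(5712, 423, 419))
	lbody <- log10(bb$bodywt)        # a separate vector in the workspace
	names(bb)                        # still "bodywt" "brainwt"
	bb$lbody <- log10(bb$bodywt)     # now stored as a column of the data frame
	names(bb)                        # "bodywt" "brainwt" "lbody"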



Introductory Splus graphics

One of the strengths of Splus is its graphical capabilities. To create a plot, open a graphics window:

	> win.graph()

You can open as many of these as you need. You may want to tile the windows so you can see them all. Make a plot of brainwt versus bodywt:

	> plot(bodywt,brainwt)

Note that the order of variables in plot specifications is always alphabetical: x-axis variable, y-axis variable, z-axis variable (if any).

There is not much to see on the above plot, since the elephant's values are so large that its point at upper right crowds the other points into the lower left corner. This plot will show more pattern on a log scale. To do this, we could create new variables (e.g., lbody <- log10(bodywt)) and plot these, or we can just nest the log-transform commands into the plot statement:

	> plot( log10(bodywt), log10(brainwt) )

Doing it this way will not create the logged variables for further use, but also does not clutter up the workspace. Which you use is up to you.

Now the plot is more interesting. We see a fairly good linear relationship between the logarithms of brain weight and body weight, with perhaps two unusual points. Which species are these? We can find out by using one of Splus' interactive graphic functions, identify(). To do this, enter

	> species <- dimnames(brainbod)[[1]]
	> identify( log10(bodywt), log10(brainwt), species )

Now, on the plot, position the crosshairs on a point you wish to identify, and click the left mouse button. A label appears, which is this point's value of species, the third specified variable in the identify() statement. Identify as many points as you like in this way. When finished, click the right mouse button.

By way of explanation for the first statement above, the row names of a data frame can be referenced as a variable "dimnames(dataframename)[[1]]". This is awkward, so above we create a new object (a vector) with a better name, "species". The column names are dimnames(dataframename)[[2]]. Isn't that just what you'd guess?
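
To see both sets of names side by side, here is a sketch on a hand-built stand-in. The data frame bb is hypothetical and holds only the first three rows of Table 1, with the species supplied as row names.

	bb <- data.frame(bodywt=c(6654, 465, 187), brainwt=c(5712, 423, 419),
		row.names=c("afeleph", "cow", "donkey"))
	dimnames(bb)[[1]]        # row names: "afeleph" "cow" "donkey"
	dimnames(bb)[[2]]        # column names: "bodywt" "brainwt"
	dim(bb)                  # 3 2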

You can print your graph by choosing Print from the File menu. You can also copy it to the clipboard by choosing Copy from the Edit menu, and then paste it into the paint application for further manipulation. Splus has a tremendous capacity for both static and interactive graphics; see Chapters 5-8 of the User's Manual, Vol. 1.



Getting Help in Splus

There are two main ways to get help in Splus. If you know the name of the function you would like help with, e.g., the seq() function mentioned earlier, type

	> help(seq)

and a documentation window will open. Try this now. If you need a hard copy, select Print Topic from the File menu. When finished with the help window, close it in the usual way. Many of the help files are contained in the Splus Reference Manuals, Vol. 1 and 2, which are also available by ID trade, and are permanently mounted on the manual rack.

If you are not sure of the name of the function you need help with, simply type

	> help()

and a Table of Contents will open. Try this now. Double-click on a topic to get a list of its subtopics, and keep hunting until you find what you need.



Quitting Splus

Before quitting, it is usually a good idea to clean up your workspace, to keep it from being cluttered up with objects you'll not use again. To view the names of the objects in the workspace, enter

	> objects()

To remove unwanted objects, such as the items newvec and newmat, use the rm() function:

	> rm(newvec,newmat)

A good rule of thumb is to remove or rename any objects which you are unlikely to use again or remember next time you open Splus in this directory. To rename an object, assign it to its new name, and then remove the old object:

	> new.name <- old.name
	> rm(old.name)

When the workspace is in good order, quit Splus by closing the command window or by entering the quit command:

	> q()



Miscellaneous Advice
