This, to the best of my recollection, is an account of the Splus commands I used in presenting unequal probability sampling in class. In the first place, we had to import the Kershaw traffic data from Minitab. I saved the Minitab data as a text worksheet (traffic.txt), comma delimited, and kept the column names. I created an _Data subdirectory on my a: drive (under directory stat519) and saved the text worksheet as a data frame: attach("a:\\stat519\\_data",pos=1) f1994.f_read.table("a:\\stat519\\traffic.txt",header=T,sep=",", row.names=NULL) summary(f1994.f) After this, I wanted to simulate 10000 runs (I'd actually like to do more) sampling without replacement using the 1994 data as my size variable. In a new session, I would type: attach("a:\\stat519\\_data",pos=1) attach(f1994.f) pi.list_sim.pi(20,X1994.ADT,Location,10000) This assumes that you already have sim.pi saved as a Splus function in your _Data subdirectory. If you haven't done this but you have saved the text version sim.pi.txt from the home page, you can save it there by typing sim.pi_source("a:\\stat519\\sim.pi.txt") The variable pi.list generated by sim.pi is a list with 3 elements: a vector of labels, a vector of pi_i's and a matrix of pi_ij's (whose diagonal elements are the pi_i's). To compute a Horvitz-Thompson estimator and its variance, we need to generate a data set by sampling without replacement using X1994.ADT as our size variable. We then compute the estimator and its standard error: idcol_seq(1,202) px_X1994.ADT/sum(X1994.ADT) ix_sample(idcol,size=20,prob=px) muhat_sum(X1996.ADT[ix]/pi.list[[2]][ix])/202 That gives us the estimate of the mean--you might want to type: Location[ix] to see which intersections were selected. To compute the variance and standard error, we need to enter: ymat_X1996.ADT[ix]%o%X1996.ADT[ix] pimat_b[[2]][ix]%o%b[[2]][ix] vtau_sum((1/pimat-1/b[[3]][ix,ix])*ymat) sqrt(vtau/202**2) Incidentally, the symbol %o% computes an outer product of two vectors.