STAT 702 - Fall 2004 & STAT 703 - Spring 2005
R Templates
The Basics of R
STAT 702
    Ch. 1 #17: Plotting Acceptance Probabilities
    Ch. 1 #18 soln: Estimating Hypergeometric Sample Size
    Ch. 2 #67: Plotting the PDF
    Lecture 23: Expected Number of Tests Needed
    Lecture 23: Benefit of Stratifying
    Hmwk 16 #2: Monte Carlo Integration
STAT 703
    Lecture 3: MoM, QQ Plots, and Parametric Bootstrap
    Lecture 4: Maximum Likelihood Estimation
    Lecture 6: MLEs for Logistic-type Regression
    Lecture 8: How Accurate are the Approximate Confidence Intervals?
    Ch. 8 #6: Another Example of Finding MLEs Numerically
    Lecture 14: Likelihood Ratio for Discrete Variables
    Hmwk 4: Notes on LRT for Continuous Variables

The Basics of R:

R, like the commercial S-Plus, is based on the programming language S. It can be downloaded for free from www.r-project.org [at home, after reading the instructions, choose: CRAN, then a mirror site in the US, then Windows (95 and later), then base, then rw1071.exe]. This page also has links to FAQs and other information about R.

One of the nice things about R (besides the fact that it's powerful and free) is that you can save everything you've done at the end of each session. If you are using a network machine where you don't have access to the C drive, you can manually load and save the .Rdata file on another drive each time you start and end the program. You would, of course, have to save the .Rdata file on a disk if you wanted to use it both at home and at school.

At the heart of R are the various objects that you enter. At any time (when you are in R) you can type objects() to see which objects you have entered previously. Presumably there aren't any there right now. R also has many built-in objects, most of which are functions. Because these objects are always there, they are not listed when you type objects().

A list of the various functions for R can be found in the manuals that are installed with the program. The various manuals can be found under the Manuals submenu of the Help menu. In particular, the R Reference Manual lists not only all the functions in the base package of R, but also some of those in the supplementary packages as well. Because this manual is nearly 1000 pages long, you probably shouldn't print out a copy!!!!

The various supplementary packages are simply groups of functions that others wrote and decided to make available to the public. A list of those that you downloaded can be found under the Packages menu by choosing Load Package. The one that we will use most often is the MASS package. It can be loaded in by typing library(MASS). Note that some of the non-standard libraries need to be downloaded and unzipped somewhere first if you don't have access to your C drive.

Once you know the name of a function, you can get help about it simply by typing help(functionname). In R, parentheses ( ) indicate either mathematical grouping, like they usually do, or that you are applying a function. Square brackets [ ] indicate that you are picking out an element of a vector.
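
For example, typing either of the following will bring up the help page for the built-in mean function (the ? is just a shortcut for help):

help(mean)
?mean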

This brief example shows some of R's flexibility.

x<-c(5,4,3,2)

The <- is the assignment operator: it assigns whatever is on the right to the name on the left. The c indicates that an array or vector is being entered. Simply typing x now will return those four values. Typing

x[3]

will return the third value,

mean(x)

will return the mean, and

objects()

will show you that you now have an object called x.

R also has a variety of graphical functions and statistical functions built in.

hist(x)

will construct a histogram, and

t.test(x,alternative="greater",mu=5)

will test the null hypothesis mu=5 vs. the alternative that mu > 5. (If you happen to mistype a command and get an error, just hit the up arrow to bring it back up for editing.)

Besides just getting the result of a function, we can store the results somewhere. For example

x.t<-t.test(x,alternative="greater",mu=5)

will store the result of the t-test in x.t.

attributes(x.t)

will then show you the various properties of x.t. Thus,

x.t$p.value

will return the p-value.

You can also write your own functions in R. For example, if you wanted a function that returned sin(x)+x^2 you could write:

weird<-function(x){
y<-sin(x)+x^2
return(y)}

Try weird(5) and weird(x). What is it doing?

One of the things we will be doing on some homework assignments is using R to generate random numbers following various discrete and continuous distributions, and seeing how they behave. Try:

ndat<-rnorm(200,5,10)
chidat<-rchisq(200,4)
par(mfrow=c(2,1))
hist(ndat)
hist(chidat)
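
If you want to be able to reproduce the same set of random numbers later (so a homework write-up matches the plots you made, for instance), you can set the random number seed first. A minimal sketch (the seed value 702 is arbitrary; any integer will do):

set.seed(702)
ndat<-rnorm(200,5,10)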


Ch. 1 #17: Plotting Acceptance Probabilities

In R it is easy to construct a scatterplot of two sets of points against each other. Consider:

X   12   -12   18
Y   16    6   21
The vectors would be entered using the code:
x<-c(12,-12,18)
y<-c(16,6,21)
A basic plot would then be made using
plot(x,y)
A few options are setting the limits of the x- and y-axes (xlim and ylim), changing the axis labels (xlab and ylab), and connecting the points (type="l"), assuming they are in order. The following command shows what can happen if they are not in order.
plot(x,y,xlim=c(-20,20),ylim=c(0,25),xlab="variable 1",ylab="variable 2",type="l")
To fix this we can use the order and sort commands to sort the x-variable and to put the y-variable in that same order.
x1<-sort(x)
y1<-y[order(x)]
plot(x1,y1,xlim=c(-20,20),ylim=c(0,25),xlab="variable 1",ylab="variable 2",type="l")
Now the homework assignment gives that the x values (the p's) are: 0, 0.05, 0.10, 0.15, 0.2, and 0.25. But now we need to find the y values!

Treating it as a binomial: The following code will calculate the probability we examined in class where we wanted to know the chance of seeing exactly 10 defectives out of 20, where the probability of a defective is 0.02.

choose(20,10)*(0.02)^10*(1-0.02)^(20-10)
This is the same as the built-in function dbinom. dbinom must be given three values: the number of "successes", the number of trials (the sample size), and the probability of a "success". So
dbinom(10,20,.02)
should return the same value. If we wanted to get this value for a bunch of different probabilities we could try something like:
dbinom(10,20,c(0,0.05,0.10,0.15,0.2,0.25))
or
ps<-c(0,0.05,0.10,0.15,0.2,0.25)
ps
dbinom(10,20,ps)
We could also get the probability of exactly 10 or 11 by adding two of these vectors together:
psfor10<-dbinom(10,20,c(0,0.05,0.10,0.15,0.2,0.25))
psfor11<-dbinom(11,20,c(0,0.05,0.10,0.15,0.2,0.25))
psfor10or11<-psfor10+psfor11
psfor10or11
or to get the probability that NOT exactly 10 or 11:
1-psfor10or11
And we could plot it...
plot(ps,1-psfor10or11,xlim=c(0,0.25),ylim=c(0,1),xlab="p",ylab="prob",type="l")
Not a very exciting plot for this example! Making ylim have a different range might make it look better.
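
For instance, since all of these probabilities are close to 1, something like the following (the choice ylim=c(0.9,1) is just one possibility) zooms in on the interesting part:

plot(ps,1-psfor10or11,xlim=c(0,0.25),ylim=c(0.9,1),xlab="p",ylab="prob",type="l")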

Treating it as a Hypergeometric: It is also possible (and closer to the problem's actual phrasing) to treat this problem as a hypergeometric instead of a binomial. As an example, consider the problem with the 0.02 defective rate. Imagine that there were originally 2000 parts of which 40 were defective (40/2000=0.02). The box of 20 we are looking at had 20 parts randomly chosen from that 2,000.

This would then be a hypergeometric with population size 2000, number of "successes" (defective in this case) 40, and number chosen 20. The probability of getting exactly 10 defectives out of the twenty would then be:

choose(40,10)*choose(2000-40,20-10)/choose(2000,20)
from the formula at the top of page 13 or the middle of page 40.

This is the same as the built-in function dhyper. dhyper must be given four values: the number of "successes" chosen, the number of possible "successes" overall, the number of possible "failures" overall, and the total sample size we chose. So

dhyper(10,40,1960,20)
should return the same value. If we wanted to get this value for a bunch of different probabilities we just need to convert them into counts out of 2000 instead of proportions.
ps<-c(0,0.05,0.10,0.15,0.2,0.25)
numbad<-2000*ps
numgood<-2000-numbad
dhyper(10,numbad,numgood,20)
We could also get the probability of exactly 10 or 11 by adding two of these vectors together:
psfor10<-dhyper(10,numbad,numgood,20)
psfor11<-dhyper(11,numbad,numgood,20)
psfor10or11<-psfor10+psfor11
psfor10or11
or to get the probability that NOT exactly 10 or 11:
1-psfor10or11
And we could plot it...
plot(ps,1-psfor10or11,xlim=c(0,0.25),ylim=c(0,1),xlab="p",ylab="prob",type="l")
Not a very exciting plot for this example! Making ylim have a different range might make it look better.


Ch. 1 #18 soln: Estimating Hypergeometric Sample Sizes

As seen in class we reduce the problem to the formula: P[0 defective] = (n-m)(n-m-1)...(n-m-k+1)/n(n-1)...(n-k+1) < 0.1

One way of solving this would be to use brute force: just keep plugging values of m into R and see where the probability of 0 defectives first drops below 0.1. First we need to enter the function that gives the probabilities.

f<-function(m,n=1000,k=10){
  prod((n-m):(n-m-k+1))/prod(n:(n-k+1))
}

The first line says that we are defining an object f to be a function, and further that the function takes three arguments: m, n, and k. It also says that if no value of n is given it should default to 1,000, and that k should default to 10. So using f(200,1000,10) would give the same result as f(200).

The second line says what the value of the function is, namely the product of (n-m) down to (n-m-k+1) divided by the product of n down to (n-k+1). The : indicates that you are defining a sequence of numbers.

You would normally like to be able to just say f(1:1000) to get all of the possible values of the function from m=1 to m=1000, but it doesn't work in this case. (The reason is that f is not vectorized: the : and prod inside it only handle one value of m at a time.) Instead, the following code applies the function f to each m value from 1 to 1,000 and displays the results with m in the first column and the probability in the second column. It also rounds the values off to four decimal places.

 
values<-cbind( m=1:1000 , prob=sapply(1:1000,f) )
round(values,4)

Using this we can see that m=204 gives p=.1009 and m=205 gives p=.0997, meaning that we need to have m of at least 205.

A second solution is to use an approximation. Assuming n is large, we can approximate what we are looking for with (n-m)^k / n^k < 0.1. As we saw in class, this means what we want is:

1000 - (.1*1000^10)^(1/10)
This gives that we want m to be at least 205.6718, which means we would need m=206. The approximation came out very close to the m=205 that we got above.
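
As a quick check, you can plug the two candidate values back into the exact function f from above and verify where the probability crosses 0.1:

f(204)
f(205)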


Ch. 2 #67: Plotting PDFs

The code below is from class on September 23rd, where we plotted the pdfs for a variety of gamma distributions.

par(mfrow=c(2,2))
t<-(0:2500)/100
ga_t<-dgamma(t,shape=1,rate=1)
gb_t<-dgamma(t,shape=2.5,rate=1)
gc_t<-dgamma(t,shape=5,rate=1)
gd_t<-dgamma(t,shape=10,rate=1)
plot(t,ga_t,xlim=c(0,25),type="l")
plot(t,gb_t,xlim=c(0,25),type="l")
plot(t,gc_t,xlim=c(0,25),type="l")
plot(t,gd_t,xlim=c(0,25),type="l")

t<-(0:2500)/100
ge_t<-dgamma(t,shape=2.5,rate=0.25)
gf_t<-dgamma(t,shape=2.5,rate=0.5)
gg_t<-dgamma(t,shape=2.5,rate=2)
gh_t<-dgamma(t,shape=2.5,rate=4)
plot(t,ge_t,xlim=c(0,25),type="l")
plot(t,gf_t,xlim=c(0,25),type="l")
plot(t,gg_t,xlim=c(0,25),type="l")
plot(t,gh_t,xlim=c(0,25),type="l")
The par(mfrow=c(2,2)) command tells it to make a 2x2 matrix of plots. We then set up the values to plot at (in this case 0 to 25 at increments of 0.01). The dgamma function is the function for the pdf of the gamma ( help(dgamma) tells you about the parameters), and then we plot them using lines ( type="l" ).

See help(dweibull) for information on the Weibull.
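
For instance, a sketch of a similar 2x2 grid of Weibull densities (the shape and scale values here are arbitrary choices, just to show the syntax):

par(mfrow=c(2,2))
t<-(0:2500)/100
plot(t,dweibull(t,shape=0.5,scale=5),xlim=c(0,25),type="l")
plot(t,dweibull(t,shape=1,scale=5),xlim=c(0,25),type="l")
plot(t,dweibull(t,shape=2,scale=5),xlim=c(0,25),type="l")
plot(t,dweibull(t,shape=5,scale=5),xlim=c(0,25),type="l")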


Lecture 23: Expected Number of Tests Needed

The following code is from Lecture 23 and is designed to calculate the expected number of tests for various numbers of groups (see Example C on page 121). n is the overall number of samples, m is the number of groups it is divided into, and p is the probability of any individual being "infected".

expntests<-function(n,m,p){
m*((1-p)^(n/m)+(n/m+1)*(1-(1-p)^(n/m)))
}

n<-5000
p<-.001
ms<-120:160

cbind(120:160,expntests(n=n,m=ms,p=p))
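
To pick out which number of groups in ms gives the smallest expected number of tests (a small sketch using the n, p, and ms from above):

ms[which.min(expntests(n=n,m=ms,p=p))]
min(expntests(n=n,m=ms,p=p))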

Lecture 23: Benefit of Stratifying

The following code is from Lecture 23 and is designed to show the benefit of stratified sampling over simple random sampling. Consider sampling from a population of size N with strata of sizes Ni. The population overall has proportion p of the desired trait, while stratum i has proportion pi. Compare the variance from an overall sample of size n to that from stratified samples of size ni taken from each stratum.

For example, consider a population of size 150 with prevalence 0.5 that could be broken into strata of size 50 with prevalences 0.1, 0.5, and 0.9. Compare taking a sample of size 30 from the overall population to taking a sample of size 10 from each stratum.

N<-150
Ni<-c(50,50,50)
p<-0.5
pi<-c(0.1,0.5,0.9)
n<-30
ni<-c(10,10,10)

v<-(p*(1-p)*(N-n)/(N-1))/n
vstrat<-sum((Ni/N)^2*pi*(1-pi)*(Ni-ni)/(Ni-1)/ni)

v
vstrat
In this case we see that the variance using the stratified sampling (0.0039) is less than the variance using a simple random sample (0.0067). Changing the various values can show that the benefit disappears if the strata levels aren't actually different in terms of their prevalences.
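
For example, a quick sketch of rerunning the stratified calculation with all three prevalences equal to the overall 0.5 (using the N, Ni, n, and ni already entered above) shows the stratified variance is no longer any smaller than the simple random sample's:

pi<-c(0.5,0.5,0.5)
vstrat<-sum((Ni/N)^2*pi*(1-pi)*(Ni-ni)/(Ni-1)/ni)
v
vstrat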

Hmwk 16 #2: Monte Carlo Integration

If we wanted to find the value of the integral of x^2 from -1 to 1, we would have to modify the algorithm in Example A on page 164 a little. Notice that the formula at the bottom of page 164 is basically finding the average height of the function... but area is base times height! So to find the integral of x^2 from -1 to 1 we need to multiply that average height by the length of the base, which is 2.
This code finds an estimate of the integral using 100 randomly selected points.

x<-runif(100,-1,1)
fofx<-x^2
2*mean(fofx)
This second piece of code plots one realization of the values of the estimate as the number of points increases. (It will take a minute to run).
x<-runif(20000,-1,1)
fofx<-x^2
ests<-rep(0,20000)
for (i in 1:20000){
  ests[i]<-2*mean(fofx[1:i])
}
plot(1:20000,ests,type="l",ylim=c(.65,.70))
lines(c(1,20000),c(2/3,2/3))
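
The same base-times-height idea works for any function on any interval. A minimal sketch (the function name mcint and its arguments are made up here just for illustration):

mcint<-function(f,a,b,npoints=10000){
  x<-runif(npoints,a,b)
  (b-a)*mean(f(x))
}
mcint(function(x){x^2},-1,1)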

Lecture 3: MoM, QQ Plots, and Parametric Bootstrap

The following code is from class on January 18th. The first step was to load the data set into R and then pick off the first column. I called the data set pars as short for parameters, and saved the first column as the vector a, sorting it to make it easier to look at. Note that c is the name of a built-in function, so you shouldn't call a vector that (maybe cc?).

pars<-read.table("http://www.stat.sc.edu/~habing/courses/703/itests.txt",head=TRUE)
a<-sort(pars[,1])
This code calculates the mean and estimated variance from the data. Notice that the formulas we are using divide by n and not n-1, so we need to adjust what R is calculating. I then calculated the method of moments estimators (I also had R find the number of observations since I would need it later).
n<-length(a)
xbar<-mean(a)
sigma2hat<-(n-1)/n*var(a)
lhat<-xbar/sigma2hat
ahat<-xbar^2/sigma2hat
The q-q plot is made using the qgamma (quantiles of the gamma) function. Other distributions have similarly named functions. Recall from last semester that our lambda is the rate parameter (= 1/scale) in R. The line on the q-q plot is drawn using the quantiles of the gamma.
plot(sort(a),qgamma((1:n)/(n+1),shape=ahat,rate=lhat))
lines(qgamma((1:n)/(n+1),shape=ahat,rate=lhat),qgamma((1:n)/(n+1),shape=ahat,rate=lhat))
Finally, here is the code for generating 100,000 parametric bootstrap samples for this example. Try summary(lhat.dist), sd(lhat.dist), or hist(lhat.dist) to see the properties of lambda-hat's sampling distribution.
nsamples<-100000
x<-rgamma(n*nsamples,shape=ahat,rate=lhat)
x<-matrix(x,ncol=n)
xbar.dist<-apply(x,1,mean)
s2h.dist<-(n-1)/n*apply(x,1,var) 
lhat.dist<-xbar.dist/s2h.dist
ahat.dist<-xbar.dist^2/s2h.dist
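
One thing you might try with the bootstrap distributions (a sketch, beyond what was shown in class) is pulling off their middle 95% to get a rough idea of the uncertainty in the estimates:

quantile(lhat.dist,c(0.025,0.975))
quantile(ahat.dist,c(0.025,0.975))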

Lecture 4: Maximum Likelihood Estimation

The code below uses R to find the maximum likelihood estimates for the example above. The idea is to use the built-in optimization function optim, which finds minima, and to enter the log-likelihood function manually. Of course we want a maximum, not a minimum, so we need ngamloglike to be the negative of the log-likelihood.

optim minimizes over the first argument of the function you give it, so ngamloglike is written with its first argument pars holding the parameters we want estimated and its second argument data holding what we know. You should be able to check that the following code contains the function for the negative log-likelihood from page 256, and then asks optim to minimize it, starting with 6 for the initial a guess and 6 for the initial l guess.

ngamloglike<-function(pars,data){
 a<-pars[1] 
 l<-pars[2] 
 n<-length(data) 
 -1*
 (n*a*log(l)+(a-1)*sum(log(data))- 
  l*sum(data)-n*lgamma(a)) 
  } 

optim(c(6,6),ngamloglike,data=a)
Notice that the estimates are alpha-hat=7.338857 and lambda-hat=6.854916.
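
Rather than copying the numbers off the screen, you could also save the output of optim and pull the estimates out of its $par component (the name gamma.mle is just one I made up here):

gamma.mle<-optim(c(6,6),ngamloglike,data=a)
gamma.mle$par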

We can also apply the bootstrap to these estimates to see if it appears we really have recovered something near the maximum of the sampling distribution.

amle<-7.338857
lmle<-6.854916
nsamples<-100000
x<-rgamma(n*nsamples,shape=amle,rate=lmle)
x<-matrix(x,ncol=n)
xbar.dist<-apply(x,1,mean)
s2h.dist<-(n-1)/n*apply(x,1,var) 
lmle.dist<-xbar.dist/s2h.dist
amle.dist<-xbar.dist^2/s2h.dist

hist(lmle.dist,nclass=50,xlim=c(0,25))
lines(c(mean(lmle.dist),mean(lmle.dist)),c(-1000,30000),lwd=4,lty=5)
text(14,8000,"Mean of Estimates")
lines(c(lmle,lmle),c(-1000,30000),lwd=4)
text(3,8000,"True Value")

Lecture 6: MLEs for Logistic-type Regression

The following code reads in the data concerning the temperature and the failure of the space shuttle o-rings before the Challenger disaster. A 1 indicates it worked successfully and a 0 indicates a failure.

oring<-read.table("http://www.stat.sc.edu/~habing/courses/703/oring.txt",head=TRUE)
oring<-as.matrix(oring)
The function lr.nloglik contains the negative log-likelihood for standard logistic regression. Notice that the plogis function is parameterized slightly differently than what we used in class: we used a+bx while plogis uses -m/s + x/s. The optim command is then used with initial values a=0 and b=0.1. This can be compared to the results from the built-in glm function. The general format is glm(y~x,modeltype), where we are using x to predict y and modeltype is appropriate for the kind of model we are fitting (binomial in this case).
lr.nloglik<-function(pars,data){
   a<-pars[1]
   b<-pars[2]
   s<-1/b
   m<--a*s
   x<-data[,1]
   y<-data[,2]
   p<-plogis(x,m,s)
  -sum(y*log(p)+(1-y)*log(1-p))
}

optim(c(0,.1),lr.nloglik,data=oring)


glm(oring[,2]~oring[,1],binomial)
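
One way to see what the fit looks like (a sketch, assuming as above that the first column of oring is the temperature and the second is the 0/1 outcome; the name oring.fit is made up here) is to plot the data with the fitted curve from glm on top:

oring.fit<-glm(oring[,2]~oring[,1],binomial)
plot(oring[,1],oring[,2],xlab="temperature",ylab="success (1) or failure (0)")
curve(plogis(coef(oring.fit)[1]+coef(oring.fit)[2]*x),add=TRUE)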
The code to estimate the values if a lower or upper asymptote is included is much the same. Here is the code for including an upper asymptote (there is some chance of failure no matter what the temperature, that is, the chance of success never reaches 100%). The difficulty here is that the procedures R uses to estimate the values are very sensitive to the exact options and starting values chosen.
lrwup.nloglik<-function(pars,data){
   a<-pars[1]
   b<-pars[2]
   up<-pars[3]
   s<-1/b
   m<--a*s
   x<-data[,1]
   y<-data[,2]
   p<-(1-up)*plogis(x,m,s)
  -sum(y*log(p)+(1-y)*log(1-p))
}

#Using a Newton like method with the upper parameter bounded between 0 and 1. #

optim(c(-6,.1,.5),lrwup.nloglik,method="L-BFGS-B",lower=c(-Inf,-Inf,0),upper=c(Inf,Inf,1),data=oring)

#The bounded Newton type method is sometimes a bit inaccurate due to #   
# the procedure for keeping the bounds.  Here the non-bounded procedure #
# is used, the fnscale and reltol options should make the estimation #
# take longer to converge (be more accurate), and maxit lets it take #
# more time if needed.  The starting values are what were obtained #
# from the bounded method. #

optim(c(-5.99,1.049,.25),lrwup.nloglik,control=list(fnscale=10000000,maxit=10000,reltol=1e-9),method="BFGS",data=oring)

#Notice what happens when a different starting value is used! #
# In both cases the estimate didn't move very far.  You can #
# tell the second is better because the negative log-likelihood #
# is smaller (so the log-likelihood is bigger). #
#Making optimization methods actually give the "best-value" can #
#  take lots of practice, so you would want a statistical consultant #
#  to help you with this! #

optim(c(-20.9391620,0.3829271,0.1828402),lrwup.nloglik,control=list(fnscale=10000000,maxit=10000,reltol=1e-9),method="BFGS",data=oring)

Lecture 8: How Accurate are the Approximate Confidence Intervals?

Consider the example of finding the MLE for the parameter lambda from a Poisson distribution. We found that the formula for the (1-alpha)*100% CI was x-bar +/- z(alpha/2) * sqrt(xbar/n). A function to find the CI for a set of data (with a default confidence level of 95%) would be:

poisci<-function(data,alpha=0.05){
  xbar<-mean(data)
  n<-length(data)
  zaover2<-qnorm(1-alpha/2)
  ci<-xbar+c(-1,+1)*zaover2*sqrt(xbar/n)
  ci
}
To generate a data set of 60 observations with lambda=5, and then find the CI, we could then use:
pdata<-rpois(60,5)
poisci(pdata)
The following code will perform a simulation study to see how accurate this approximate CI is.
samplesize<-60
lambda<-5
numbersimulations<-10000

#Generate a matrix with each row being a different simulated data set.#

datamatrix<-matrix(rpois(samplesize*numbersimulations,lambda),ncol=samplesize)

#Apply the confidence interval function to each row of the data...#   
# and then take the transpose so that each CI is on its own row.#

cis<-t(apply(datamatrix,1,poisci))

#Count how many times it worked (divide by numbersimulations to get the percent)!#

sum((cis[,1] < lambda) & (cis[,2] > lambda))

Notice that it covered about 95% of the time. To see how it worked for different choices of lambda or the samplesize we could simply rerun the code with a different value of lambda or samplesize.
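
If you wanted to try several settings without retyping everything, the whole simulation could be wrapped up in a function (a sketch; the name poiscoverage is made up here, and it uses the poisci function from above):

poiscoverage<-function(samplesize,lambda,numbersimulations=10000){
  datamatrix<-matrix(rpois(samplesize*numbersimulations,lambda),ncol=samplesize)
  cis<-t(apply(datamatrix,1,poisci))
  mean((cis[,1] < lambda) & (cis[,2] > lambda))
}
poiscoverage(60,5)
poiscoverage(10,0.5)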

Ch. 8 #6: Another Example of Finding MLEs Numerically

Once you have the answer to 5b (the formula for the MLE) it is straightforward to find the estimate of p using the data in problem 6. We could also have just used the formula for the negative log-likelihood and R's built-in optimization functions. Recall that the log-likelihood is n*log(p)+sum(xi-1)*log(1-p). So first we enter the data and make the xi's (notice the use of rep to get the list of all of the xi's), then we enter the negative log-likelihood function, and finally we use optim. optim gives a warning when there is only one parameter, so use the option method="BFGS".

nhops<-c(1,2,3,4,5,6,7,8,9,10,11,12)
fr<-c(48,31,20,9,6,5,4,2,1,1,2,1)
xi<-rep(nhops,fr)

geomnloglik<-function(p,vals){
  n<-length(vals)
  -(n*log(p)+sum(vals-1)*log(1-p))
}

optim(.5,geomnloglik,method="BFGS",vals=xi)
Notice that the name vals is the name used inside the function, and appears on the left-hand side of the = when we call optim. xi is the name of the data, and it is on the right-hand side.
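
As a check (assuming the formula you got in 5b works out to p-hat = 1/x-bar, the usual geometric MLE), the numerical answer from optim should agree closely with:

1/mean(xi)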

Lecture 14: Likelihood Ratio for Discrete Variables

Following the math through, we arrived at the relationship P[sum(Xi) > or = c]=alpha, where sum(Xi) is Poisson with lambda=12.

Because the Poisson distribution is discrete, we can't just pick any alpha and get the c that goes with it. Instead we have to find which alphas go with the different c's. (See Example A on page 9.2, for instance.)

To do this we are going to use the cdf function ppois. So alpha=P[sum(Xi) > or = c] = 1 - P[sum(xi) < or = c-1] = 1-F(c-1). (Drawing the picture can help you see why it changes from c to c-1.) To get the table we could use the following code.

cvalues<-1:20
cbind(cvalues,1-ppois(cvalues-1,12))
If we chose c=19 we would have an alpha level of approximately 0.037, and if we chose c=18 then alpha would be approximately 0.063. If we needed a level 0.05 test then we would have to make the rejection region be sum(Xi) > or = 19 and the test would be conservative (would not reject as much as we would like).

The binomial works similarly. Notice that the table in Example A on page 301 just gives the cdfs, and does not give the values that they actually want. The table would be found using

cvalues<-0:10
cbind(cvalues,pbinom(cvalues,size=10,prob=.7),pbinom(cvalues,size=10,prob=.6),
pbinom(cvalues,size=10,prob=.5))


Hmwk 4: Notes on LRT for Continuous Variables

If you are working with a continuous distribution, there is no need to try out various values and see what the alpha is. Instead we can use the quantile function (the inverse of the cdf) directly.

For example, say you have: P[sum(Xi) < or = c]=0.05, where sum(Xi) is chi-squared with df=3. Then you would simply use: qchisq(0.05,df=3) and get 0.3518463. This is because the q functions give you the values that have the given percent below them (e.g. qnorm(0.5,mean=0,sd=1)=0 and qnorm(0.975,mean=0,sd=1)=1.96).

If it was P[sum(Xi) > or = c]=0.05 you need to notice that this is the same as 1-P[sum(Xi) < c]=0.05, which would be P[sum(Xi) < c]=0.95. So we would just use qchisq(0.95,df=3) and get 7.814728.