Some more S-Plus notes
----------------------
covered in sections 8-10 Feb

Seeing/removing variables you have created
------------------------------------------

objects() - lists the variables you have created.

rm() - removes objects you have created.
    eg take the data frame "data" from last time. To remove it you would type
        > rm(data)
    If you now typed
        > data
    you would get an "object not found" message.

attach() - attaches the columns of a data frame as variables which you can then use.
    eg
        > attach(data)
    Now you can refer to name rather than data$name etc.

detach() - detaches an attached data frame.

Functions
---------

Functions allow you to encapsulate a set of related data/function calls into a single command. This is useful if you are going to be performing some task repetitively. You can create a function in S-Plus by using an expression of the form

    fname <- function(arg1, arg2, ...){
        # code block to be executed
    }

where fname is the name by which you will refer to the function, arg1, arg2, etc are the names you will use to refer to variables within the function, and the {} enclose the code to be executed by the function. You can then call the function by

    fname(val1, val2, ...)

where val1, val2, etc are the values to be taken by the arguments on that execution. The value that is returned by the function is the value of the final statement.
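As a small illustration of the form above, here is a sketch of a function that returns the spread (largest minus smallest value) of a vector. The name spread and its argument x are made up for this example, and the code is assumed to behave the same way in S-Plus and in R.

```r
# spread is a hypothetical name chosen only for illustration
spread <- function(x){
    lo <- min(x)    # smallest value in the vector
    hi <- max(x)    # largest value in the vector
    hi - lo         # the final statement supplies the return value
}

spread(c(3, 8, 5))   # returns 5
```

Note that there is no explicit return statement; the function simply hands back the value of hi - lo because it is the last statement evaluated.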

Here is an example function to carry out a two-sample t-test:

    ttest <- function(X1, X2, test="two-sided", alpha=.05){
        # X1 and X2 are intended to be two vectors, each holding one sample
        # the use of "=" in the argument list specifies a default value
        n1 <- length(X1)
        n2 <- length(X2)
        ndf <- n1 + n2 - 2
        # working out pooled variance
        s2 <- ((n1-1)*var(X1) + (n2-1)*var(X2))/ndf
        tstat <- (mean(X1) - mean(X2))/sqrt(s2*(1/n1 + 1/n2))
        # p-value depends on which alternative is being tested
        pvalue <- switch(test,
                         "two-sided" = 2*(1 - pt(abs(tstat), ndf)),
                         "upper" = 1 - pt(tstat, ndf),
                         "lower" = pt(tstat, ndf),
                         NULL)
        list(tstat=tstat, df=ndf, reject= pvalue < alpha, pval=pvalue)
    }

This function will return a list containing the t statistic, the degrees of freedom, whether the null hypothesis is rejected (at level alpha) and the p-value.

eg suppose X and Y are two vectors containing samples to be tested. Then the following might be a result of running the function.

    > ttest(X,Y)
    $tstat
    2.4223
    $df
    24
    $reject
    T
    $pval
    0.0153

Note that by not specifying either the "test" or "alpha" parameters in the function call the default values are used.

    > ttest(X,Y,test="upper",alpha=0.01)

would do an upper-sided test of level 0.01, etc.

Control Structures
------------------

    if (condition){
        # code block executed if condition is true
    } else {
        # code block executed if condition is false
    }

The if else structure allows you to test a condition and then execute one of two different options depending on the result of the test.

eg to get the minimum of two numbers printed

    if (i < j) print(i) else print(j)

    switch(variable, option1 = return1, option2 = return2, ..., optionn = returnn, default)

The switch function allows you to test the variable to see if its value is one of the options, in which case it returns the value specified for that option. The last (unnamed) position is the default value, returned when none of the options match.

loops
------

Try to avoid loops if possible because S-Plus is an interpreted language. Things inside a loop must be repetitively interpreted and this can slow down operations.
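To make the point about avoiding loops concrete, the sketch below computes the squares of 0 to 9 twice: once with an explicit loop and once with a single vectorized expression. The variable names sq.loop and sq.vec are illustrative only. Both versions give the same values, but the vectorized form is interpreted once rather than once per element.

```r
# looped version: the body is interpreted again on every pass
sq.loop <- numeric(10)
for (i in 0:9){
    sq.loop[i + 1] <- i^2    # vectors are indexed from 1, hence i + 1
}

# vectorized version: one expression operates on the whole vector
sq.vec <- (0:9)^2

# check that the two approaches agree element by element
all(sq.loop == sq.vec)
```

On a vector of ten elements the difference in speed is negligible; the habit matters more when the vectors are long.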

    for (variable in sequence){
        # code block
    }

The for loop steps the variable through the values given by sequence (use the : operator to generate a sequence).

eg
    for (i in 0:9){
        print(i^2)
    }

    while (condition){
        # code block
    }

The while loop tests the condition at the start of each pass. If the condition fails then the loop exits.

eg
    i <- 0
    while (i < 10){
        print(i^2)
        i <- i + 1
    }

    repeat {
        # code block
    }

repeat will repeat the code block indefinitely (or until break is called).

eg
    i <- 0
    repeat {
        print(i^2)
        i <- i + 1
        if (i > 9){
            break
        }
    }

Graphical functions
-------------------

There are two different ways of producing graphics in S-Plus:
 - standard S graphics
 - trellis graphics - tries to bring a more consistent theme to graphic output
 - you may use either or both

First you need to open a graphics device of some type. The standard choice is motif() or openlook() to open a graphics window. You may also use postscript("filename") to write your output directly to a file rather than to a plotting window.

The main plotting function is plot(), which will generate a plot.

eg plot(X,Y) will generate a scatter plot of Y vs X.

We may introduce further parameters into the plot function, for example

    plot(X,Y,xlab="Label for X axis",ylab="Label for Y axis",sub="title below plot",main="Title above plot")

Other parameters include (by no means exhaustive):

    xlim=c(lo,hi)  to specify a range for the x axis
    ylim=c(lo,hi)  to specify a range for the y axis
    pch=           to specify the plotting character eg *, triangle, circle, square ....

You can add an additional line/curve to a plot with the lines() function.

eg suppose A and B are vectors of equal length, and A contains the x component and B the y component of a series of points. Then

    lines(A,B)

would join the points (a1,b1) to (a2,b2) to (a3,b3) ..... with line segments (and superimpose this on to the plot you have already created). With a fine enough grid this will look like a curve on your plot.

points() - does a similar thing but just adds the points to the plot rather than joining them with line segments.
abline(a,b) - add the line y=a+bx to the plot
title("some title here") - add a title to the plot
legend() - create a legend for a complex plot
hist() - histogram
boxplot() - boxplot
bwplot() - trellis graphics boxplot
qqnorm() - normal probability plot
qqline() - add a line to a qqnorm plot
par(mfrow=c(a,b)) - partition the graphics device into an a by b grid and fill it with graphs by row. This means the first plot() (or hist() or ...) command you issue will go in the first square of the grid, the second command will go in the second square, etc.
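Several of the functions above can be combined in one short session. The following is only a sketch: the file name sketch.ps and the data (a sine and cosine curve on a made-up grid) are invented for illustration, and postscript() is used so that no graphics window is needed. It is assumed to run in S-Plus and in R alike.

```r
# open a postscript device; "sketch.ps" is an arbitrary file name
postscript("sketch.ps")

# split the device into a 1 by 2 grid, filled by row
par(mfrow=c(1,2))

# made-up data: a grid of 50 x values and the sine of each
x <- seq(0, 2*pi, length=50)
y <- sin(x)

# first square: scatter plot with labels and a chosen plotting character
plot(x, y, xlab="x", ylab="sin(x)", main="Sine and cosine", pch="*")
lines(x, cos(x))    # superimpose the cosine curve as joined line segments
abline(0, 0)        # add the line y = 0 + 0*x, ie the horizontal axis

# second square: histogram of the y values
hist(y)

# close the device so the file is written out
dev.off()
```

Because par(mfrow=c(1,2)) is set before any plotting, plot() fills the left-hand square and hist() the right-hand one; lines() and abline() add to the current plot rather than starting a new one.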