Basic Commands

  • Use the function c() to combine values into a vector.
  • Get the length of vectors(including lists) and factors by using length().
    x <- c(1,3,2,5)
    length(x)
    
  • The ls() function allows us to look at a list of all of the objects, such as data and functions, that we have saved so far.
  • The rm() function can by used to delete any that we don't want, also possible to remove all objects at once.

    ls()
    rm(x,y)
    rm(list=ls())
    
  • The matrix() function ca be used to create a matrix of numbers. We focus on the first three inputs: the data, the number of rows, the number of column.

  • Alternatively, the byrow=TRUE option can be used to populate the matrix in order of the rows.
  • The sqrt() function returns the square root of each element of a vector or matrix; The command x^2 raises each element of x to the power 2; any powers are possible, including fractional or negative powers.

    x = matrix(data=c(1,2,3,4), nrow=2, ncol=2)
    
  • The rnorm() function generates a vector of random normal variables, with first argument the sample size.

  • The cor() function to compute the correlation between .

    x = rnorm(50)
    y = x + rnorm(50, mean=50, sd=.1)
    cor(x, y)
    
  • The mean() and var() functions can be used to compute the mean and variance of a vector of numbers

  • Applying sqrt() to the output of var() will give the standard deviation. Or we can simply use the sd() function.
    set.seed (3)
    y=rnorm(100)
    mean(y)
    var(y)
    sqrt(var(y))
    sd(y)
    

Graphics

  • The plot() function is the primary way to plot data in R.
  • To create a pdf, we use the pdf() function, and to create a jpeg, we use the jpeg() function.
  • The function dev.off() indicates to R that we are done creating the plot. ~~~ x=rnorm(100) y=rnorm(100) plot(x,y) plot(x,y,xlab="this is the x-axis",ylab="this is the y-axis", main="Plot of X vs Y")

pdf("Figure.pdf") plot(x,y,col="green") dev.off()


* The **seq()** function can be used to create a sequence of numbers.

x=seq(1,10) x=seq(-pi,pi,length=50)


* The **contour()** function produces a contour plot in order to represent three-dimensional data; it is like a topographical map.

contour (x,y,f) contour(x,y,f,nlevels=45,add=T) fa=(f-t(f))/2 contour(x,y,fa,nlevels=15)


* The **image()** function works the same way as **contour()**, except that it produces a color-coded plot whose colors depend on the z value.
* sometimes used to plot temperature in weather forecasts.
* Alternatively, **persp()** can be used to produce a three-dimensional plot. The arguments **theta** and **phi** control the angles at which the plot is viewed.


## Indexing Data

A=matrix(1:16,4,4) A[2,3] A[c(1,3), c(2,4)] A[1:3, 2:4] A[1:2,] A[,1:2]

The use of a negative sign - in the index tells R to keep all rows or columns except those indicated in the index.

A[-c(1,3),]

The **dim()** function outputs the number of rows followed by the number of columns of a given matrix

dim(A)


## Loading Data
We can use the function **write.table()** to export data.Once the data has been loaded, the **fix()** function can be used to view it in a spreadsheet like window.

Auto = read.table("Auto.data") fix(Auto)


The data set also includes a number of missing observations, indicated by a question mark ?. Missing values are a common occurrence in real data sets.

Using the option **header=T(or header=TRUE)** in the **read.table()** function tells R that the first line of the file contains the variable names, and using the option **na.strings** tells R that any time it sees a particular character or set of characters, it should be treated as a missing element of the data matrix.

Auto = read.table("Auto.data", header=T, na.strings="?") fix(Auto) dim(Auto)

397 9

There are various ways to deal with the missing data, we choose to use the **na.omit()** function to simply remove these rows.

Auto=na.omit(Auto) dim(Auto)

392 9

Once the data are loaded correctly, we can use **names()** to check the variable names

names(Auto)


## Additional Graphical and Numerical Summaries
The **as.factor()** function converts quantitative variables into qualitative variables.

attach(Auto) plot(cylinders, mpg)

cylinders = as.factor(cylinders)


If the variable plotted on the x-axis is categorial, then *boxplots* will automatically be produced by the *plot()* function.

plot(cylinders, mpg) plot(cylinders, mpg, varwidth=T)


The **hist()** function can be used to plot a histogram.

hist(mpg)


The **pairs()** function creates a scatterplot matrix.

pairs(Auto) pairs(~ mpg + displacement + horsepower + weight + acceleration, Auto)


In conjunction with the **plot()** function, **identify()** provides a useful interactive method for identifying the value for a particular variable for points on a plot.

plot(horsepower, mpg) identify(horsepower, mpg, name) ~~~

The summary() function produces a numerical summary of each variable in a particular data set.

results matching ""

    No results matching ""