A data.frame is essentially a list of vectors (or matrices) that all have the same number of rows; each vector becomes a column. It is much like the data in a spreadsheet:
var1 | var2 | var3 | fac1 | fac2 |
---|---|---|---|---|
1 | 3 | 4 | hot | TX |
0 | 2 | 3 | cold | WI |
3 | 8 | 2 | cold | UT |
In fact, when one reads in this type of info, one gets a data.frame back:
df = read.csv(".csv", quote="") # reads an unquoted CSV file with a header row
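To see this for yourself, here is a minimal sketch of the round trip. The temporary file and the name `csv_df` are assumptions made up for illustration; the rows are simply the sample table from above written out as CSV.

```r
# Hypothetical file: write the sample table to a temporary CSV first
path <- tempfile(fileext = ".csv")
writeLines(c("var1,var2,var3,fac1,fac2",
             "1,3,4,hot,TX",
             "0,2,3,cold,WI",
             "3,8,2,cold,UT"), path)

# quote = "" turns off quote handling, as in the call above
csv_df <- read.csv(path, quote = "")

class(csv_df)   # "data.frame"
names(csv_df)   # column names taken from the header row
dim(csv_df)     # number of rows, then number of columns

unlink(path)    # remove the temporary file
```

The header row supplies the column names because `read.csv` defaults to `header = TRUE`.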
So, assuming we have data like the table above, let's build it in R:
> var1 <- c(1,0,3) # '<-' is the assignment operator (like a directional '=')
# 'c()' is the combine function (it joins its arguments into a vector)
> var2 <- c(3,2,8)
> var3 <- array( c(4,3,2), c(3) ) # array() makes a 1-d array: a vector with a 'dim' attribute
> fac1 <- c("hot", "cold", "cold")
> fac2 <- c("TX","WI","UT")
# alright, let's combine all of this into a data.frame:
> df <- data.frame(var1,var2,var3,fac1,fac2, row.names=NULL, check.rows=TRUE)
# creates this:
  var1 var2 var3 fac1 fac2
1    1    3    4  hot   TX
2    0    2    3 cold   WI
3    3    8    2 cold   UT
# want to see the names of the columns in the dataset:
> names(df)
[1] "var1" "var2" "var3" "fac1" "fac2"
# change the names:
> names(df) <- c("height", "weight", "nose.length", "temp", "state")
> df
  height weight nose.length temp state
1      1      3           4  hot    TX
2      0      2           3 cold    WI
3      3      8           2 cold    UT
# so, now we have a data.frame, the preferred data structure in R. Let the magic begin:
> plot(df)
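The opening claim, that a data.frame is really a list of equal-length columns, can be checked directly. The sketch below rebuilds the renamed data.frame so it runs on its own; the use of `str()` and the list-style indexing are additions for illustration, not part of the walkthrough above.

```r
# Rebuild the example data.frame so this block is self-contained
df <- data.frame(height      = c(1, 0, 3),
                 weight      = c(3, 2, 8),
                 nose.length = c(4, 3, 2),
                 temp        = c("hot", "cold", "cold"),
                 state       = c("TX", "WI", "UT"))

is.list(df)          # TRUE: a data.frame is a list of columns
length(df)           # 5: one list element per column
sapply(df, length)   # every column has the same number of entries
df$weight            # one column, returned as a plain vector
df[["temp"]]         # the same idea, using list-style indexing
str(df)              # compact summary of each column's type
```

Whether `temp` and `state` come back as character vectors or factors depends on your R version (the default for `stringsAsFactors` changed in R 4.0.0), so check `str(df)` rather than assuming.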