R - Data Frames
Updated at 2017-10-13 14:56
Data frames are like database tables: they keep related data together. In R they behave much like matrices, so most operations that work on matrices also work on data frames. The most notable difference is that each column in a data frame can hold a different type of value.
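weights <- c(300, 200, 100, 250, 150)
prices <- c(9000, 5000, 12000, 7500, 18000)
types <- c("gold", "silver", "gems", "gold", "gems")
treasure <- data.frame(weights, prices, types)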
print(treasure)
#   weights prices  types
# 1     300   9000   gold
# 2     200   5000 silver
# 3     100  12000   gems
# 4     250   7500   gold
# 5     150  18000   gems
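To check that columns really can hold different types, a minimal sketch like the one below inspects the class of each column. The values are copied from the printed table above. `stringsAsFactors = FALSE` is an addition here, set explicitly so the text column stays character regardless of R version.

```r
# Rebuild the treasure data frame from the values printed above.
treasure <- data.frame(
  weights = c(300, 200, 100, 250, 150),
  prices  = c(9000, 5000, 12000, 7500, 18000),
  types   = c("gold", "silver", "gems", "gold", "gems"),
  stringsAsFactors = FALSE  # keep text as character, not factor
)

# One class per column: numeric and character columns sit side by side.
col.classes <- sapply(treasure, class)
print(col.classes)

# dim() works as it does for a matrix: rows first, then columns.
print(dim(treasure))
```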
Data frame columns have names, which you can use to access a column.
treasure[[2]]
# [1] 9000 5000 12000 7500 18000
names(treasure)
# [1] "weights" "prices"  "types"
treasure[["prices"]]
# [1] 9000 5000 12000 7500 18000
treasure$prices
# [1] 9000 5000 12000 7500 18000
mean(treasure$prices)
# [1] 10300
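One detail the access forms above do not show is the difference between single and double brackets. The sketch below, using the same prices and weights, suggests that `[[ ]]` and `$` return the column as a plain vector, while `[ ]` returns a one-column data frame. The variable names `price.vector` and `price.frame` are illustrative only.

```r
treasure <- data.frame(
  weights = c(300, 200, 100, 250, 150),
  prices  = c(9000, 5000, 12000, 7500, 18000)
)

# Double brackets (and $) extract the column itself as a vector.
price.vector <- treasure[["prices"]]

# Single brackets keep the data frame wrapper around the column.
price.frame <- treasure["prices"]

print(is.numeric(price.vector))    # vector, so mean() works directly
print(is.data.frame(price.frame))  # still a data frame with one column
print(mean(price.vector))
```

Functions such as `mean` expect a vector, which is why the `$` and `[[ ]]` forms are the ones used with it.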
You can filter rows by the values in a column.
high.value <- subset(treasure, prices > 8000)
print(high.value)
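The same filter can also be written with logical indexing, which may make it clearer what `subset` is doing. This is a sketch using the treasure values from above. The name `by.index` is illustrative.

```r
treasure <- data.frame(
  weights = c(300, 200, 100, 250, 150),
  prices  = c(9000, 5000, 12000, 7500, 18000),
  types   = c("gold", "silver", "gems", "gold", "gems"),
  stringsAsFactors = FALSE
)

# A logical vector with one TRUE/FALSE per row.
keep <- treasure$prices > 8000

# Rows go before the comma; leaving the column slot empty keeps all columns.
by.index <- treasure[keep, ]
print(by.index)

# subset() evaluates the condition inside the data frame, so the
# column can be named without the treasure$ prefix.
high.value <- subset(treasure, prices > 8000)
print(nrow(high.value) == nrow(by.index))
```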
You can load files with the read family of functions: read.csv, read.delim and read.table.
# CSV file
read.csv("targets.csv")
#          Port Population Worth
# 1   Cartagena      35000 10000
# 2 Porto Bello      49000 15000
# 3      Havana     140000 50000
# 4 Panama City     105000 35000
# Tab-delimited text file
read.delim("targets.txt")
#          Port Population Worth
# 1   Cartagena      35000 10000
# 2 Porto Bello      49000 15000
# 3      Havana     140000 50000
# 4 Panama City     105000 35000
# Another way to read tab-delimited files.
read.table("infantry.txt", sep="\t")
# Specify first row as headers.
read.table("infantry.txt", sep="\t", header=TRUE)
#          Port Infantry
# 1 Porto Bello      700
# 2   Cartagena      500
# 3 Panama City     1500
# 4      Havana     2000
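To see what `header=TRUE` changes, the sketch below writes a small tab-delimited file to a temporary path and reads it both ways. The file contents are two rows copied from the infantry table above. The temporary file and the names `no.header` and `with.header` are assumptions made for the example.

```r
# Write a two-row tab-delimited file to a temporary location.
path <- tempfile(fileext = ".txt")
writeLines(c("Port\tInfantry", "Porto Bello\t700", "Cartagena\t500"), path)

# Without header=TRUE the first line is read as data.
no.header <- read.table(path, sep = "\t", stringsAsFactors = FALSE)

# With header=TRUE the first line supplies the column names.
with.header <- read.table(path, sep = "\t", header = TRUE,
                          stringsAsFactors = FALSE)

print(names(no.header))    # default names V1 and V2
print(names(with.header))  # names taken from the file
unlink(path)               # remove the temporary file
```

Without the header, the word "Infantry" ends up in the data, so the whole second column is read as text rather than numbers.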
Combining data frames. You can add them either as columns with cbind or as rows with rbind.
df <- data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE)
cbind(df, data.frame(z = 3:1)) # Number of rows must match.
#   x y z
# 1 1 a 3
# 2 2 b 2
# 3 3 c 1
rbind(df, data.frame(x = 10, y = "z")) # Column names must match.
#    x y
# 1  1 a
# 2  2 b
# 3  3 c
# 4 10 z
Merging data frames. By default, merge joins on the columns that have the same name in both data frames.
targets <- read.csv("targets.csv")
infantry <- read.table("infantry.txt", sep="\t", header=TRUE)
merge(x = targets, y = infantry)
#          Port Population Worth Infantry
# 1   Cartagena      35000 10000      500
# 2      Havana     140000 50000     2000
# 3 Panama City     105000 35000     1500
# 4 Porto Bello      49000 15000      700
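The default merge keeps only rows that have a match in both data frames. The sketch below names the key explicitly with `by` and uses `all.x = TRUE` to keep unmatched rows from the first data frame. The data is a cut-down, inline version of the tables above, with Havana deliberately left out of `infantry` to show the effect.

```r
targets <- data.frame(
  Port  = c("Cartagena", "Porto Bello", "Havana"),
  Worth = c(10000, 15000, 50000),
  stringsAsFactors = FALSE
)
infantry <- data.frame(
  Port     = c("Porto Bello", "Cartagena"),
  Infantry = c(700, 500),
  stringsAsFactors = FALSE
)

# Default behaviour: only ports present in both data frames survive.
inner <- merge(targets, infantry, by = "Port")
print(inner)

# all.x = TRUE keeps every row of targets; missing values become NA.
left <- merge(targets, infantry, by = "Port", all.x = TRUE)
print(left)
```

Naming the key with `by` is optional here, but it guards against an accidental join on some other column that happens to share a name.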
Remember to convert time data to Date objects. This helps when creating graphs and calculating time differences.
targets <- read.csv("targets.csv")
# Assumes targets.csv has a date column with values like 13-Oct-17.
targets$date <- as.Date(targets$date, "%d-%b-%y")
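As a rough illustration of why the conversion matters, the sketch below parses two date strings and subtracts them. The dates and the `sail.dates` name are hypothetical. A numeric month format (`%m`) is used instead of `%b` because `%b` matches month abbreviations in the current locale, so a string like "Oct" may fail to parse on a non-English system.

```r
# Hypothetical dates stored as text, day-month-year.
sail.dates <- c("13-10-2017", "25-12-2017")

# As character strings, subtraction is not possible.
# After conversion, R treats them as points in time.
parsed <- as.Date(sail.dates, format = "%d-%m-%Y")
print(class(parsed))

# Subtracting two Date objects gives a time difference in days.
gap <- parsed[2] - parsed[1]
print(gap)

# Dates also sort and compare chronologically.
print(parsed[1] < parsed[2])
```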
Sources
- Google Developers R Programming Videos
- Try R