[1] "numeric"
4 Data Structures
4.1 Objectives
- Learn how to construct vectors, data frames, matrices, arrays, and lists
- Learn how each data structure relates to the others
- Subset and modify objects
- Know when to use each structure
4.2 Vectors
Vectors are the basic building block of R.
They contain elements of the same type.
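To make "same type" concrete, here is a minimal sketch that builds a few vectors with c() and shows how R coerces mixed input to one common type. The variable names are illustrative and do not come from the original text.

```r
# Build vectors with c(); every element of a vector shares one type
nums  <- c(1.5, 2, 3)          # double, reported as "numeric"
words <- c("a", "b", "c")      # character
flags <- c(TRUE, FALSE, TRUE)  # logical

class(nums)   # "numeric"
class(words)  # "character"
class(flags)  # "logical"

# Mixing types triggers coercion to the most flexible type present
mixed <- c(1, "a", TRUE)
class(mixed)  # "character"
mixed         # "1" "a" "TRUE"
```

Coercion happens silently, so checking class() after combining values is a cheap way to catch surprises.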
4.2.1 Subsetting
Use square brackets [] to select elements by position. R counts positions from 1.
x <- c("a", "b", "c", "d")
x[1] # first element
[1] "a"
x[2:4] # range
[1] "b" "c" "d"
x[c(1, 4)] # multiple positions
[1] "a" "d"
Logical subsetting:
y <- 1:5
y[y > 3]
[1] 4 5
Replace values:
y[y > 3] <- 0
y
[1] 1 2 3 0 0
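Two related idioms are sketched here as a supplement to the bracket examples above: negative positions, which drop elements, and which(), which turns a logical test into positions. The block starts from a fresh y so it can be run on its own.

```r
y <- 1:5

# Negative positions drop elements instead of selecting them
y[-1]         # 2 3 4 5
y[-c(1, 5)]   # 2 3 4

# The logical vector that drives y[y > 3]
y > 3         # FALSE FALSE FALSE TRUE TRUE

# which() converts that logical vector into positions
which(y > 3)  # 4 5
```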
4.2.2 Factors
Factors represent categorical data.
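As a small sketch of how a factor stores categories, the example below converts a character vector into a factor with an explicit level order. The sizes data is hypothetical and used only for illustration.

```r
# A character vector of categories (hypothetical example data)
sizes <- c("small", "large", "medium", "small")

# Convert to a factor, fixing the order of the levels
f <- factor(sizes, levels = c("small", "medium", "large"))

levels(f)      # "small" "medium" "large"
as.integer(f)  # 1 3 2 1, the integer codes stored underneath
table(f)       # counts per level: small 2, medium 1, large 1
```

Without the levels argument, factor() orders the levels alphabetically, which is rarely the order wanted for ordered categories such as sizes.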
4.3 Data Frames
A data frame is a 2D structure (rows × columns).
Each column is a vector.
Columns may have different types.
df <- data.frame(
name = c("Ana", "Ben", "Chris"),
age = c(20, 21, 19),
passed = c(TRUE, TRUE, FALSE)
)
str(df)
'data.frame': 3 obs. of 3 variables:
 $ name  : chr "Ana" "Ben" "Chris"
 $ age   : num 20 21 19
 $ passed: logi TRUE TRUE FALSE
4.3.1 Loading and Saving
Read CSV:
df <- read.csv("data.csv")
Write CSV:
write.csv(df, "output.csv", row.names = FALSE)
4.3.2 Subsetting and Combining
Extract column:
df$age
[1] 20 21 19
df[, "age"]
[1] 20 21 19
Extract rows:
df[df$age > 19, ]
  name age passed
1  Ana  20   TRUE
2  Ben  21   TRUE
Add column:
df$status <- df$age > 20
Combine:
rbind(df, df)
   name age passed status
1   Ana  20   TRUE  FALSE
2   Ben  21   TRUE   TRUE
3 Chris  19  FALSE  FALSE
4   Ana  20   TRUE  FALSE
5   Ben  21   TRUE   TRUE
6 Chris  19  FALSE  FALSE
cbind(df, new_col = 1:3)
   name age passed status new_col
1   Ana  20   TRUE  FALSE       1
2   Ben  21   TRUE   TRUE       2
3 Chris  19  FALSE  FALSE       3
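Row filtering and column selection can also be combined in a single bracket call. The sketch below rebuilds the example data frame so it runs on its own, and it contrasts [[ ]], which returns a column as a vector, with single brackets, which keep a data frame. The names passed_df, ages, and age_df are illustrative.

```r
df <- data.frame(
  name = c("Ana", "Ben", "Chris"),
  age = c(20, 21, 19),
  passed = c(TRUE, TRUE, FALSE)
)

# Filter rows and pick columns in one step
passed_df <- df[df$passed, c("name", "age")]
passed_df

# [[ ]] returns the column as a plain vector, like $
ages <- df[["age"]]
class(ages)    # "numeric"

# Single-bracket column selection keeps a data frame
age_df <- df["age"]
class(age_df)  # "data.frame"
```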
4.4 Matrices and Arrays
A matrix is a vector with dimensions.
All elements must be the same type.
m <- matrix(1:12, nrow = 3, ncol = 4)
m[2, 3] # row 2, column 3
[1] 8
m[2, ] # row 2
[1]  2  5  8 11
m[, 4] # column 4
[1] 10 11 12
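The statement that a matrix is a vector with dimensions can be checked directly. This sketch assigns a dim attribute to a plain vector, then shows the byrow argument of matrix() and the same-type rule. The names v and m_row are illustrative.

```r
v <- 1:12
dim(v) <- c(3, 4)  # now a 3 x 4 matrix, filled column by column
is.matrix(v)       # TRUE
v[2, 3]            # 8

# byrow = TRUE fills across rows instead of down columns
m_row <- matrix(1:12, nrow = 3, byrow = TRUE)
m_row[2, 3]        # 7

# Mixed input is coerced to one type, just as in a vector
chr_m <- matrix(c(1, "a", 2, "b"), nrow = 2)
class(chr_m[1, 1]) # "character"
```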
Matrix operations:
t(m) # transpose
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
[4,]   10   11   12
m * m # element-wise
     [,1] [,2] [,3] [,4]
[1,]    1   16   49  100
[2,]    4   25   64  121
[3,]    9   36   81  144
m %*% t(m) # matrix multiplication
     [,1] [,2] [,3]
[1,]  166  188  210
[2,]  188  214  240
[3,]  210  240  270
An array extends a matrix to three or more dimensions. As in a matrix, all elements must be the same type.
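A minimal sketch of a three-dimensional array, assuming illustrative dimensions of 2 rows, 3 columns, and 4 layers:

```r
# A 2 x 3 x 4 array: 2 rows, 3 columns, 4 layers
a <- array(1:24, dim = c(2, 3, 4))

dim(a)      # 2 3 4
a[1, 2, 3]  # row 1, column 2, layer 3 -> 15
a[, , 1]    # the first layer, a 2 x 3 matrix
```

Subsetting follows the same pattern as for matrices, with one index per dimension.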
4.5 Lists
A list can store different types and sizes of objects.
my_list <- list(
  numbers = 1:3,
  name = "Ana",
  matrix = matrix(1:4, nrow = 2),
  df = df
)
my_list
$numbers
[1] 1 2 3

$name
[1] "Ana"

$matrix
     [,1] [,2]
[1,]    1    3
[2,]    2    4

$df
   name age passed status
1   Ana  20   TRUE  FALSE
2   Ben  21   TRUE   TRUE
3 Chris  19  FALSE  FALSE
Subsetting lists:
my_list[1] # returns sub-list
$numbers
[1] 1 2 3
my_list[[1]] # returns element
[1] 1 2 3
my_list$name
[1] "Ana"
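The difference between [ ] and [[ ]] is easiest to see by checking the class of each result. The sketch below uses a smaller stand-in for my_list so it runs on its own, and also shows adding and removing elements.

```r
my_list <- list(numbers = 1:3, name = "Ana")

class(my_list[1])        # "list": still a list, of length 1
class(my_list[[1]])      # "integer": the element itself

# Chain subsetting to reach inside an element
my_list[["numbers"]][2]  # 2

# Add or replace an element by name
my_list$passed <- TRUE
names(my_list)           # "numbers" "name" "passed"

# Assigning NULL removes an element
my_list$name <- NULL
length(my_list)          # 2
```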
4.6 Summary: When to Use What
| Structure | Same Type? | Dimensions | Use Case |
|---|---|---|---|
| Vector | Yes | 1D | Single variable |
| Factor | Yes | 1D | Categories |
| Data Frame | No (by column) | 2D | Tabular data |
| Matrix | Yes | 2D | Math operations |
| Array | Yes | 3D+ | Multidimensional data |
| List | No | Any | Complex objects |
Core idea: Everything in R builds from vectors.
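A few quick checks illustrate how the structures relate to one another. This is a sketch with small, hypothetical example objects.

```r
df <- data.frame(x = 1:3, y = c("a", "b", "c"))
m  <- matrix(1:6, nrow = 2)

is.list(df)         # TRUE: a data frame is a list of equal-length columns
is.vector(df$x)     # TRUE: each column is a vector
attributes(m)       # $dim 2 3: a matrix is a vector plus a dim attribute
as.vector(m)        # 1 2 3 4 5 6: dropping dim gives back the vector
is.vector(list(1))  # TRUE: a list is itself a generic vector
```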