3  Data Types

3.1 Objectives

  • Learn about the basic data types, including numeric, logical, and character
  • Learn how to identify the data type of an object
  • Learn how to change the data type of an object

3.2 Identification

To identify the data type of an object, you can use the class() function. Let’s use the class() function on the x variable we created in the last section:

x <- 2 # reassign your variable if you need to
class(x) # call the class() function on x
[1] "numeric"

As you can see, the class() function shows that x, which holds the value 2, is of the numeric data type. We will use class() throughout this section to test our understanding.
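Alongside class(), base R also has a family of is.*() functions that answer a yes/no question about an object's type. The short sketch below uses illustrative variable names (n, s, and b are not from the workshop) and touches on the character and logical types covered later in this chapter:

```r
# Illustrative values; these names are assumptions for the example
n <- 2
s <- "two"
b <- TRUE

class(n)         # "numeric"
class(s)         # "character"
class(b)         # "logical"

# is.*() functions return TRUE or FALSE instead of the type name
is.numeric(n)    # TRUE
is.character(n)  # FALSE
is.logical(b)    # TRUE
```

The is.*() form tends to be handy when you only need to confirm one specific type.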

3.3 Numbers

R has three number data types, but you will usually just use the numeric type.

3.3.1 The numeric data type

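The numeric data type can represent negative or positive numbers, with or without a decimal part. Values are stored as double-precision floating point, so only about 15 to 16 significant digits are kept. There are a lot of ways to manipulate these numbers in R, and whole numbers and decimals alike share this one type. For example: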

class(2)
[1] "numeric"
class(3.14)
[1] "numeric"
class(-0.4621473)
[1] "numeric"

When dealing with decimals, you can use round() to round to the nearest whole number or floor() to round down.

round(3.56)
[1] 4
floor(3.56)
[1] 3
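A few related base R functions round in other directions. The sketch below is only meant to show how they differ. Note in particular that round() sends exact halves to the nearest even number, which can be surprising:

```r
round(3.56, digits = 1)  # 3.6, keep one decimal place
ceiling(3.56)            # 4, always rounds up
floor(-3.56)             # -4, floor moves toward negative infinity
trunc(-3.56)             # -3, trunc simply drops the decimal part
round(2.5)               # 2, an exact half goes to the even neighbour
round(3.5)               # 4
```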

3.3.2 The integer data type

The integer data type stores whole numbers only, in the range of roughly \(-2,000,000,000\) to \(2,000,000,000\) (exactly \(\pm 2,147,483,647\)). This data type is rarely used in practice, but it helps to illustrate the nuance of some data types and why it’s important to check what something is.

i <- as.integer(5)
i
[1] 5
class(i)
[1] "integer"
class(5) # a plain 5 is stored as numeric
[1] "numeric"

Notice that you need to explicitly create an integer with the as.integer() function (or with the L suffix, as in 5L), because a plain i <- 5 creates a numeric even though the two print exactly the same. The primary difference between the two is storage size: an integer takes 4 bytes, while a numeric takes 8. If you need to store a very large number of values, and they are all integers, the integer data type may be more appropriate.
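To make the size difference concrete, here is a rough sketch. It uses integer() and numeric() to create long runs of zeros, which is a preview of the data structures in the next chapter, and the names ints and nums are only illustrative:

```r
i <- 5L                      # the L suffix creates an integer directly
class(i)                     # "integer"
identical(i, as.integer(5))  # TRUE

.Machine$integer.max         # 2147483647, the largest integer R can store

# Compare the memory used by one million integers and one million numerics
ints <- integer(1e6)
nums <- numeric(1e6)
as.numeric(utils::object.size(ints)) < as.numeric(utils::object.size(nums))  # TRUE
```

On a typical installation the integer version takes about half the memory of the numeric one.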

3.3.3 The complex data type

The complex data type is also rarely used, but is sometimes necessary in mathematical or engineering applications. This data type is used to represent complex numbers.

c <- 3+2i
c
[1] 3+2i
class(c)
[1] "complex"

Unlike the integer data type, complex objects are automatically detected by R if you use the \(a+bi\) notation. However, if you want to be explicit, you can use the complex() or as.complex() functions.

complex(real = 3, imaginary = 2)
[1] 3+2i
as.complex(3+2i)
[1] 3+2i
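Once you have a complex number, base R can pull it apart again. The following is a small sketch (the name z is illustrative) showing the extractor functions, plus one case where the complex type is actually needed:

```r
z <- complex(real = 3, imaginary = 2)

Re(z)    # 3, the real part
Im(z)    # 2, the imaginary part
Mod(z)   # the modulus, sqrt(3^2 + 2^2)
Conj(z)  # 3-2i, the complex conjugate

# sqrt(-1) on a plain numeric gives NaN with a warning,
# but the complex version has an answer
sqrt(as.complex(-1))  # 0+1i
```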

3.3.4 Special Numbers and Functions

There are a few special numbers and functions to be aware of:

Pi, \(\pi\)

pi
[1] 3.141593

Euler’s number, \(e\)

exp(1) # e = e^1
[1] 2.718282
exp(5) # e^5
[1] 148.4132

Logarithms

By default, the log() function takes the natural log of a number, \(ln()\), but you can specify another base using the base argument.

log(2)
[1] 0.6931472
log(2, base = 10)
[1] 0.30103
log(2) == log(2, base = exp(1))
[1] TRUE
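Base R also ships shortcuts for the two most common bases. The sketch below shows them, along with a more cautious way to compare computed decimals, since floating-point results can differ in their last digits:

```r
log10(100)    # 2, shorthand for log(100, base = 10)
log2(8)       # 3, shorthand for log(8, base = 2)
log(exp(5))   # 5, log() undoes exp()

# all.equal() allows a tiny tolerance, which makes it a safer check
# than == when comparing the results of calculations
isTRUE(all.equal(exp(log(5)), 5))  # TRUE
```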

3.4 The logical data type

We have already seen the logical data type earlier in this workshop. This data type can be either TRUE or FALSE. Typically, logical values are not created explicitly, but rather result from Boolean comparisons.

x <- TRUE # explicitly
class(x)
[1] "logical"
x
[1] TRUE
y <- 1 == 1 # boolean comparisons
y
[1] TRUE
z <- "apples" == "oranges"
z
[1] FALSE

If you use mathematical operations on a logical data type, R will treat TRUE as \(1\) and FALSE as \(0\). This can be helpful because you can add together several logical values to easily see how many of them are TRUE.

x + y + z # TRUE + TRUE + FALSE = 1 + 1 + 0
[1] 2
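As a slightly more realistic sketch, suppose you want to count how many tests a student passed. The scores, the pass mark of 50, and the variable names are all made up for illustration:

```r
# Hypothetical scores compared against a hypothetical pass mark of 50
passed_math    <- 71 >= 50   # TRUE
passed_science <- 42 >= 50   # FALSE
passed_english <- 88 >= 50   # TRUE

as.numeric(passed_math)                           # 1
sum(passed_math, passed_science, passed_english)  # 2 subjects passed

# Logical values can also be combined directly
passed_math & passed_science   # FALSE, both must be TRUE
passed_math | passed_science   # TRUE, at least one is TRUE
!passed_science                # TRUE, negation flips the value
```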

3.5 The character data type

The character data type is used to hold strings of characters.

x <- "apple"
class(x)
[1] "character"
x
[1] "apple"

Notice that the class() function itself returned another character object, which contains the word "character"! This illustrates how everything in R is an object.
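A few base R functions work directly on character objects. This is only a small sample, using an illustrative variable named fruit:

```r
fruit <- "apple"

nchar(fruit)         # 5, the number of characters
toupper(fruit)       # "APPLE"
paste(fruit, "pie")  # "apple pie", joined with a space by default

class(class(fruit))  # "character", the result of class() is itself a string
class("2")           # "character", digits inside quotes are not a number
```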

3.6 Coercion

Sometimes you may need to convert an object from one data type to another, which is known as data type coercion. A common example is moving between numeric and character.

x <- 2
class(x)
[1] "numeric"
x
[1] 2
y <- as.character(x)
class(y)
[1] "character"
y
[1] "2"

Other functions that do data type coercion include:

as.integer()
as.numeric()
as.logical()

# We'll visit these in the next section
as.factor()
as.vector() 
as.data.frame()
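Coercion does not always behave the way you might guess, so it is worth trying a few cases. The sketch below uses made-up inputs and the illustrative name bad. The main things to watch for are that as.integer() drops decimals rather than rounding, and that a failed conversion gives NA:

```r
as.numeric("3.14")   # 3.14
as.integer(3.9)      # 3, the decimal part is dropped, not rounded
as.logical("TRUE")   # TRUE
as.logical(0)        # FALSE, zero is FALSE and non-zero numbers are TRUE
as.character(TRUE)   # "TRUE"

# Text that does not look like a number becomes NA, with a warning
# (suppressWarnings() hides the warning here to keep the output tidy)
bad <- suppressWarnings(as.numeric("apple"))
bad                  # NA
```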

3.7 Missing Values

In R, a missing value is denoted by NA. Any variable can be set to a missing value, and you can check whether a variable is a missing value using the is.na() function. The real benefit of missing values comes when they are used in data structures, which we will explore in the next chapter.

x <- NA
is.na(x)
[1] TRUE
x <- 2
is.na(x)
[1] FALSE
x <- ""
is.na(x)
[1] FALSE
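Missing values tend to spread through calculations, which is why is.na() matters. The following sketch shows the behavior with a few made-up numbers:

```r
NA + 1                       # NA, the result of an unknown value is unknown
NA == NA                     # NA, so use is.na() rather than == to test
sum(1, NA, 3)                # NA
sum(1, NA, 3, na.rm = TRUE)  # 4, na.rm = TRUE drops missing values first

# NA has a data type too
class(NA)                    # "logical"
class(NA_character_)         # "character"
```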

Occasionally you may see NaN, which occurs when a calculation has no defined numeric result, such as 0/0 (dividing a non-zero number by zero gives Inf or -Inf instead). It stands for “Not a Number”, and is.na() treats it as a missing value. However, an NA is not considered a NaN when using the is.nan() function, so be careful.

is.na(NaN)
[1] TRUE
is.nan(NaN)
[1] TRUE
is.na(NA)
[1] TRUE
is.nan(NA)
[1] FALSE
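It can help to see exactly which calculations produce NaN and which produce infinity. A short sketch using only base R:

```r
0 / 0               # NaN, zero divided by zero is undefined
1 / 0               # Inf
-1 / 0              # -Inf
Inf - Inf           # NaN

is.nan(1 / 0)       # FALSE, Inf is not NaN
is.infinite(1 / 0)  # TRUE
is.finite(1 / 0)    # FALSE
is.finite(NA)       # FALSE, missing values are not finite either
```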