This section covers the numeric, logical, and character data types.
To identify the data type of an object, you can use the class() function. Let’s use the class() function on the x variable we created in the last section:
x <- 2 # reassign your variable if you need to
class(x) # call the class() function on x
[1] "numeric"
As you can see, class() shows that x, which holds a 2, is of the numeric data type. We will use class() throughout this section to test our understanding.
R has three number data types (numeric, integer, and complex), but you will usually use just the numeric type.
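numeric data type
The numeric data type can represent any real number, negative or positive, with or without decimals (it is stored as a double-precision floating-point value, so precision is limited to roughly 15 to 16 significant digits). There are many ways to manipulate these numbers in R. For example: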
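As a minimal sketch of the arithmetic operators R provides for numeric values (the variable names a and b and their values are illustrative, not from the original workshop):

```r
a <- 7.5   # illustrative values
b <- -2

total      <- a + b    # addition: 5.5
product    <- a * b    # multiplication: -15
quotient   <- a / b    # division: -3.75
squared    <- a^2      # exponentiation: 56.25
remainder  <- a %% 2   # modulo (remainder after dividing by 2): 1.5
whole_part <- a %/% 2  # integer division: 3

class(a)               # "numeric"
```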
When dealing with decimals, you can round numbers or floor them using the round() and floor() functions.
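A short sketch of both functions; the input values are illustrative. Note that floor() always rounds toward negative infinity, so it behaves differently from round() on negative numbers:

```r
rounded_2dp  <- round(3.14159, digits = 2)  # keep two decimals: 3.14
rounded_int  <- round(3.7)                  # nearest whole number: 4
floored_pos  <- floor(3.7)                  # largest whole number <= 3.7: 3
floored_neg  <- floor(-3.7)                 # largest whole number <= -3.7: -4
```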
integer data type
The integer data type stores only whole numbers in the range of roughly \(-2,000,000,000\) to \(2,000,000,000\) (the exact limit is \(\pm 2,147,483,647\), available as .Machine$integer.max). This data type is rarely used in practice, but it helps to illustrate the nuance of some data types and why it's important to check what something is.
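A minimal sketch of creating an integer and comparing it with a plain numeric. The names j and k, and the L suffix shorthand, are additions for illustration and are not part of the original text:

```r
i <- as.integer(5)  # explicitly an integer
j <- 5              # a plain number is numeric by default
k <- 5L             # the L suffix is shorthand for an integer literal

class(i)            # "integer"
class(j)            # "numeric"
class(k)            # "integer"

i                   # prints 5, exactly like j would
i == j              # TRUE: the values are equal even though the types differ
```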
Notice that you need to explicitly create the variable with the as.integer() function, because R stores a plain number such as i <- 5 as numeric, and the printed output looks exactly the same either way. The primary difference between the two is size in memory: an integer takes 4 bytes, while a numeric takes 8. If you need to store a very large number of values, and they are all whole numbers, the integer data type may be more appropriate.
complex data type
The complex data type is also rarely used, but is sometimes necessary in mathematical or engineering applications. This data type is used to represent complex numbers.
Unlike the integer data type, complex objects are automatically detected by R if you use the \(a+bi\) notation. However, if you want to be extra careful, you can use the complex() or as.complex() functions.
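A brief sketch of the three ways to create a complex value; the variable names and values are illustrative. In R the imaginary part must carry a number, so write 1i rather than a bare i:

```r
z <- 3 + 2i                             # detected automatically from the a+bi notation
w <- complex(real = 3, imaginary = 2)   # built explicitly
v <- as.complex(2)                      # coerced from a numeric: 2+0i

class(z)   # "complex"
z == w     # TRUE
Re(z)      # real part: 3
Im(z)      # imaginary part: 2
```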
There are a few special numbers and functions to be aware of:
pi
[1] 3.141593
The log() function takes the natural log of a number, \(\ln()\), by default, but you can specify another base using the base argument.
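For instance, a small sketch with illustrative inputs (the variable names are not from the original):

```r
natural <- log(exp(1))          # natural log of e: 1
base10  <- log(100, base = 10)  # log base 10 of 100: 2
base2   <- log(8, base = 2)     # log base 2 of 8: 3

# R also ships shortcuts for the two most common bases
log10(100)                      # same as log(100, base = 10)
log2(8)                         # same as log(8, base = 2)
```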
logical data type
We have already seen the logical data type earlier in this workshop. A logical value is either TRUE or FALSE. Typically, this data type is not created explicitly, but rather through Boolean comparisons.
x <- TRUE # explicitly
class(x)
[1] "logical"
x
[1] TRUE
y <- 1 == 1 # boolean comparisons
y
[1] TRUE
z <- "apples" == "oranges"
z
[1] FALSE
If you use mathematical operations on logical values, R treats TRUE as \(1\) and FALSE as \(0\). This can be helpful because you can add together a bunch of logical values to easily see how many of them are TRUE.
x + y + z # TRUE + TRUE + FALSE = 1 + 1 + 0
[1] 2
character data type
The character data type is used to hold strings of characters.
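A minimal sketch, assuming an illustrative string stored in a variable named s:

```r
s <- "Hello, world!"   # character values are written in quotes

class(s)               # "character"
nchar(s)               # number of characters in the string: 13
class(class(s))        # "character": the result of class() is itself a character object
```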
Notice that the class() function itself returns another character object, which contains the word "character"! This illustrates how everything in R is an object.
Sometimes you may need to move between data types, which is known as data type coercion. A common example is moving between numeric and character.
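A short sketch of coercing in both directions with as.character() and as.numeric(); the variable names and values are illustrative. When a string cannot be read as a number, R returns NA and issues a warning, which is silenced here with suppressWarnings() only to keep the example quiet:

```r
n <- 42
n_as_text <- as.character(n)       # numeric -> character: "42"
class(n_as_text)                   # "character"

text_as_n <- as.numeric("3.14")    # character -> numeric: 3.14
class(text_as_n)                   # "numeric"

# A string that is not a number cannot be coerced and becomes NA
not_a_number <- suppressWarnings(as.numeric("apples"))
is.na(not_a_number)                # TRUE
```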
Other functions that do data type coercion include:
as.integer()
as.numeric()
as.logical()
# We'll visit these in the next section
as.factor()
as.vector()
as.data.frame()
In R, a missing value is denoted by NA. Any variable can be set to a missing value, and you can check whether a variable is missing using the is.na() function. Missing values are most useful inside data structures, which we will explore in the next chapter.
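A minimal sketch with illustrative variable names. Note that a calculation involving NA generally produces NA as well, because the result depends on a value that is unknown:

```r
m <- NA          # set a variable to a missing value
known <- 5

is.na(m)         # TRUE
is.na(known)     # FALSE

result <- m + 1  # arithmetic with a missing value is also missing
is.na(result)    # TRUE
```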
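Occasionally you may see NaN, which occurs when a calculation has no defined result, for example when you divide zero by zero (dividing a nonzero number by zero gives Inf or -Inf instead). It stands for “Not a Number”, and is.na() treats it as missing, just like an NA. However, an NA is not considered a NaN by the is.nan() function, so be careful.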
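A small sketch of how NaN and NA respond to is.na() and is.nan(); the variable names are illustrative:

```r
undefined <- 0 / 0    # NaN: zero divided by zero has no defined result
infinite  <- 1 / 0    # Inf: a nonzero number divided by zero is infinite, not NaN
missing   <- NA

is.nan(undefined)     # TRUE
is.na(undefined)      # TRUE: NaN counts as missing
is.nan(infinite)      # FALSE
is.na(missing)        # TRUE
is.nan(missing)       # FALSE: NA is not a NaN
```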