Environmental Data Analysis and Visualization
A survey asked respondents their name and number of cats. The instructions said to enter the number of cats as a numerical value.
# A tibble: 60 × 3
   name           number_of_cats handedness
   <chr>          <chr>          <chr>
 1 Bernice Warren 0              left
 2 Woodrow Stone  0              left
 3 Willie Bass    1              left
 4 Tyrone Estrada 3              left
 5 Alex Daniels   3              left
 6 Jane Bates     2              left
 7 Latoya Simpson 1              left
 8 Darin Woods    1              left
 9 Agnes Cobb     0              left
10 Tabitha Grant  0              left
# ℹ 50 more rows
What is the “type” of the number_of_cats variable?
If your data does not behave how you expect it to, type coercion when reading in the data might be the reason.
Go in and investigate your data, apply the fix, save your data, live happily ever after.
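As a rough sketch of what "investigate, then fix" might look like: the small data frame below is made up for illustration (its names and entries are hypothetical, not the survey's actual responses). It shows how a single text entry turns a whole column into character, and one way to repair it.

```r
# Hypothetical responses; the real survey data are not reproduced here.
cats <- data.frame(
  name = c("A", "B", "C"),
  number_of_cats = c("0", "three", "2"),
  stringsAsFactors = FALSE
)

# One text entry is enough to make the whole column character.
typeof(cats$number_of_cats)

# Investigate: which entries cannot be parsed as numbers?
parsed <- suppressWarnings(as.numeric(cats$number_of_cats))
cats$number_of_cats[is.na(parsed)]

# Apply the fix by hand, then convert the column explicitly.
cats$number_of_cats[cats$number_of_cats == "three"] <- "3"
cats$number_of_cats <- as.numeric(cats$number_of_cats)
typeof(cats$number_of_cats)
```

Fixing entries by hand is only practical when there are a few of them; the point is to look before converting, since as.numeric() alone would silently turn "three" into NA.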
logical - boolean values TRUE and FALSE
character - character strings
double - floating point numerical values (default numerical type)
integer - integer numerical values (indicated with an L suffix, e.g. 3L)
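To see these four types in action, typeof() can be called on a single value of each kind. The values are arbitrary examples.

```r
typeof(TRUE)     # "logical"
typeof("hello")  # "character"
typeof(1.5)      # "double"
typeof(1)        # "double": a bare number defaults to double
typeof(1L)       # "integer": the L suffix marks an integer
```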
Vectors can be constructed using the c() function.
R will happily convert between various types without complaint when different types of data are concatenated in a vector, and that’s not always a great thing!
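A short sketch of that silent conversion, using arbitrary example values (x, y, and z are just illustrative names). No warning or message is printed for any of these.

```r
# Integer and double together: everything becomes double.
x <- c(1L, 2.5)
typeof(x)  # "double"

# Logical and integer together: TRUE becomes 1L.
y <- c(TRUE, 2L)
typeof(y)  # "integer"

# A number and a string together: the number becomes the string "3".
z <- c(3, "cats")
typeof(z)  # "character"
z          # "3" "cats"
```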
Let’s give formal names to what we’ve seen so far:
Explicit coercion is when you call a function like as.logical(), as.numeric(), as.integer(), as.double(), or as.character().
Implicit coercion happens when you use a vector in a specific context that expects a certain type of vector (like combining multiple data types in one vector).
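The two kinds of coercion can be put side by side in a small sketch. The values and variable names here are illustrative only.

```r
# Explicit coercion: we ask for the conversion by name.
a <- as.integer("3")     # 3L
b <- as.character(2.5)   # "2.5"
d <- as.logical(0)       # FALSE

# Text that is not a number becomes NA (R warns; suppressed here).
e <- suppressWarnings(as.numeric("three"))

# Implicit coercion: sum() expects numbers, so TRUE/FALSE are
# treated as 1/0 without us asking.
n_true <- sum(c(TRUE, FALSE, TRUE))  # 2
```

Counting TRUE values with sum() is a case where implicit coercion is convenient; the NA from "three" is a case where explicit coercion can quietly lose information.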
Suppose we want to know the type of c(1, "a").
First, I’d look at typeof(1) and typeof("a") and make a guess about what type R thinks the vector is, based on the type of each element of the vector. Then finally I’d check typeof(c(1, "a")).
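Worked through in R, the guess-then-check routine for this vector looks something like the following (v is just an illustrative name).

```r
# Step 1: look at the type of each element on its own.
typeof(1)    # "double"
typeof("a")  # "character"

# Step 2: guess. A number can be written as text, but "a" cannot be
# written as a number, so character is the likely result.

# Step 3: check.
v <- c(1, "a")
typeof(v)    # "character"
v            # "1" "a"
```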
NA: Not available
NaN: Not a number
Inf: Positive infinity
-Inf: Negative infinity
NAs are special ❄️s
Some functions will not return a useful result if the data contains NAs (mean(), for example, returns NA). Usually, they include an optional argument (often na.rm) to specify whether to remove NAs. Otherwise you can use the drop_na() function from tidyr to remove them yourself.
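A minimal sketch of the special values and of both ways to handle NAs. The vector and data frame are made up for illustration, and the drop_na() call assumes the tidyr package is installed, with a base R fallback if it is not.

```r
# Where the special values come from.
1 / 0    # Inf
-1 / 0   # -Inf
0 / 0    # NaN

x <- c(1, 2, NA, 4)

# By default the NA propagates into the result.
mean(x)                 # NA

# The optional argument removes NAs before computing.
mean(x, na.rm = TRUE)   # mean of 1, 2, 4

# Removing incomplete rows yourself.
df <- data.frame(id = 1:4, x = x)
if (requireNamespace("tidyr", quietly = TRUE)) {
  complete <- tidyr::drop_na(df)
} else {
  complete <- df[complete.cases(df), ]  # base R fallback
}
nrow(complete)          # 3
```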
AE 07 - Data types and classes
> open type-coercion.qmd.
What is the type of the given vectors? First, guess. Then, try it out in R. If your guess was correct, great! If not, discuss why they have that type.