# A tibble: 1 × 7
carat depth table price x y z
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 0.798 61.7 57.5 3933. 5.73 5.73 3.54
where() functions
where(is.numeric) selects all numeric columns.
where(is.character) selects all string columns.
where(is.Date) selects all date columns (is.Date() comes from lubridate).
where(is.POSIXct) selects all date-time columns (is.POSIXct() comes from lubridate).
where(is.logical) selects all logical columns.
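As a rough illustration of the predicates above, the sketch below uses a small made-up tibble (`toy` and its columns are hypothetical, not data from these slides) to show `where()` inside both `across()` and `select()`.

```r
library(dplyr)

# Hypothetical toy data: one character, two numeric, and one logical column
toy <- tibble(
  id   = c("a", "b", "c"),
  x    = c(1, 2, 3),
  y    = c(10, 20, 30),
  flag = c(TRUE, FALSE, TRUE)
)

# Summarize only the numeric columns (x and y)
num_means <- toy |>
  summarize(across(where(is.numeric), mean))

# Keep only the character columns (id)
chr_cols <- toy |>
  select(where(is.character))
```

Logical columns are not treated as numeric here, so `flag` is left out of `num_means`.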
What about when the function requires additional arguments?
# A tibble: 5 × 4
a b c d
<dbl> <dbl> <dbl> <dbl>
1 1.91 -1.42 -0.615 -0.221
2 -1.12 -1.28 -1.89 0.383
3 NA 0.227 -0.945 -0.564
4 -1.11 1.30 NA 1.56
5 1.83 NA NA 1.02
df_miss |> summarize(across(a:d, median))
# A tibble: 1 × 4
a b c d
<dbl> <dbl> <dbl> <dbl>
1 NA NA NA 0.383
The output is NA for every column that has a missing value, because median() returns NA by default when its input includes one. We need to pass the argument na.rm = TRUE.
Warning: There was 1 warning in `summarize()`.
ℹ In argument: `across(a:d, median, na.rm = TRUE)`.
Caused by warning:
! The `...` argument of `across()` is deprecated as of dplyr 1.1.0.
Supply arguments directly to `.fns` through an anonymous function instead.
# Previously
across(a:b, mean, na.rm = TRUE)
# Now
across(a:b, \(x) mean(x, na.rm = TRUE))
# A tibble: 1 × 4
a b c d
<dbl> <dbl> <dbl> <dbl>
1 0.359 -0.529 -0.945 0.383
You can define an anonymous function “on the fly”. The function is used only inside the call and is not saved as an object in the global environment.
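A minimal sketch of the anonymous-function approach, using a small made-up tibble (`df_na` is hypothetical and is not the `df_miss` data shown earlier):

```r
library(dplyr)

# Hypothetical data with one missing value per column
df_na <- tibble(
  a = c(1, NA, 3),
  b = c(4, 5, NA)
)

# \(x) is shorthand for function(x); it exists only inside this call
meds <- df_na |>
  summarize(across(a:b, \(x) median(x, na.rm = TRUE)))
```

The `\(x)` shorthand needs R 4.1 or later; on older versions `function(x) median(x, na.rm = TRUE)` should behave the same way.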
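To control the names of the output columns, use the .names argument of across():
- Put the whole specification in "", for example "{.fn}_{.col}"
- {.fn} takes the function name
- {.col} takes the column name
- _ separates the function name and the column name
- Customize however you want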
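The naming pattern described above might look like this in practice. The tibble `df_na` and the function labels `median` and `n_miss` are made up for illustration; `.names`, `{.fn}`, and `{.col}` are the parts supplied by `across()`.

```r
library(dplyr)

# Hypothetical data with missing values
df_na <- tibble(a = c(1, NA, 3), b = c(4, 5, NA))

out <- df_na |>
  summarize(
    across(
      a:b,
      # A named list: the names become {.fn}
      list(
        median = \(x) median(x, na.rm = TRUE),
        n_miss = \(x) sum(is.na(x))
      ),
      # Function name first, then the column name, joined by "_"
      .names = "{.fn}_{.col}"
    )
  )

names(out)
```

Swapping the order to `"{.col}_{.fn}"` would give names such as `a_median` instead of `median_a`.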
# A tibble: 305 × 7
date river salinity param site datetime ph
<chr> <chr> <chr> <chr> <chr> <chr> <dbl>
1 2024-05-21 YOR BR PH A 5/21/24 9:50 8.05
2 2024-05-21 YOR BR PH A 5/21/24 9:55 8.06
3 2024-05-21 YOR BR PH A 5/21/24 10:00 8.01
4 2024-05-21 YOR BR PH A 5/21/24 10:05 8.05
5 2024-05-21 YOR BR PH A 5/21/24 10:10 8.03
6 2024-05-21 YOR BR PH A 5/21/24 10:15 8.05
7 2024-05-21 YOR BR PH A 5/21/24 10:20 8.05
8 2024-05-21 YOR BR PH A 5/21/24 10:25 8.03
9 2024-05-21 YOR BR PH A 5/21/24 10:30 8.03
10 2024-05-21 YOR BR PH A 5/21/24 10:35 8.03
# ℹ 295 more rows
You can use map() (and the family of other map() functions) to perform almost any operation across multiple elements of a vector or list. More to come.
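As a small preview, here is a sketch of `map()` and one of its typed variants from purrr, applied to a made-up list (`vals` is hypothetical):

```r
library(purrr)

# Hypothetical input: a named list of numeric vectors
vals <- list(a = c(1, 2, 3), b = c(10, NA, 30))

# map() always returns a list, one element per input element
means_list <- map(vals, \(x) mean(x, na.rm = TRUE))

# map_dbl() returns a named double vector instead
means_dbl <- map_dbl(vals, \(x) mean(x, na.rm = TRUE))
```

The same anonymous-function style used with `across()` carries over; the main difference is that `map()` iterates over the elements of a list or vector rather than the columns picked out inside a dplyr verb.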