dplyr Aggregation

Remember to load surveys.csv data into surveys

group_by(surveys, species_id)

Returns a different looking kind of data.frame
- Printing it shows source, grouping, and data type information
Store the grouped data frame in a variable to use in the next step

surveys_by_species <- group_by(surveys, species_id)

After grouping a data frame use summarize() to calculate values for each group.
Count the number of rows for each group (individuals in each species).
summarize() arguments
- Table to work on, which needs to be a grouped table
- One additional argument for each calculation we want to do for each group
  - New column name to store the calculated value
  - =
  - Calculation that we want to perform for each group
  - We’ll use n(), a special function that counts the rows in each group

summarize(surveys_by_species, abundance = n())
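A minimal sketch of the group-then-count pattern on a small made-up table. The toy_surveys data frame and its values are invented for illustration and only stand in for the real surveys data.

```r
library(dplyr)

# Hypothetical toy data standing in for surveys.csv
toy_surveys <- data.frame(
  species_id = c("DM", "DM", "DO", "PP", "PP", "PP"),
  plot_id    = c(1, 2, 1, 1, 1, 2)
)

# Group by species, then count the rows in each group
toy_by_species <- group_by(toy_surveys, species_id)
toy_counts <- summarize(toy_by_species, abundance = n())
print(toy_counts)
```

The result has one row per species, with abundance holding the number of rows that fell into that group.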

surveys_by_species_plot <- group_by(surveys, species_id, plot_id)
species_plot_counts <- summarize(surveys_by_species_plot, abundance = n())
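Grouping by more than one column makes one group per unique combination of values. A small sketch, again using an invented toy_surveys table rather than the real data:

```r
library(dplyr)

# Hypothetical toy data standing in for surveys.csv
toy_surveys <- data.frame(
  species_id = c("DM", "DM", "DO", "PP", "PP", "PP"),
  plot_id    = c(1, 2, 1, 1, 1, 2)
)

# One group per unique species_id / plot_id combination
toy_by_species_plot <- group_by(toy_surveys, species_id, plot_id)
toy_plot_counts <- summarize(toy_by_species_plot, abundance = n())
print(toy_plot_counts)
```

Here PP appears twice in plot 1, so that combination gets an abundance of 2 while every other combination gets 1.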

species_weight <- summarize(surveys_by_species, avg_weight = mean(weight))

Open table
Why did we get NA?
- mean(weight) returns NA when weight has missing values (NA)
Can fix using mean(weight, na.rm = TRUE)

species_weight <- summarize(surveys_by_species,
                            avg_weight = mean(weight, na.rm = TRUE))

Species with no recorded weights still get NaN, so drop them with filter()
filter(species_weight, !is.na(avg_weight))
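The missing-value behavior can be checked on a small made-up table. This is only a sketch: toy_weights and its values are invented, with one species that has a single missing weight and one that has no weights at all.

```r
library(dplyr)

# Hypothetical toy data: DO has one missing weight, PP has no weights at all
toy_weights <- data.frame(
  species_id = c("DM", "DM", "DO", "DO", "PP"),
  weight     = c(40, 44, 50, NA, NA)
)
toy_by_species <- group_by(toy_weights, species_id)

# Without na.rm, any group containing an NA gets NA
toy_avg_na <- summarize(toy_by_species, avg_weight = mean(weight))

# With na.rm = TRUE, DO is averaged over its non-missing value,
# but PP has nothing left to average and becomes NaN
toy_avg <- summarize(toy_by_species, avg_weight = mean(weight, na.rm = TRUE))

# is.na() is TRUE for NaN as well as NA, so this drops PP
toy_avg_clean <- filter(toy_avg, !is.na(avg_weight))
print(toy_avg_clean)
```

Only DM and DO remain after filtering, because PP had no non-missing weights to average.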

Do Shrub Volume Aggregation.
