Manipulate Query (dplyr Databases)
This is a follow-up to Source and Query.
Dr. Undómiel agrees with you that the difference in male and female D. spectabilis hind foot length and weight seems pretty small, but wants to make a more detailed comparison. She wants you to find the male and female hind foot length and weight for all species of rodent on all of the plots (not just the controls) and quantitatively define the size differences among species.
-
Produce a data frame with `species_id`, `sex`, `avg_hindfoot_length`, and `avg_weight` for each species. Your data frame should have two rows for each species, one row for each sex.
You can solve this problem with `dplyr` in a variety of ways, including writing a query or using data manipulation verbs to group and select the data. You could also decide to use `for` loops or `apply` statements. Take whichever approach you like best.
-
Write a function that determines whether the absolute difference in average male and female size is no greater than the standard deviation of sizes for all individuals (`abs(male - female) <= stdev`).
-
Manipulate the data so that you have a local data frame that has average male and female `hindfoot_length` and `weight` and the standard deviations in a single row for each species.
-
Use `transmute()` and an `ifelse()` with your function to take each species' `hindfoot_length` and `weight` from your local data frame and make a new data frame that labels the results of your simple calculation as `"SAME"` or `"DIFF"`.
You may find that you get an `Error: non-numeric argument to binary operator`. The missing size data causes `mean()` to return results as a `character`. Remove the missing data from your query or re-class the results with `as.numeric()` to make your calculation.