R - Formula Objects
Formulae convey a relationship among a set of variables. We can define a formula without having any data loaded.
    ~ x      # formula that defines a single independent variable, "x"; pretty useless on its own
    y ~ x    # one dependent variable; translates to "y depends on x"
The ~ operator creates a formula object. Different libraries use formulas in different ways, but the original intent was to let you specify which variables the left side depends on.
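To make the idea of a "formula object" concrete, the following minimal sketch inspects one with base R functions only. The names f and g are placeholders chosen for illustration, and no data set is assumed to exist.

```r
# No data set is loaded; `y` and `x` are just symbols at this point.
f <- y ~ x

class(f)        # "formula"
all.vars(f)     # "y" "x"

# A two-sided formula behaves like a call with three components.
f[[1]]          # the `~` operator itself
f[[2]]          # left-hand side:  y
f[[3]]          # right-hand side: x

# A one-sided formula has no left-hand side, so it has only two components.
g <- ~ x
length(g)       # 2
```

Because the formula only stores symbols, nothing is looked up until a modeling function evaluates it against some data.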
    # left of ~ is the dependent variable, the "outcome" or "result"
    # right of ~ are the independent/predictor/covariate variables
    myFormula <- Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width
    myFormula
    # Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width
    # you would read this as "Species depends on Sepal.Length, Sepal.Width..."

    allFormula <- Species ~ .
    allFormula
    # Species ~ .
    # . in formula translates to "all variables not yet used"
    # you would read this as "Species depends on all the other variables."
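The dot only acquires a meaning once a data frame is supplied. As a small sketch, assuming the built-in iris data set (which the variable names above suggest), terms() can be used to see what the dot expands to. The variable name expanded is introduced here for illustration.

```r
# iris ships with base R (package datasets), so nothing needs installing.
allFormula <- Species ~ .

# terms() resolves "." against the columns of the supplied data frame.
expanded <- terms(allFormula, data = iris)
attr(expanded, "term.labels")
# "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"

# The explicit formula names the same predictors.
myFormula <- Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width
identical(attr(terms(myFormula), "term.labels"),
          attr(expanded, "term.labels"))   # TRUE
```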
- An expression of the form y ~ model is interpreted as a specification that the response y is modeled by a predictor specified symbolically by model.
- The + operator is used to separate terms in a model.
- The : operator is used to separate variable and factor names in those terms.
- The * operator denotes factor crossing: a*b is interpreted as a + b + a:b.
- The ^ operator indicates crossing to the specified degree: (a+b+c)^2 is identical to (a+b+c)*(a+b+c), which expands to the main effects of a, b, and c plus their second-order interactions.
- The %in% operator indicates that the terms on its left are nested within those on the right: a + b %in% a expands to the formula a + a:b.
- The - operator removes the specified terms: (a+b+c)^2 - a:b is identical to a + b + c + b:c + a:c.
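The expansions described in the list can be checked directly, since terms() works on symbols alone and needs no data. The sketch below uses a hypothetical helper, expand_terms, which is not part of base R and is defined here only for illustration. The variables y, a, b, and c are placeholders.

```r
# Hypothetical helper (not part of base R): list the expanded terms of a formula.
expand_terms <- function(f) attr(terms(f), "term.labels")

# Crossing: main effects plus their interaction.
expand_terms(y ~ a * b)                # "a" "b" "a:b"

# Crossing to degree 2: main effects plus all second-order interactions.
expand_terms(y ~ (a + b + c)^2)        # "a" "b" "c" "a:b" "a:c" "b:c"

# Removing a term with "-".
expand_terms(y ~ (a + b + c)^2 - a:b)  # "a" "b" "c" "a:c" "b:c"
```

terms() lists main effects first and interactions afterwards, so the order can differ from the way a formula was typed even though the set of terms is the same.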