# model.frame {stats}

### Description

`model.frame` (a generic function) and its methods return a `data.frame` with the variables needed to use `formula` and any `...` arguments.

### Usage

    model.frame(formula, ...)

    ## Default S3 method:
    model.frame(formula, data = NULL,
                subset = NULL, na.action = na.fail,
                drop.unused.levels = FALSE, xlev = NULL, ...)

    ## S3 method for class 'aovlist'
    model.frame(formula, data = NULL, ...)

    ## S3 method for class 'glm'
    model.frame(formula, ...)

    ## S3 method for class 'lm'
    model.frame(formula, ...)

    get_all_vars(formula, data, ...)

### Arguments

- `formula`: a model `formula` or `terms` object or an R object.
- `data`: a data.frame, list or environment (or object coercible by `as.data.frame` to a data.frame), containing the variables in `formula`. Neither a matrix nor an array will be accepted.
- `subset`: a specification of the rows to be used: defaults to all rows. This can be any valid indexing vector (see `[.data.frame`) for the rows of `data` or, if that is not supplied, a data frame made up of the variables used in `formula`.
- `na.action`: how `NA`s are treated. The default is first, any `na.action` attribute of `data`; second, a `na.action` setting of `options`; and third, `na.fail` if that is unset. The ‘factory-fresh’ default is `na.omit`. Another possible value is `NULL`.
- `drop.unused.levels`: should factors have unused levels dropped? Defaults to `FALSE`.
- `xlev`: a named list of character vectors giving the full set of levels to be assumed for each factor.
- `...`: further arguments such as `data`, `na.action`, `subset`. Any additional arguments such as `offset` and `weights` which reach the default method are used to create further columns in the model frame, with parenthesised names such as `"(offset)"`.
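
As a small illustration of the `...` argument, the sketch below passes `weights` and `offset` through to the default method and inspects the resulting column names. The vector `w` is a made-up set of weights used only for this example.

```r
# Hypothetical weights, one per row of the built-in 'cars' data
w <- rep(1:2, length.out = nrow(cars))

# 'weights' and 'offset' reach the default method via '...'
mf <- model.frame(dist ~ speed, data = cars,
                  weights = w, offset = log(speed))

names(mf)
# "dist" "speed" "(weights)" "(offset)"
```

The extra columns are looked up like any other variable: `speed` is found in `data`, while `w` is found in the environment of the formula.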

### Details

Exactly what happens depends on the class and attributes of the object `formula`. If this is an object of fitted-model class such as `"lm"`, the method will either return the saved model frame used when fitting the model (if any, often selected by argument `model = TRUE`) or pass the call used when fitting on to the default method. The default method itself can cope with rather standard model objects such as those of class `"lqs"` from package MASS if no other arguments are supplied.
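
A minimal sketch of the two paths for a fitted `"lm"` object, using the built-in `cars` data. The object names (`fit`, `fit_nomodel` and so on) are illustrative only.

```r
# lm() keeps the model frame by default (model = TRUE)
fit <- lm(dist ~ speed, data = cars)
mf_saved <- model.frame(fit)
identical(mf_saved, fit$model)   # the saved frame is returned as-is

# With model = FALSE nothing is saved, so the frame is rebuilt
# from the original call via the default method
fit_nomodel <- lm(dist ~ speed, data = cars, model = FALSE)
is.null(fit_nomodel$model)
mf_rebuilt <- model.frame(fit_nomodel)
dim(mf_rebuilt)
```

Rebuilding relies on the data named in the original call still being available, so the second path can fail or give different results if `cars` (here) has been modified or removed in the meantime.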

The rest of this section applies only to the default method.

If either `formula` or `data` is already a model frame (a data frame with a `"terms"` attribute) and the other is missing, the model frame is returned. Unless `formula` is a terms object, `as.formula` and then `terms` is called on it. (If you wish to use the `keep.order` argument of `terms.formula`, pass a terms object rather than a formula.)
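
The following sketch shows both points on a small made-up data frame `d`: passing a terms object built with `keep.order = TRUE`, and passing an existing model frame back in.

```r
# Illustrative data; the names y, a, b are arbitrary
d <- data.frame(y = 1:4, a = c(1, 2, 1, 2), b = c(3, 3, 4, 4))

# A plain formula: terms are reordered by degree
mf_formula <- model.frame(y ~ a:b + a, data = d)
attr(attr(mf_formula, "terms"), "term.labels")

# A terms object created with keep.order = TRUE keeps the written order
tt <- terms(y ~ a:b + a, keep.order = TRUE)
mf_terms <- model.frame(tt, data = d)
attr(attr(mf_terms, "terms"), "term.labels")

# An existing model frame (with 'data' missing) is returned unchanged
identical(model.frame(mf_terms), mf_terms)
```

Only the `"terms"` attribute differs between the two frames; the columns are the same variables in both cases.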

Row names for the model frame are taken from the `data` argument if present, then from the names of the response in the formula (or rownames if it is a matrix), if there is one.
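
A short sketch of the two sources of row names, with made-up data. In the second call no `data` is supplied, so the variables are found in the environment of the formula and the names of the response are used.

```r
# Case 1: row names come from 'data'
d <- data.frame(y = c(1, 2, 3), x = c(2, 4, 7),
                row.names = c("r1", "r2", "r3"))
rn_data <- rownames(model.frame(y ~ x, data = d))

# Case 2: no 'data'; row names come from the names of the response
resp <- c(first = 1, second = 2, third = 3)
pred <- c(2, 4, 7)
rn_resp <- rownames(model.frame(resp ~ pred))

rn_data
rn_resp
```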

All the variables in `formula`, `subset` and in `...` are looked for first in `data` and then in the environment of `formula` (see the help for `formula()` for further details) and collected into a data frame. Then the `subset` expression is evaluated, and it is used as a row index to the data frame. Then the `na.action` function is applied to the data frame (and may well add attributes). The levels of any factors in the data frame are adjusted according to the `drop.unused.levels` and `xlev` arguments: if `xlev` specifies a factor and a character variable is found, it is converted to a factor (as from R 2.10.0).
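
The sequence described above can be sketched on toy data. The data frames `d` and `d2` and their column names are invented for the example; the two calls are kept separate so that the effect of `drop.unused.levels` and of `xlev` can be seen independently.

```r
d <- data.frame(
  y = c(1.2, 2.3, NA, 4.1, 5.0, 6.3),
  g = factor(c("a", "a", "b", "b", "c", "c"))
)

# subset is evaluated in 'data' and applied first (rows 1-4 remain),
# then na.omit drops the row with a missing response (row 3),
# then the unused level "c" is dropped from g
mf <- model.frame(y ~ g, data = d,
                  subset = g != "c",
                  na.action = na.omit,
                  drop.unused.levels = TRUE)
rownames(mf)
levels(mf$g)

# xlev: a character variable named in 'xlev' becomes a factor
# with exactly the levels given, in the order given
d2 <- data.frame(y = c(1, 2, 3), s = c("lo", "hi", "lo"),
                 stringsAsFactors = FALSE)
mf2 <- model.frame(y ~ s, data = d2,
                   xlev = list(s = c("lo", "hi", "mid")))
is.factor(mf2$s)
levels(mf2$s)
```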

Unless `na.action = NULL`, time-series attributes will be removed from the variables found (since they will be wrong if `NA`s are removed).
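
A minimal sketch of this behaviour with two short, made-up annual series. `na.action = na.omit` is passed explicitly so the result does not depend on the current `options("na.action")` setting.

```r
# Two illustrative annual series starting in 2001
y <- ts(c(3, 5, 4, 6, 8), start = 2001)
x <- ts(c(1, 2, 3, 4, 5), start = 2001)

# With na.omit the columns come back as plain numeric vectors
mf_omit <- model.frame(y ~ x, na.action = na.omit)
is.ts(mf_omit$y)

# With na.action = NULL the time-series attributes are kept
mf_null <- model.frame(y ~ x, na.action = NULL)
is.ts(mf_null$y)
tsp(mf_null$y)
```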

Note that *all* the variables in the formula are included in the data frame, even those preceded by `-`.
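
For instance, in the sketch below (toy data frame `d`) the variable `b` is removed from the model terms by `-` but still appears as a column of the model frame.

```r
d <- data.frame(y = 1:3, a = 4:6, b = 7:9)

mf <- model.frame(y ~ a - b, data = d)

names(mf)
# "y" "a" "b"

# 'b' is not among the model terms, only among the variables
attr(attr(mf, "terms"), "term.labels")
# "a"
```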

Only variables whose type is raw, logical, integer, real, complex or character can be included in a model frame: this includes classed variables such as factors (whose underlying type is integer), but excludes lists.
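
A quick way to see this restriction is to try a list column. The data frame `d` below is made up for the purpose; the error is caught with `tryCatch` so the sketch runs to completion.

```r
d <- data.frame(y = 1:3, g = factor(c("u", "v", "u")))
d$l <- list(1, "a", TRUE)          # a list column

# A factor is fine: its underlying type is integer
mf <- model.frame(y ~ g, data = d)
typeof(mf$g)

# A list column is rejected by the default method
res <- tryCatch(model.frame(y ~ l, data = d),
                error = function(e) e)
inherits(res, "error")
```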

`get_all_vars` returns a `data.frame` containing the variables used in `formula` plus those specified in `...`. Unlike `model.frame.default`, it returns the input variables and not those resulting from function calls in `formula`.
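
The difference is easiest to see with a formula that contains function calls, as in this sketch on the built-in `cars` data.

```r
# model.frame() evaluates the calls in the formula
mf <- model.frame(log(dist) ~ sqrt(speed), data = cars)
names(mf)
# "log(dist)" "sqrt(speed)"

# get_all_vars() returns the untransformed input variables
gv <- get_all_vars(log(dist) ~ sqrt(speed), data = cars)
names(gv)
# "dist" "speed"
```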

### Value

A `data.frame` containing the variables used in `formula` plus those specified in `...`. It will have additional attributes, including `"terms"` for an object of class `"terms"` derived from `formula`, and possibly `"na.action"` giving information on the handling of `NA`s (which will not be present if no special handling was done, e.g. by `na.pass`).
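
The attributes can be inspected directly, as in this sketch with a small made-up data frame `d` containing one missing response.

```r
d <- data.frame(y = c(1, NA, 3), x = c(2, 4, 6))

# na.omit records which rows were dropped
mf_omit <- model.frame(y ~ x, data = d, na.action = na.omit)
inherits(attr(mf_omit, "terms"), "terms")
attr(mf_omit, "na.action")

# na.pass does no special handling, so no "na.action" attribute is set
mf_pass <- model.frame(y ~ x, data = d, na.action = na.pass)
is.null(attr(mf_pass, "na.action"))
nrow(mf_pass)
```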

### References

Chambers, J. M. (1992) *Data for models.* Chapter 3 of *Statistical Models in S* eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

### See Also

`model.matrix` for the ‘design matrix’, `formula` for formulas and `expand.model.frame` for model.frame manipulation.

### Examples

    data.class(model.frame(dist ~ speed, data = cars))

Documentation reproduced from R 3.0.2. License: GPL-2.