# tapply {base}

### Description

Apply a function to each cell of a ragged array, that is, to each (non-empty) group of values given by a unique combination of the levels of certain factors.

### Usage

tapply(X, INDEX, FUN = NULL, ..., simplify = TRUE)

### Arguments

- X
- an atomic object, typically a vector.
- INDEX
- list of one or more factors, each of same length as `X`. The elements are coerced to factors by `as.factor`.
- FUN
- the function to be applied, or `NULL`. In the case of functions like `+`, `%*%`, etc., the function name must be backquoted or quoted. If `FUN` is `NULL`, `tapply` returns a vector which can be used to subscript the multi-way array `tapply` normally produces.
- ...
- optional arguments to `FUN`: see the Note section.
- simplify
- If `FALSE`, `tapply` always returns an array of mode `"list"`. If `TRUE` (the default), then if `FUN` always returns a scalar, `tapply` returns an array with the mode of the scalar.
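The `FUN = NULL` behaviour described above can be sketched as follows. The vectors `x` and `g` are made-up illustration data, not part of the original page.

```r
# Hypothetical data: five values in three groups.
x <- c(10, 20, 30, 40, 50)
g <- c("a", "b", "a", "c", "b")

# With FUN left as NULL, tapply returns one integer per element of x:
# the position of that element's cell in the array tapply would produce.
idx <- tapply(x, g)

# The usual call, giving one sum per group.
sums <- tapply(x, g, sum)

# Subscripting the result by idx maps each group sum back onto x.
group_total <- sums[idx]
```

Here `group_total` has the same length as `x`, so it can be placed alongside the original values, for example to compute each value's share of its group total.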

### Value

If `FUN` is not `NULL`, it is passed to `match.fun`, and hence it can be a function or a symbol or character string naming a function.
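As a small illustration of the `match.fun` lookup, the following sketch passes the same function in two ways. The data are invented for the example.

```r
# Hypothetical data: six values in two groups.
x <- c(3, 1, 4, 1, 5, 9)
g <- factor(c("u", "v", "u", "v", "u", "v"))

by_function <- tapply(x, g, max)    # FUN given as a function
by_string   <- tapply(x, g, "max")  # FUN given as a character string

# Both forms resolve to the same function, so the results agree.
identical(by_function, by_string)
```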

When `FUN` is present, `tapply` calls `FUN` for each cell that has any data in it. If `FUN` returns a single atomic value for each such cell (e.g., functions `mean` or `var`) and when `simplify` is `TRUE`, `tapply` returns a multi-way array containing the values, and `NA` for the empty cells. The array has the same number of dimensions as `INDEX` has components; the number of levels in a dimension is the number of levels (`nlevels()`) in the corresponding component of `INDEX`. Note that if the return value has a class (e.g. an object of class `"Date"`) the class is discarded.
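A minimal sketch of the array shape and the `NA` filling, using invented factors `row_f` and `col_f`. The `"Date"` lines at the end only illustrate the class remark and assume the behaviour is as this page states.

```r
# Hypothetical data: two grouping factors; col_f has an unused level "c3".
value <- c(1, 2, 3, 4)
row_f <- factor(c("r1", "r1", "r2", "r2"))
col_f <- factor(c("c1", "c2", "c1", "c1"), levels = c("c1", "c2", "c3"))

res <- tapply(value, list(row_f, col_f), sum)

dim(res)          # one dimension per INDEX component: 2 rows, 3 columns
res["r2", "c1"]   # 3 + 4 = 7
res["r2", "c2"]   # no data in this cell, so NA
res[, "c3"]       # unused level: the whole column is NA

# Per the text above, a classed scalar result loses its class.
d <- as.Date(c("2020-01-01", "2020-06-01", "2021-03-01"))
latest <- tapply(d, c("a", "a", "b"), max)
inherits(latest, "Date")
```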

Note that contrary to S, `simplify = TRUE` always returns an array, possibly 1-dimensional.
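For instance, with a single grouping vector the result is still an array rather than a plain vector. This is a sketch with made-up data.

```r
# Hypothetical data: four values in two groups.
res <- tapply(c(1, 2, 3, 4), c("a", "b", "a", "b"), mean)

is.array(res)      # TRUE
length(dim(res))   # 1, i.e. a 1-dimensional array
as.vector(res)     # drops the dim and dimnames if a bare vector is needed
```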

If `FUN` does not return a single atomic value, `tapply` returns an array of mode `list` whose components are the values of the individual calls to `FUN`, i.e., the result is a list with a `dim` attribute.
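The list-mode case can be sketched with `range`, which returns two values per cell. The variables `n` and `fac` below are illustrative only.

```r
# Hypothetical data: the integers 1..6 split alternately into two groups.
n <- 6
fac <- factor(rep(1:2, length.out = n))

res <- tapply(1:n, fac, range)

is.list(res)   # TRUE: each cell holds a whole vector
dim(res)       # the list carries a dim attribute, here of length 1
res[[1]]       # range of group 1, i.e. of c(1, 3, 5)
```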

When there is an array answer, its `dimnames` are named by the names of `INDEX` and are based on the levels of the grouping factors (possibly after coercion).

For a list result, the elements corresponding to empty cells are `NULL`.
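Both points, named `dimnames` and `NULL` for empty cells of a list result, can be seen in a short sketch. The names `site` and `year` and the data are assumptions made for illustration.

```r
# Hypothetical data: three scores; no observation for south in 2021.
score <- c(5, 7, 9)
idx <- list(site = factor(c("north", "south", "north")),
            year = factor(c("2020", "2020", "2021")))

# Array answer: dimnames are named after the components of INDEX.
tab <- tapply(score, idx, sum)
names(dimnames(tab))   # "site" "year"

# List answer: range() returns two values, so cells are list elements,
# and the empty (south, 2021) cell is NULL.
lst <- tapply(score, idx, range)
is.null(lst[[2, 2]])
```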

### References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) *The New S Language*. Wadsworth & Brooks/Cole.

### Note

Optional arguments to `FUN` supplied by the `...` argument are not divided into cells. It is therefore inappropriate for `FUN` to expect additional arguments with the same length as `X`.
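One possible workaround, offered as a sketch rather than a recommended idiom, is to apply `tapply` to the positions of `X` and subset every per-element vector inside `FUN`. The weights `w` and the other data are invented.

```r
# Hypothetical data: values with per-element weights, in two groups.
x <- c(1, 2, 3, 4)
w <- c(1, 3, 1, 1)
g <- c("a", "a", "b", "b")

# Passing w through ... would give every cell the full-length w.
# Splitting the indices instead keeps x and w aligned within each cell.
wm <- tapply(seq_along(x), g, function(i) weighted.mean(x[i], w[i]))
wm
```

For group `a` this computes (1 * 1 + 2 * 3) / 4, and for group `b` (3 + 4) / 2.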

### See Also

The convenience functions `by` and `aggregate` (using `tapply`); `apply`, `lapply` with its versions `sapply` and `mapply`.

### Examples

```r
require(stats)
groups <- as.factor(rbinom(32, n = 5, prob = 0.4))
tapply(groups, groups, length) #- is almost the same as
table(groups)

## contingency table from data.frame : array with named dimnames
tapply(warpbreaks$breaks, warpbreaks[,-1], sum)
tapply(warpbreaks$breaks, warpbreaks[, 3, drop = FALSE], sum)

n <- 17; fac <- factor(rep(1:3, length = n), levels = 1:5)
table(fac)
tapply(1:n, fac, sum)
tapply(1:n, fac, sum, simplify = FALSE)
tapply(1:n, fac, range)
tapply(1:n, fac, quantile)

## example of ... argument: find quarterly means
tapply(presidents, cycle(presidents), mean, na.rm = TRUE)

ind <- list(c(1, 2, 2), c("A", "A", "B"))
table(ind)
tapply(1:3, ind) #-> the split vector
tapply(1:3, ind, sum)
```

Documentation reproduced from R 3.0.2. License: GPL-2.