# cov.rob {MASS}

### Description

Compute a multivariate location and scale estimate with a high breakdown point. This can be thought of as estimating the mean and covariance of the `good` part of the data. `cov.mve` and `cov.mcd` are compatibility wrappers.

### Usage

    cov.rob(x, cor = FALSE, quantile.used = floor((n + p + 1)/2),
            method = c("mve", "mcd", "classical"),
            nsamp = "best", seed)

    cov.mve(...)
    cov.mcd(...)

### Arguments

- x
- a matrix or data frame.
- cor
- should the returned result include a correlation matrix?
- quantile.used
- the minimum number of the data points regarded as `good` points.
- method
- the method to be used: minimum volume ellipsoid, minimum covariance determinant or classical product-moment. Using `cov.mve` or `cov.mcd` forces `mve` or `mcd` respectively.
- nsamp
- the number of samples, or one of `"best"`, `"exact"` or `"sample"`. If `"sample"`, the number chosen is `min(5*p, 3000)`, taken from Rousseeuw and Hubert (1997). If `"best"`, exhaustive enumeration is done up to 5000 samples. If `"exact"`, exhaustive enumeration will be attempted however many samples are needed.
- seed
- the seed to be used for random sampling: see `RNGkind`. The current value of `.Random.seed` will be preserved if it is set.
- ...
- arguments to `cov.rob` other than `method`.
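As a small illustration of these arguments, the sketch below computes the default value of `quantile.used` by hand and checks that the `cov.mcd` wrapper gives the same fit as `cov.rob` with `method = "mcd"`. The choice of the built-in `stackloss` data and the value passed to `set.seed` are assumptions made purely for illustration.

```r
library(MASS)

x <- as.matrix(stackloss)   # 21 observations on 4 variables
n <- nrow(x)
p <- ncol(x)

# Default number of points regarded as 'good': floor((n + p + 1)/2)
h <- floor((n + p + 1) / 2)
h

# The wrapper and the explicit method argument request the same fit.
# The RNG is reset before each call so that both calls draw the same
# random subsets; 123 is an arbitrary value.
set.seed(123)
fit_wrapper <- cov.mcd(x, quantile.used = h)
set.seed(123)
fit_direct <- cov.rob(x, quantile.used = h, method = "mcd")

all.equal(fit_wrapper$center, fit_direct$center)
```

Resetting the generator with `set.seed` before each call is used here only to make the two searches comparable.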

### Details

For method `"mve"`, an approximate search is made of a subset of size `quantile.used` with an enclosing ellipsoid of smallest volume; in method `"mcd"` it is the volume of the Gaussian confidence ellipsoid, equivalently the determinant of the classical covariance matrix, that is minimized. The mean of the subset provides a first estimate of the location, and the rescaled covariance matrix a first estimate of scatter. The Mahalanobis distances of all the points from the location estimate for this covariance matrix are calculated, and those points within the 97.5% point under Gaussian assumptions are declared to be `good`. The final estimates are the mean and rescaled covariance of the `good` points.

The rescaling is by the appropriate percentile under Gaussian data; in addition the first covariance matrix has an *ad hoc* finite-sample correction given by Marazzi.
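The reweighting step described above can be sketched in base R as follows. This is a rough illustration under stated assumptions, not the package's code: `reweight_sketch` is a hypothetical helper, the starting subset is picked by hand rather than found by a search, the distances are rescaled by matching one sample quantile to its chi-squared counterpart, and Marazzi's finite-sample correction is omitted.

```r
# Hypothetical helper: one reweighting pass from an initial subset 'idx'
reweight_sketch <- function(x, idx, h = length(idx)) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)

  # First estimates of location and scatter from the chosen subset
  m0 <- colMeans(x[idx, , drop = FALSE])
  s0 <- var(x[idx, , drop = FALSE])

  # Squared Mahalanobis distances of all points from the first estimate
  d2 <- mahalanobis(x, m0, s0)

  # Rescale so that the h/n sample quantile of the distances lines up with
  # the same quantile of a chi-squared(p) variable, then cut at 97.5%
  scale <- quantile(d2, h / n, names = FALSE) / qchisq(h / n, df = p)
  good <- d2 <= qchisq(0.975, df = p) * scale

  list(center = colMeans(x[good, , drop = FALSE]),
       cov = var(x[good, , drop = FALSE]),
       good = which(good))
}

x <- as.matrix(stackloss)
# Assumption for illustration: start from rows 5 to 17 instead of a
# subset found by the MVE or MCD search
res <- reweight_sketch(x, idx = 5:17)
res$center
res$good
```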

For method `"mve"` the search is made over ellipsoids determined by the covariance matrix of `p` of the data points. For method `"mcd"` an additional improvement step suggested by Rousseeuw and van Driessen (1999) is used, in which once a subset of size `quantile.used` is selected, an ellipsoid based on its covariance is tested (as this will have no larger a determinant, and may be smaller).
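One way to picture the improvement step for `"mcd"` is the concentration step of Rousseeuw and van Driessen: from a current subset, compute its mean and covariance, then keep the same number of points with the smallest Mahalanobis distances. The sketch below is an assumption-laden illustration in base R; `c_step_sketch` and `log_det` are hypothetical helpers, and the random starting subset and seed are arbitrary.

```r
# Hypothetical helper: one concentration step from the subset 'idx'
c_step_sketch <- function(x, idx) {
  x <- as.matrix(x)
  h <- length(idx)
  m <- colMeans(x[idx, , drop = FALSE])
  s <- var(x[idx, , drop = FALSE])
  d2 <- mahalanobis(x, m, s)
  # New subset: the h points closest to the current centre
  sort(order(d2)[seq_len(h)])
}

# Hypothetical helper: log-determinant of the subset's covariance matrix
log_det <- function(x, idx) {
  s <- var(x[idx, , drop = FALSE])
  as.numeric(determinant(s, logarithm = TRUE)$modulus)
}

x <- as.matrix(stackloss)
set.seed(1)                          # arbitrary seed for a repeatable start
start <- sort(sample(nrow(x), 13))   # arbitrary starting subset of size 13
improved <- c_step_sketch(x, start)

# The determinant after the step should be no larger than before
c(before = log_det(x, start), after = log_det(x, improved))
```

Repeating the step until the subset stops changing yields a local minimum of the determinant; the sketch applies it once only.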

### Values

A list with components

- center
- the final estimate of location.
- cov
- the final estimate of scatter.
- cor
- (only if `cor = TRUE`) the estimate of the correlation matrix.
- sing
- message giving number of singular samples out of total
- crit
- the value of the criterion on log scale. For MCD this is the determinant, and for MVE it is proportional to the volume.
- best
- the subset used. For MVE the best sample, for MCD the best set of size `quantile.used`.
- n.obs
- total number of observations.
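
A short usage sketch showing how some of these components might be inspected. The `stackloss` data set and the seed value are assumptions chosen for illustration; the robust fit depends on random sampling, so the printed estimates vary with the seed.

```r
library(MASS)

set.seed(123)                        # arbitrary seed for repeatability
fit <- cov.rob(stackloss, cor = TRUE)

fit$center   # final estimate of location
fit$cov      # final estimate of scatter
fit$cor      # present because cor = TRUE
fit$best     # indices of the subset used
fit$n.obs    # 21 for stackloss

# The classical product-moment estimate, for comparison
classical <- cov.rob(stackloss, method = "classical")
classical$center
```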

### References

P. J. Rousseeuw and A. M. Leroy (1987) *Robust Regression and Outlier Detection.* Wiley.

A. Marazzi (1993) *Algorithms, Routines and S Functions for Robust Statistics.* Wadsworth and Brooks/Cole.

P. J. Rousseeuw and B. C. van Zomeren (1990) Unmasking multivariate outliers and leverage points, *Journal of the American Statistical Association*, **85**, 633--639.

P. J. Rousseeuw and K. van Driessen (1999) A fast algorithm for the minimum covariance determinant estimator. *Technometrics* **41**, 212--223.

P. J. Rousseeuw and M. Hubert (1997) Recent developments in PROGRESS. In *L1-Statistical Procedures and Related Topics*, ed. Y. Dodge, IMS Lecture Notes volume **31**, pp. 201--214.

### See Also

Documentation reproduced from package MASS, version 7.3-45. License: GPL-2 | GPL-3