# matchit {MatchIt}

### Description

`matchit` is the main command of the package *MatchIt*, which enables parametric models for causal inference to work better by selecting well-matched subsets of the original treated and control groups. MatchIt implements the suggestions of Ho, Imai, King, and Stuart (2004) for improving parametric statistical models by preprocessing data with nonparametric matching methods. MatchIt implements a wide range of sophisticated matching methods, making it possible to greatly reduce the dependence of causal inferences on hard-to-justify, but commonly made, statistical modeling assumptions. The software also fits easily into existing research practices since, after preprocessing with MatchIt, researchers can use whatever parametric model they would have used without MatchIt, but produce inferences with substantially more robustness and less sensitivity to modeling assumptions. Matched data sets created by MatchIt can be entered easily in Zelig (http://gking.harvard.edu/zelig) for subsequent parametric analyses. Full documentation is available online at http://gking.harvard.edu/matchit, and help for specific commands is available through `help.matchit`.

### Usage

matchit(formula, data, method = "nearest", distance = "logit", distance.options = list(), discard = "none", reestimate = FALSE, ...)

### Arguments

- formula
- This argument takes the usual R formula syntax, `treat ~ x1 + x2`, where `treat` is a binary treatment indicator and `x1` and `x2` are the pre-treatment covariates. Both the treatment indicator and the pre-treatment covariates must be contained in the same data frame, which is specified as `data` (see below). All of the usual R formula syntax works. For example, `x1:x2` represents the first-order interaction term between `x1` and `x2`, and `I(x1^2)` represents the square term of `x1`. See `help(formula)` for details.
- data
- This argument specifies the data frame containing the variables called in `formula`.
- method
- This argument specifies a matching method. Currently, `"exact"` (exact matching), `"full"` (full matching), `"genetic"` (genetic matching), `"nearest"` (nearest neighbor matching), `"optimal"` (optimal matching), and `"subclass"` (subclassification) are available. The default is `"nearest"`. Note that within each of these matching methods, *MatchIt* offers a variety of options.
- distance
- This argument specifies the method used to estimate the distance measure. The default is logistic regression, `"logit"`. A variety of other methods are available.
- distance.options
- This optional argument specifies the optional arguments that are passed to the model for estimating the distance measure. The input to this argument should be a list.
- discard
- This argument specifies whether to discard units that fall outside some measure of support of the distance score before matching, and not allow them to be used at all in the matching procedure. Note that discarding units may change the quantity of interest being estimated. The options are: `"none"` (default), which discards no units before matching; `"both"`, which discards all units (treated and control) that are outside the support of the distance measure; `"control"`, which discards only control units outside the support of the distance measure of the treated units; and `"treat"`, which discards only treated units outside the support of the distance measure of the control units.
- reestimate
- This argument specifies whether the model for the distance measure should be re-estimated after units are discarded. The input must be a logical value. The default is `FALSE`.
- ...
- Additional arguments to be passed to a variety of matching methods.
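To make the `distance` and `discard` arguments more concrete, the following base R sketch mimics what they describe without calling *MatchIt* itself. It is an illustration only: the simulated data, the variable names (`dat`, `pscore`, `drop_control`, and so on), and the reading of "support" as the observed range of the distance measure in each group are all assumptions of this example, not a statement of how the package is implemented.

```r
# Simulated data; every name below is hypothetical.
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$treat <- rbinom(n, 1, plogis(0.5 * dat$x1 - 0.25 * dat$x2))

# distance = "logit": estimate the distance measure by logistic regression
fit <- glm(treat ~ x1 + x2, data = dat, family = binomial(link = "logit"))
pscore <- unname(fitted(fit))

# Observed range of the distance measure within each group
# (assumption: "support" is taken to mean this range)
rng_t <- range(pscore[dat$treat == 1])
rng_c <- range(pscore[dat$treat == 0])

# discard = "control": flag controls outside the treated range
drop_control <- dat$treat == 0 & (pscore < rng_t[1] | pscore > rng_t[2])

# discard = "treat": flag treated units outside the control range
drop_treat <- dat$treat == 1 & (pscore < rng_c[1] | pscore > rng_c[2])

# discard = "both": flag any unit outside the region where the ranges overlap
drop_both <- drop_control | drop_treat

# Number of units each rule would set aside before matching
c(control = sum(drop_control), treat = sum(drop_treat), both = sum(drop_both))
```

With `reestimate = TRUE`, the analogous step in this sketch would be to refit `glm` on `dat[!drop_both, ]` before matching.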

### Details

The matching is done using the `matchit(treat ~ X, ...)` command, where `treat` is the vector of treatment assignments and `X` are the covariates to be used in the matching. There are a number of matching options, detailed below. The full syntax is `matchit(formula, data=NULL, discard=0, exact=FALSE, replace=FALSE, ratio=1, model="logit", reestimate=FALSE, nearest=TRUE, m.order=2, caliper=0, calclosest=FALSE, mahvars=NULL, subclass=0, sub.by="treat", counter=TRUE, full=FALSE, full.options=list(), ...)`.

A summary of the results can be seen graphically using `plot(matchitobject)`, or numerically using `summary(matchitobject)`. `print(matchitobject)` also prints out the output.
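A typical session following this workflow might look like the sketch below. The data are simulated and the names `dat`, `x1`, `x2`, and `m.out` are placeholders chosen for this example; the call relies only on the `formula`, `data`, and `method` arguments listed under Usage. The code is guarded so that it does nothing if the *MatchIt* package is not installed.

```r
if (requireNamespace("MatchIt", quietly = TRUE)) {
  library(MatchIt)

  # Simulated data with roughly one treated unit for every three controls
  set.seed(2)
  n <- 300
  dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
  dat$treat <- rbinom(n, 1, plogis(-1 + 0.5 * dat$x1))

  # Nearest neighbor matching on the estimated distance measure
  m.out <- matchit(treat ~ x1 + x2, data = dat, method = "nearest")

  print(m.out)            # short description of the matching result
  print(summary(m.out))   # numerical summary of the matched data
  # plot(m.out)           # graphical summary; uncomment in an interactive session
}
```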

### Values

- call
- The original `matchit` call.
- formula
- The formula used to specify the model for estimating the distance measure.
- model
- The output of the model used to estimate the distance measure. `summary(m.out$model)` will give the summary of the model, where `m.out` is the output object from `matchit`.
- match.matrix
- An n_1 by `ratio` matrix, with one row per treatment unit. The row names, which can be obtained through `row.names(match.matrix)`, represent the names of the treatment units, which come from the data frame specified in `data`. Each column stores the name(s) of the control unit(s) matched to the treatment unit of that row. For example, when the `ratio` input for nearest neighbor or optimal matching is specified as 3, the three columns of `match.matrix` represent the three control units matched to one treatment unit. `NA` indicates that the treatment unit was not matched.
- discarded
- A vector of length n that displays whether the units were ineligible for matching due to common support restrictions. It equals `TRUE` if unit i was discarded, and it is set to `FALSE` otherwise.
- distance
- A vector of length n with the estimated distance measure for each unit.
- weights
- A vector of length n that provides the weights assigned to each unit in the matching process. Unmatched units have weights equal to `0`. Matched treated units have weight `1`. Each matched control unit has weight proportional to the number of treatment units to which it was matched, and the sum of the control weights is equal to the number of uniquely matched control units.
- subclass
- The subclass index on an ordinal scale from 1 to the total number of subclasses as specified in `subclass` (or the total number of subclasses from full or exact matching). Unmatched units have `NA`.
- q.cut
- The subclass cut-points that classify the distance measure.
- treat
- The treatment indicator from `data` (the left-hand side of `formula`).
- X
- The covariates used for estimating the distance measure (the right-hand side of `formula`).
- nn
- A basic summary table of matched data (e.g., the number of matched units).
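The relationship between `match.matrix` and the control-unit weights described above can be illustrated with a small base R sketch. The matrix below is a hand-made toy (three treated units, `ratio` of 2, five controls) and is not output from `matchit`; the names `controls`, `n_matches`, and `w_control` are hypothetical. The sketch covers control units only and simply applies the stated rule: weights proportional to the number of treatment units matched, rescaled to sum to the number of uniquely matched controls.

```r
# Toy match.matrix: rows are treated units, columns hold matched control names.
# Control "c1" is matched to two treated units; "t3" was not matched.
match.matrix <- matrix(
  c("c1", "c2",
    "c1", "c3",
    NA,   NA),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("t1", "t2", "t3"), NULL)
)
controls <- paste0("c", 1:5)

# Number of treatment units to which each control was matched
matched <- match.matrix[!is.na(match.matrix)]
n_matches <- table(factor(matched, levels = controls))

# Rescale so the control weights sum to the number of uniquely matched controls
n_unique <- sum(n_matches > 0)
w_control <- as.numeric(n_matches) * n_unique / sum(n_matches)
names(w_control) <- controls
w_control

# Treated units with NA in every column were left unmatched
unmatched_treated <- rownames(match.matrix)[rowSums(!is.na(match.matrix)) == 0]
unmatched_treated
```

On a real `matchit` output object, the corresponding components would be read directly as `m.out$match.matrix` and `m.out$weights`.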

### References

Daniel Ho, Kosuke Imai, Gary King, and Elizabeth Stuart (2007). Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference. Political Analysis 15(3): 199-236. http://gking.harvard.edu/files/abs/matchp-abs.shtml

### See Also

Please use `help.matchit` to access the matchit reference manual. The complete document is available online at http://gking.harvard.edu/matchit.

Documentation reproduced from package MatchIt, version 2.4-21. License: GPL (>= 2)