write.dta {foreign}

Write Files in Stata Binary Format
Package: foreign
Version: 0.8-59

Description

Writes the data frame to file in the Stata binary format. Array variables are written only if drop() can reduce them to a vector.

Frozen: will not support Stata formats after 10 (aka 11).

Usage

write.dta(dataframe, file, version = 7L,
          convert.dates = TRUE, tz = "GMT",
          convert.factors = c("labels", "string", "numeric", "codes"))

Arguments

dataframe
a data frame.
file
character string giving filename.
version
integer: Stata version: 6, 7, 8 and 10 are supported, and 9 is mapped to 8, 11 to 10.
convert.dates
Convert Date and POSIXt objects to Stata dates?
tz
timezone for date conversion
convert.factors
how to handle factors

Details

The major difference between file formats in Stata versions is that version 7.0 and later allow 32-character variable names (versions 5 and 6 were restricted to 8-character names). The abbreviate function is used to trim long variable names to the permitted length. A warning is given if this is needed, and it is an error for the abbreviated names not to be unique.
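The name trimming described above can be observed with a small round trip. This is a sketch rather than a description of the internals: the data frame and its column names are invented for illustration, and the exact abbreviated names are whatever abbreviate produces.

```r
library(foreign)

# Hypothetical data frame whose variable names exceed 8 characters.
df <- data.frame(average_household_income = c(1.5, 2.5),
                 total_population_count = c(10L, 20L))
f <- tempfile(fileext = ".dta")

# version = 6 allows only 8-character names, so the names must be trimmed.
# Any warning raised by write.dta is collected here instead of printed.
seen <- character(0)
withCallingHandlers(
  write.dta(df, f, version = 6L),
  warning = function(w) {
    seen <<- c(seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

short_names <- names(read.dta(f))
print(short_names)  # the trimmed variable names
print(seen)         # the collected warning text, if any
```

Renaming the columns yourself before writing avoids relying on automatic abbreviation and keeps the Stata names predictable.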

The columns in the data frame become variables in the Stata data set. Missing values are handled correctly.

Unless deselected by argument convert.dates, R date and date-time objects (POSIXt classes) are converted into the Stata format. For date-time objects this may lose information -- Stata dates are in days since 1960-1-1. POSIXct objects can be written without conversion but will not be understood as dates by Stata; POSIXlt objects cannot be written without conversion.
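The "days since 1960-1-1" representation can be illustrated with base R arithmetic. This sketch only shows the day count and why the time of day cannot survive in it; it does not reproduce how write.dta itself handles the fractional part of a date-time, and the variable names are chosen for illustration.

```r
# Reference date of the Stata date scale, as stated above.
stata_epoch <- as.Date("1960-01-01")

# A Date becomes a whole number of days since the reference date.
d <- as.Date("2000-01-01")
days <- as.numeric(d - stata_epoch)
print(days)

# A date-time carries a time of day that a day count cannot hold.
# Here the time is dropped explicitly with as.Date() (an assumption
# made for this illustration only).
dt <- as.POSIXct("2000-01-01 18:30:00", tz = "GMT")
dt_days <- as.numeric(as.Date(dt, tz = "GMT") - stata_epoch)
print(dt_days)
```

Both values are the same day count, which is the information loss the paragraph above refers to.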

There are four options for handling factors. The default is to use Stata ‘value labels’ for the factor levels. With convert.factors="string", the factor levels are written as strings (the name of the value label is taken from the "val.labels" attribute if it exists or the variable name otherwise). With convert.factors="numeric" the numeric values of the levels are written, or NA if they cannot be coerced to numeric. Finally, convert.factors="codes" writes the underlying integer codes of the factors. This last used to be the only available method and is provided largely for backwards compatibility.
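A short round trip makes the four options easier to compare. The sketch below uses an invented one-column data frame and a hypothetical helper, round_trip, that writes the factor with a given convert.factors setting and reads it back with read.dta using its defaults.

```r
library(foreign)

# Hypothetical factor whose levels happen to look like numbers.
df <- data.frame(size = factor(c("10", "20", "10", "30")))

# Write with one convert.factors setting and return the column as read back.
round_trip <- function(how) {
  f <- tempfile(fileext = ".dta")
  write.dta(df, f, convert.factors = how)
  read.dta(f)$size
}

options_to_try <- c(labels = "labels", string = "string",
                    numeric = "numeric", codes = "codes")
results <- lapply(options_to_try, round_trip)

# Inspect the type and values that each option produces.
str(results)
```

With levels like these, "numeric" and "codes" give different numbers: the first recovers the level values, the second the positions of the levels.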

If the "label.table" attribute contains value labels with names not already attached to a variable (neither the variable name nor a name from "val.labels"), then these will be written out as well.

If the "datalabel" attribute contains a string, it is written out as the dataset label; otherwise the dataset label is "Written by R.".

If the "expansion.table" attribute exists, expansion fields are written. This attribute should contain a list where each element is a string vector of length three. The first vector element contains the name of a variable or "_dta" (meaning the dataset). The second element contains the characteristic name. The third contains the associated data.
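The shape of this attribute is easier to see in code. The sketch below builds a list with the structure described above and attaches it under the attribute name given on this page; the characteristic names ("source", "units") and their text are invented for illustration.

```r
library(foreign)

df <- swiss

# Each element: c(<variable name or "_dta">, <characteristic name>, <data>).
# The characteristic names and contents here are hypothetical.
attr(df, "expansion.table") <- list(
  c("_dta", "source", "R datasets package"),
  c("Fertility", "units", "common standardized fertility measure")
)

# Check the structure before writing: every element has length three.
element_lengths <- lengths(attr(df, "expansion.table"))
print(element_lengths)

f <- tempfile(fileext = ".dta")
write.dta(df, f)
```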

If the "val.labels" attribute contains a string vector with an entry for each variable, these entries are used as the names of the value labels. Otherwise the variable names are used.

If the "var.labels" attribute contains a string vector with a string label for each variable then this is written as the variable labels. Otherwise the variable names are repeated as variable labels.
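As a sketch of how the "datalabel" and "var.labels" attributes are used, the example below sets both on a small subset of swiss, writes it, and reads it back. The label texts are invented, and the read side assumes that read.dta returns these values under the same attribute names.

```r
library(foreign)

df <- swiss[1:5, 1:3]

# Hypothetical dataset label and one label per variable.
dataset_label <- "Swiss fertility, first five provinces"
variable_labels <- c("Fertility measure",
                     "Pct. males in agriculture",
                     "Pct. top exam marks")
attr(df, "datalabel") <- dataset_label
attr(df, "var.labels") <- variable_labels

f <- tempfile(fileext = ".dta")
write.dta(df, f)
back <- read.dta(f)

# Labels as read back from the file.
print(attr(back, "datalabel"))
print(attr(back, "var.labels"))
```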

For Stata 8 or later use the default version=7 -- the only advantage of Stata 8 format is that it can represent multiple different missing value types, and R doesn't have them. Stata 10/11 allows longer format lists, but R does not make use of them.

Note that the Stata formats are documented to use ASCII strings -- R does not enforce this, but use of non-ASCII character strings will not be portable as the encoding is not recorded. Up to 244 bytes are allowed in character data, and longer strings will be truncated with a warning.
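The truncation of long strings can be checked with a round trip. This is a sketch using an invented 300-character ASCII string; the warning is suppressed here only to keep the output short.

```r
library(foreign)

# Hypothetical column holding one string longer than the 244-byte limit.
df <- data.frame(txt = strrep("a", 300), stringsAsFactors = FALSE)
f <- tempfile(fileext = ".dta")

# A truncation warning is expected for this column.
suppressWarnings(write.dta(df, f))

back <- read.dta(f)
stored_length <- nchar(back$txt)
print(stored_length)  # length of the string after the round trip
```

Checking max(nchar(x, type = "bytes")) on character columns before writing is one way to find values that would be cut.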

Stata uses some large numerical values to represent missing values. This function does not currently check, and hence integers greater than 2147483620 and doubles greater than 8.988e+307 may be misinterpreted by Stata.
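Because the function does not check for these values, a check can be run before writing. The helper below, exceeds_stata_range, is hypothetical and not part of foreign; it uses the two thresholds quoted above and flags each column that contains a value beyond them.

```r
# Flag columns with values that Stata may read as missing.
# Thresholds are the ones quoted in the paragraph above.
exceeds_stata_range <- function(df) {
  vapply(df, function(x) {
    if (is.integer(x)) {
      any(x > 2147483620L, na.rm = TRUE)
    } else if (is.double(x)) {
      any(x > 8.988e+307, na.rm = TRUE)
    } else {
      FALSE  # other column types are not checked here
    }
  }, logical(1))
}

# Invented example: columns a and b each hold one out-of-range value.
df <- data.frame(a = c(1L, 2147483647L),
                 b = c(1, 1e308),
                 c = c(0.5, 2))
flags <- exceeds_stata_range(df)
print(flags)
```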

Values

NULL

References

Stata 6.0 Users Manual, Stata 7.0 Programming manual, Stata online help (version 8 and later, also http://www.stata.com/help.cgi?dta_114 and http://www.stata.com/help.cgi?dta_113) describe the file formats.

Examples

library(foreign)
# Write swiss to a temporary Stata file, then read it back.
write.dta(swiss, swissfile <- tempfile())
read.dta(swissfile)

Author(s)

Thomas Lumley and R-core members: support for value labels by Brian Quistorff.

Documentation reproduced from package foreign, version 0.8-59. License: GPL (>= 2)