hist {graphics}
Description
The generic function hist computes a histogram of the given data values. If plot = TRUE, the resulting object of class "histogram" is plotted by plot.histogram, before it is returned.
Usage
hist(x, ...)
## S3 method for class 'default':
hist(x, breaks = "Sturges",
     freq = NULL, probability = !freq,
     include.lowest = TRUE, right = TRUE,
     density = NULL, angle = 45, col = NULL, border = NULL,
     main = paste("Histogram of" , xname),
     xlim = range(breaks), ylim = NULL,
     xlab = xname, ylab,
     axes = TRUE, plot = TRUE, labels = FALSE,
     nclass = NULL, warn.unused = TRUE, ...)
Arguments
- x
- a vector of values for which the histogram is desired.
- breaks
- one of:
- a vector giving the breakpoints between histogram cells,
- a function to compute the vector of breakpoints,
- a single number giving the number of cells for the histogram,
- a character string naming an algorithm to compute the number of cells (see ‘Details’),
- a function to compute the number of cells.
In the last three cases the number is a suggestion only. If breaks is a function, the x vector is supplied to it as the only argument.
- freq
- logical; if TRUE, the histogram graphic is a representation of frequencies, the counts component of the result; if FALSE, probability densities, component density, are plotted (so that the histogram has a total area of one). Defaults to TRUE if and only if breaks are equidistant (and probability is not specified).
- probability
- an alias for !freq, for S compatibility.
- include.lowest
- logical; if TRUE, an x[i] equal to the breaks value will be included in the first (or last, for right = FALSE) bar. This will be ignored (with a warning) unless breaks is a vector.
- right
- logical; if TRUE, the histogram cells are right-closed (left open) intervals.
- density
- the density of shading lines, in lines per inch. The default value of NULL means that no shading lines are drawn. Non-positive values of density also inhibit the drawing of shading lines.
- angle
- the slope of shading lines, given as an angle in degrees (counter-clockwise).
- col
- a colour to be used to fill the bars. The default of NULL yields unfilled bars.
- border
- the color of the border around the bars. The default is to use the standard foreground color.
- main, xlab, ylab
- these arguments to title have useful defaults here.
- xlim, ylim
- the range of x and y values with sensible defaults. Note that xlim is not used to define the histogram (breaks), but only for plotting (when plot = TRUE).
- axes
- logical. If TRUE (default), axes are drawn if the plot is drawn.
- plot
- logical. If TRUE (default), a histogram is plotted, otherwise a list of breaks and counts is returned. In the latter case, a warning is issued if (typically graphical) arguments are specified that only apply to the plot = TRUE case.
- labels
- logical or character. Additionally draw labels on top of bars, if not FALSE; see plot.histogram.
- nclass
- numeric (integer). For S(-PLUS) compatibility only, nclass is equivalent to breaks for a scalar or character argument.
- warn.unused
- logical. If plot = FALSE and warn.unused = TRUE, a warning will be issued when graphical parameters are passed to hist.default().
- ...
- further arguments and graphical parameters passed to plot.histogram and thence to title and axis (if plot = TRUE).
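As a minimal sketch of the four forms of breaks described above, the following uses the islands dataset with plot = FALSE, so that only the "histogram" object is returned. The helper my_breaks is hypothetical and chosen for illustration only.

```r
## 'breaks' supplied four ways; plot = FALSE returns the object without plotting
x <- islands

h_vec <- hist(x, breaks = c(12, 20, 36, 80, 200, 1000, 17000), plot = FALSE)
h_num <- hist(x, breaks = 12, plot = FALSE)        # a suggestion only
h_chr <- hist(x, breaks = "Scott", plot = FALSE)   # a named algorithm

## my_breaks is a hypothetical helper; it receives x as its only argument
## and returns the actual breakpoints
my_breaks <- function(x) seq(min(x), max(x), length.out = 6)
h_fun <- hist(x, breaks = my_breaks, plot = FALSE)

## number of cells actually used in each case
sapply(list(vector = h_vec, number = h_num, name = h_chr, fun = h_fun),
       function(h) length(h$counts))
```

Only the vector and the breakpoint-returning function fix the cells exactly; for the single number and the algorithm name the number of cells may differ from the one requested.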
Details
The definition of histogram differs by source (with country-specific biases). R's default with equi-spaced breaks (also the default) is to plot the counts in the cells defined by breaks. Thus the height of a rectangle is proportional to the number of points falling into the cell, as is the area provided the breaks are equally-spaced.
The default with non-equi-spaced breaks is to give a plot of area one, in which the area of the rectangles is the fraction of the data points falling in the cells.
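The two defaults can be checked on the returned object. This is a sketch under the assumption that the islands dataset is available; the unequal breaks are the ones used in the Examples.

```r
## Equi-spaced (default) breaks: counts would be plotted by default
h_eq <- hist(islands, plot = FALSE)
h_eq$equidist

## Non-equi-spaced breaks: densities would be plotted by default,
## and the rectangles have total area one
h_un <- hist(islands, breaks = c(12, 20, 36, 80, 200, 1000, 17000),
             plot = FALSE)
h_un$equidist
area <- sum(h_un$density * diff(h_un$breaks))
all.equal(area, 1)
```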
If right = TRUE (default), the histogram cells are intervals of the form (a, b], i.e., they include their right-hand endpoint, but not their left one, with the exception of the first cell when include.lowest is TRUE.
For right = FALSE, the intervals are of the form [a, b), and include.lowest means ‘include highest’.
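A small sketch of how right and include.lowest assign values that lie exactly on a break. The toy vectors x and b are made up for illustration.

```r
x <- c(1, 2, 3)
b <- c(1, 2, 3)

## right = TRUE: cells [1, 2] and (2, 3]; the first cell is closed on the
## left because include.lowest = TRUE, so 1 and 2 fall in the first cell
c_right <- hist(x, breaks = b, right = TRUE, plot = FALSE)$counts

## right = FALSE: cells [1, 2) and [2, 3]; include.lowest now closes the
## last cell on the right, so 2 and 3 fall in the second cell
c_left <- hist(x, breaks = b, right = FALSE, plot = FALSE)$counts

rbind(right_closed = c_right, left_closed = c_left)
```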
A numerical tolerance of 1e-7 times the median bin size is applied when counting entries on the edges of bins. This is not included in the reported breaks nor (as from R 2.11.0) in the calculation of density.
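The effect of the tolerance can be sketched with a value lying just above a break. This assumes ten unit-width cells, so that the median bin size is 1 and the tolerance is 1e-7; the data are made up for illustration.

```r
b <- 0:10
x <- c(0.5, 5 + 1e-9)     # the second value lies just above the break at 5

h <- hist(x, breaks = b, plot = FALSE)

## With right = TRUE the value is counted in (4, 5] rather than (5, 6],
## because it is within the tolerance of the break at 5
h$counts[5:6]

## The reported breaks are the nominal ones, without the fuzz
h$breaks
```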
The default for breaks is "Sturges": see nclass.Sturges. Other names for which algorithms are supplied are "Scott" and "FD" / "Freedman-Diaconis" (with corresponding functions nclass.scott and nclass.FD). Case is ignored and partial matching is used. Alternatively, a function can be supplied which will compute the intended number of breaks or the actual breakpoints as a function of x.
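As a sketch, the supplied algorithms can be called directly to see the suggested number of cells, and the name matching can be checked by comparing the resulting breaks. The islands dataset is used as an assumption.

```r
x <- islands

## the three supplied algorithms, called directly
c(Sturges = nclass.Sturges(x), Scott = nclass.scott(x), FD = nclass.FD(x))

## Sturges' formula is ceiling(log2(n) + 1)
sturges_ok <- nclass.Sturges(x) == ceiling(log2(length(x)) + 1)

## case is ignored and partial matching is used
h1 <- hist(x, breaks = "Freedman-Diaconis", plot = FALSE)
h2 <- hist(x, breaks = "fd", plot = FALSE)
h3 <- hist(x, breaks = "SCOTT", plot = FALSE)
h4 <- hist(x, breaks = "sc", plot = FALSE)
c(identical(h1$breaks, h2$breaks), identical(h3$breaks, h4$breaks))
```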
Value
an object of class "histogram" which is a list with components:
- breaks
- the n+1 cell boundaries (= breaks if that was a vector). These are the nominal breaks, not with the boundary fuzz.
- counts
- n integers; for each cell, the number of x[] inside.
- density
- values f^(x[i]), as estimated density values. If all(diff(breaks) == 1), they are the relative frequencies counts/n and in general satisfy sum_i f^(x[i]) (b[i+1] - b[i]) = 1, where b[i] = breaks[i].
- mids
- the n cell midpoints.
- xname
- a character string with the actual x argument name.
- equidist
- logical, indicating if the distances between breaks are all the same.
Prior to R 3.0.0 there was a component intensities, the same as density, for long-term back compatibility.
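The relations between the components can be verified on a returned object. This is a sketch, assuming the sqrt(islands) data and the unequal breaks from the Examples; n and w are local names introduced here.

```r
h <- hist(sqrt(islands), breaks = c(4*0:5, 10*3:5, 70, 100, 140),
          plot = FALSE)
n <- sum(h$counts)       # number of observations counted
w <- diff(h$breaks)      # cell widths

## density is the relative frequency divided by the cell width
dens_ok <- all.equal(h$density, h$counts / (n * w))

## mids are the midpoints of consecutive breaks
mids_ok <- all.equal(h$mids, h$breaks[-1] - w / 2)

c(density = isTRUE(dens_ok), mids = isTRUE(mids_ok))
names(h)
```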
References
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Springer.
See Also
nclass.Sturges, stem, density, truehist in package MASS.
Typical plots with vertical bars are not histograms. Consider barplot or plot(*, type = "h") for such bar plots.
Examples
op <- par(mfrow = c(2, 2))
hist(islands)
utils::str(hist(islands, col = "gray", labels = TRUE))

hist(sqrt(islands), breaks = 12, col = "lightblue", border = "pink")
##-- For non-equidistant breaks, counts should NOT be graphed unscaled:
r <- hist(sqrt(islands), breaks = c(4*0:5, 10*3:5, 70, 100, 140),
          col = "blue1")
text(r$mids, r$density, r$counts, adj = c(.5, -.5), col = "blue3")
sapply(r[2:3], sum)
sum(r$density * diff(r$breaks)) # == 1
lines(r, lty = 3, border = "purple") # -> lines.histogram(*)
par(op)

require(utils) # for str
str(hist(islands, breaks = 12, plot = FALSE)) #-> 10 (~= 12) breaks
str(hist(islands, breaks = c(12,20,36,80,200,1000,17000), plot = FALSE))

hist(islands, breaks = c(12,20,36,80,200,1000,17000), freq = TRUE,
     main = "WRONG histogram") # and warning

require(stats)
set.seed(14)
x <- rchisq(100, df = 4)

## Comparing data with a model distribution should be done with qqplot()!
qqplot(x, qchisq(ppoints(x), df = 4)); abline(0, 1, col = 2, lty = 2)

## if you really insist on using hist() ... :
hist(x, freq = FALSE, ylim = c(0, 0.2))
curve(dchisq(x, df = 4), col = 2, lty = 2, lwd = 2, add = TRUE)
Documentation reproduced from R 3.0.1. License: GPL-2.
