
Family function for a hypergeometric distribution where either the number of white balls or the total number of white and black balls is unknown.


`N` |
Total number of white and black balls in the urn.
Must be a vector with positive values, and is recycled, if necessary,
to the same length as the response.
One of |

`D` |
Number of white balls in the urn.
Must be a vector with positive values, and is recycled, if necessary,
to the same length as the response.
One of |

`lprob` |
Link function for the probabilities.
See |

`iprob` |
Optional initial value for the probabilities. The default is to choose initial values internally. |

Consider the scenario from `dhyper` in which there are *N=m+n* balls in an urn, of which *m* are white and *n* are black. A simple random sample (i.e., *without* replacement) of *k* balls is taken.
The response here is the sample *proportion* of white balls.
In this document, `N` is *N=m+n* and `D` is *m* (for the number of “defectives”, in quality control terminology, or equivalently, the number of marked individuals).
The parameter to be estimated is the population proportion of white balls, viz. *prob = m/(m+n)*.
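
The relationship between the urn quantities and `prob` can be checked numerically. The following is a minimal sketch in base R only; the values of `m`, `n` and `k` are arbitrary choices for illustration, and `expected_prop` is a name introduced here.

```r
# Urn with m white and n black balls; draw k balls without replacement.
m <- 5; n <- 4; k <- 4   # illustrative values (assumption)
N <- m + n               # total number of balls
D <- m                   # number of white balls ("defectives")
prob <- D / N            # population proportion of white balls

# Exact distribution of the number of white balls in the sample
y <- 0:k
pmf <- dhyper(y, m = m, n = n, k = k)

# The expected sample proportion equals the population proportion
expected_prop <- sum((y / k) * pmf)
all.equal(expected_prop, prob)
```

The sample proportion `y / k` is therefore an unbiased estimate of `prob`, which is why it serves as the response.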

Depending on which of `N` and `D` is supplied, the estimate of the other parameter can be obtained from the equation *prob = m/(m+n)*, or equivalently, `prob = D/N`. However, the log-factorials are computed using `lgamma`, and neither *m* nor *n* is restricted to integer values.
Thus, if an integer *N* is to be estimated, it will be necessary to evaluate the likelihood function at the integer values on either side of the estimate, i.e., at `trunc(Nhat)` and `ceiling(Nhat)`, where `Nhat` is the (real) estimate of *N*.
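
The integer search described above might look like the following sketch. It uses base R's `dhyper` (which expects integer arguments) rather than the family function's own `lgamma`-based likelihood. The helper `loglik_N`, the simulated data, and the crude estimate `prob_hat` are assumptions made for illustration and are not part of the package.

```r
# Hypothetical helper: hypergeometric log-likelihood as a function of N,
# with the number of white balls D treated as known.
loglik_N <- function(N, y, k, D) {
  sum(dhyper(y, m = D, n = N - D, k = k, log = TRUE))
}

set.seed(1)
D <- 5                                      # known number of white balls
k <- rep(4, 100)                            # sample sizes
y <- rhyper(nn = 100, m = D, n = 4, k = k)  # simulated counts; true N is 9

# Crude stand-in for the fitted proportion (assumption: not the vglm() estimate)
prob_hat <- sum(y) / sum(k)
Nhat <- D / prob_hat                        # real-valued estimate from prob = D/N

# Evaluate the likelihood at the integers on either side of Nhat
candidates <- unique(c(trunc(Nhat), ceiling(Nhat)))
ll <- sapply(candidates, loglik_N, y = y, k = k, D = D)
N_int <- candidates[which.max(ll)]          # integer N with the larger log-likelihood
```

In practice `Nhat` would come from the fitted model, e.g. as `D` divided by the estimated proportion. A candidate that is incompatible with the observed counts gets a log-likelihood of `-Inf` and is never selected over a feasible one.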

An object of class `"vglmff"` (see `vglmff-class`).
The object is used by modelling functions such as `vglm`, `vgam`, `rrvglm`, `cqo`, and `cao`.

No checking is done to ensure that certain values are within range,
e.g., *k <= N*.
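
Because the family function does not validate these ranges, it may be worth checking the data before fitting. The following is a minimal sketch; `check_hyperg_ranges` is a hypothetical helper written for this illustration and is not a VGAM function.

```r
# Hypothetical pre-fit check (not part of VGAM): stops with an error if
# the counts, sample sizes, and known urn quantities are inconsistent.
check_hyperg_ranges <- function(y, k, N = NULL, D = NULL) {
  stopifnot(all(k > 0), all(y >= 0), all(y <= k))
  if (!is.null(N)) stopifnot(all(N > 0), all(k <= N))  # sample cannot exceed the urn
  if (!is.null(D)) stopifnot(all(D > 0), all(y <= D))  # cannot draw more white balls than exist
  invisible(TRUE)
}

# Passes silently: every sample size is at most N
check_hyperg_ranges(y = c(2, 3, 1), k = c(4, 4, 4), N = 9)
```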

The response can be in one of three formats: a factor (first level taken as success), a vector of proportions of success, or a 2-column matrix (first column = successes) of counts.
The argument `weights` in the modelling function can also be specified. In particular, for a general vector of proportions, you will need to specify `weights` because the number of trials is needed.

Thomas W. Yee

Forbes, C., Evans, M., Hastings, N. and Peacock, B. (2011).
*Statistical Distributions*,
Hoboken, NJ, USA: John Wiley and Sons, Fourth edition.

```
library(VGAM)

nn <- 100
m <- 5  # Number of white balls in the population
k <- rep(4, length.out = nn)  # Sample sizes
n <- 4  # Number of black balls in the population
y <- rhyper(nn = nn, m = m, n = n, k = k)
yprop <- y / k  # Sample proportions

# N is unknown, D is known. Both models are equivalent:
fit <- vglm(cbind(y, k - y) ~ 1, hyperg(D = m), trace = TRUE, crit = "c")
fit <- vglm(yprop ~ 1, hyperg(D = m), weights = k, trace = TRUE, crit = "c")

# N is known, D is unknown. Both models are equivalent:
fit <- vglm(cbind(y, k - y) ~ 1, hyperg(N = m + n), trace = TRUE, crit = "l")
fit <- vglm(yprop ~ 1, hyperg(N = m + n), weights = k, trace = TRUE, crit = "l")

coef(fit, matrix = TRUE)
Coef(fit)  # Should be approximately equal to the true population proportion
unique(m / (m + n))  # The true population proportion
fit@extra
head(fitted(fit))
summary(fit)
```
