Zero-inflated model

In statistics, a zero-inflated model is a statistical model based on a zero-inflated probability distribution, i.e. a distribution that allows for frequent zero-valued observations.

Zero-inflated Poisson

The first zero-inflated model is the zero-inflated Poisson model, which concerns a random event containing excess zero-count data in unit time.[1] For example, the number of insurance claims within a population for a certain type of risk would be zero-inflated by those people who have not taken out insurance against the risk and thus are unable to claim. The zero-inflated Poisson (ZIP) model employs two components that correspond to two zero-generating processes. The first process is governed by a binary distribution that generates structural zeros. The second process is governed by a Poisson distribution that generates counts, some of which may be zero. The two model components are described as follows:

 \Pr (y_i = 0) = \pi + (1 - \pi) e^{-\lambda}
 \Pr (y_i = h) = (1 - \pi) \frac{\lambda^{h} e^{-\lambda}} {h!},\qquad h \ge 1

where the outcome variable y_i takes any non-negative integer value, \lambda is the expected Poisson count for the ith individual (written here without a subscript), and \pi is the probability of extra zeros.

The mean is  (1-\pi) \lambda and the variance is   \lambda (1-\pi) (1+\lambda \pi) .
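
The mean and variance formulas can be checked numerically against the probability mass function. The following is a minimal Python sketch, not part of the cited sources: the function names zip_pmf and zip_moments, the truncation point kmax, and the parameter values are illustrative assumptions.

```python
import math

def zip_pmf(k, lam, pi):
    """ZIP probability of count k: structural zero with probability pi,
    otherwise a Poisson(lam) draw (which may itself be zero)."""
    # Poisson term computed on the log scale to avoid overflow for large k.
    poisson = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    if k == 0:
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson

def zip_moments(lam, pi, kmax=100):
    """Mean and variance by direct summation of the pmf, truncated at kmax
    (an assumption: the tail beyond kmax is negligible for moderate lam)."""
    mean = sum(k * zip_pmf(k, lam, pi) for k in range(kmax))
    second = sum(k * k * zip_pmf(k, lam, pi) for k in range(kmax))
    return mean, second - mean ** 2

lam, pi = 2.5, 0.3  # illustrative values only
mean, var = zip_moments(lam, pi)
closed_form_mean = (1 - pi) * lam
closed_form_var = lam * (1 - pi) * (1 + lam * pi)
print(mean, closed_form_mean)
print(var, closed_form_var)
```

For these values the summed and closed-form quantities should agree to within floating-point error.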

Estimators of ZIP

The method of moments estimators are given by

 \hat{\lambda}_{mo} = \frac{s^2+m^2-m}{m},

 \hat{\pi}_{mo} = \frac{s^2 - m}{s^2 + m^2 - m},

where m is the sample mean and s^2 is the sample variance.
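
As an illustration, the two moment estimators can be computed directly from a sample. This is a sketch under stated assumptions: the function name zip_method_of_moments and the data are made up, and the sample variance is taken with the n-1 denominator, which the formulas above do not specify.

```python
from statistics import mean, variance

def zip_method_of_moments(counts):
    """Method-of-moments estimates (lambda_hat, pi_hat) for a ZIP sample."""
    m = mean(counts)
    s2 = variance(counts)      # sample variance, n-1 denominator (assumption)
    denom = s2 + m * m - m     # shared term s^2 + m^2 - m
    lam_hat = denom / m
    pi_hat = (s2 - m) / denom
    return lam_hat, pi_hat

counts = [0, 0, 0, 0, 0, 1, 2, 2, 3, 4]  # made-up data for illustration
lam_mo, pi_mo = zip_method_of_moments(counts)
print(lam_mo, pi_mo)
```

By construction the estimates reproduce the sample mean, since \hat{\lambda}_{mo} (1 - \hat{\pi}_{mo}) = m. When s^2 \le m the estimate of \pi is not positive, which suggests the sample shows no excess of zeros relative to a Poisson distribution.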

The maximum likelihood estimator[2] can be found by solving the following equation

 \bar{x}(1- e^{-\hat{\lambda}_{ml}}) = \hat{\lambda}_{ml} \left( 1 - \frac{n_0}{n} \right).

where \bar{x} is the sample mean (the same quantity as m above) and \frac{n_0}{n} is the observed proportion of zeros.

This can be solved by iteration,[3] and the maximum likelihood estimator for \pi is given by

 \hat{\pi}_{ml} = 1 - \frac{\bar{x}}{\hat{\lambda}_{ml}}.
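
One way to carry out the iteration is a fixed-point scheme. Dividing the likelihood equation by 1 - n_0/n gives \hat{\lambda} = c (1 - e^{-\hat{\lambda}}), where c is the mean of the positive counts. The sketch below iterates this map starting from c; the scheme, the function name zip_mle, the tolerance, and the data are illustrative assumptions and not necessarily the procedure of the cited reference. It assumes c > 1, so that a positive root exists.

```python
import math

def zip_mle(counts, tol=1e-12, max_iter=1000):
    """Maximum likelihood estimates (lambda_hat, pi_hat) for a ZIP sample,
    via fixed-point iteration on lam = c * (1 - exp(-lam))."""
    n = len(counts)
    xbar = sum(counts) / n
    n0 = sum(1 for y in counts if y == 0)
    if n0 == n:
        raise ValueError("all observations are zero; lambda is not identified")
    c = xbar / (1 - n0 / n)    # mean of the positive counts
    if c <= 1:
        raise ValueError("mean of positive counts must exceed 1")
    lam = c                    # start above the root; iterates decrease to it
    for _ in range(max_iter):
        new_lam = c * (1 - math.exp(-lam))
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    pi_hat = 1 - xbar / lam
    return lam, pi_hat

counts = [0, 0, 0, 0, 0, 1, 2, 2, 3, 4]  # made-up data for illustration
lam_ml, pi_ml = zip_mle(counts)
print(lam_ml, pi_ml)
```

A general-purpose root finder applied to the original equation would serve equally well; the fixed-point form is used here only because it needs nothing beyond the standard library.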

Related models

In 1994, Greene considered the zero-inflated negative binomial (ZINB) model.[4] Daniel B. Hall adapted Lambert's methodology to an upper-bounded count situation, thereby obtaining a zero-inflated binomial (ZIB) model.[5]

Discrete pseudo compound Poisson model

If count data Y have the feature that the probability of zero is larger than the probability of a nonzero value, namely

 \Pr (Y = 0) > 0.5

then the discrete data Y follow a discrete pseudo compound Poisson distribution.[6]

In fact, let G(z) = \sum\limits_{n = 0}^\infty P(Y = n)z^n be the probability generating function of Y, and write p_i = \Pr (Y = i). If p_0 = \Pr (Y = 0) > 0.5, then for |z| \le 1, \left| G(z) \right| \geqslant p_0 - \sum\limits_{i = 1}^\infty p_i = 2 p_0 - 1 > 0. It then follows from the Wiener–Lévy theorem[7] that G(z) has the form of the probability generating function of a discrete pseudo compound Poisson distribution.

We say that a discrete random variable Y satisfying the probability generating function characterization

 G_Y(z) = \sum\limits_{n = 0}^\infty P(Y = n)z^n  = \exp\left(\sum\limits_{k = 1}^\infty  \alpha_k \lambda (z^k - 1)\right), \quad (|z| \le 1)

has a discrete pseudo compound Poisson distribution with parameters (\lambda_1 ,\lambda_2, \ldots ) = (\alpha_1 \lambda,\alpha_2 \lambda, \ldots ) \in \mathbb{R}^\infty, where \sum\limits_{k = 1}^\infty \alpha_k = 1, \sum\limits_{k = 1}^\infty \left| \alpha_k \right| < \infty, \alpha_k \in \mathbb{R} and \lambda > 0.

When all the \alpha_k are non-negative, it is the discrete compound Poisson distribution (non-Poisson case) with the overdispersion property.
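
The parameters \lambda_k = \alpha_k \lambda are the coefficients of z^k in the power series of \log G_Y(z), so they can be recovered from a probability mass function with p_0 > 0. The sketch below uses the recurrence n p_n = \sum_{k=1}^{n} k \lambda_k p_{n-k}, obtained by differentiating \log G_Y(z); this is a standard power-series technique and is not taken from the cited sources. The function name and the example distribution are illustrative assumptions.

```python
def pseudo_compound_poisson_params(p, kmax):
    """Return [lambda_1, ..., lambda_kmax], the coefficients of z^k in
    log G(z), where G(z) = sum_n p[n] z^n and p[0] > 0."""
    if p[0] <= 0:
        raise ValueError("p[0] must be positive")
    # Pad the pmf with zeros so that p[n] is defined up to n = kmax.
    p = list(p) + [0.0] * max(0, kmax + 1 - len(p))
    c = [0.0] * (kmax + 1)  # c[k] holds lambda_k for k >= 1; c[0] is unused
    for n in range(1, kmax + 1):
        acc = sum(k * c[k] * p[n - k] for k in range(1, n))
        c[n] = (p[n] - acc / n) / p[0]
    return c[1:]

# Illustrative distribution with P(Y = 0) = 0.7 > 0.5 and P(Y = 1) = 0.3.
lams = pseudo_compound_poisson_params([0.7, 0.3], 5)
print(lams)
```

For this example \log G(z) = \log(0.7 + 0.3 z), so \lambda_1 = 3/7 is positive while \lambda_2 = -(3/7)^2/2 is negative. The distribution is therefore pseudo compound Poisson but not compound Poisson, since a compound Poisson distribution requires all \alpha_k to be non-negative.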
