In probability theory, a Poisson process is a stochastic process that counts the number of events and the time points at which these events occur in a given time interval. The time between each pair of consecutive events has an exponential distribution with parameter \(\lambda\) and each of these inter-arrival times is assumed to be independent of other inter-arrival times. The process is named after the French mathematician Siméon Denis Poisson and is a good model of radioactive decay, telephone calls and requests for a particular document on a web server among many other phenomena.
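The definition above can be checked numerically with a minimal sketch: draw independent exponential inter-arrival times, accumulate them into event times, and count how many fall inside a window. The rate, the window length, the seed and the helper name `simulate_arrival_times` are illustrative assumptions, not values from the text.

```python
import random

def simulate_arrival_times(rate, horizon, rng):
    """Return the event times of one simulated Poisson process on [0, horizon]."""
    times = []
    t = rng.expovariate(rate)          # first inter-arrival time
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)     # add the next independent exponential gap
    return times

rng = random.Random(0)                 # fixed seed, chosen arbitrarily
rate, horizon = 4.0, 10.0              # assumed: 4 events per unit time, window of length 10

# Count the events in many independent runs and average the counts.
counts = [len(simulate_arrival_times(rate, horizon, rng)) for _ in range(5000)]
mean_count = sum(counts) / len(counts)
print(mean_count)                      # should be close to rate * horizon
```

The average count should settle near `rate * horizon`, which is the mean of the Poisson distribution governing the number of events in the window.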

The Poisson process is a continuous-time process; the sum of a Bernoulli process can be thought of as its discrete-time counterpart. A Poisson process is a pure-birth process, the simplest example of a birth-death process. It is also a point process on the real half-line.

The Poisson distribution gives the probability of getting \(5\) e-mails (\(n\) Poisson events) in \(1\) day, given an average rate \(\lambda\). Mnemonic: the probability of catching \(n\) fish (“poissons” in French).
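As a worked example, the probability of exactly \(n\) events in an interval of length \(t\) is \(e^{-\lambda t}(\lambda t)^n / n!\). The sketch below evaluates it for the e-mail example; the rate of \(3\) e-mails per day is an assumption made for illustration, since the text does not give one.

```python
import math

def poisson_pmf(n, rate, t=1.0):
    """P(exactly n events in an interval of length t) for a given average rate."""
    mu = rate * t                      # expected number of events in the interval
    return math.exp(-mu) * mu**n / math.factorial(n)

rate_per_day = 3.0                     # assumed average number of e-mails per day
p_five = poisson_pmf(5, rate_per_day, t=1.0)
print(round(p_five, 4))                # 0.1008
```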


Probability of having to wait more than \(1\) hour before the next e-mail arrives (the next Poisson event). Parameter: \(\lambda\), the average rate of events.

The exponential distribution describes the time between events in a Poisson process, i.e. a process in which events occur continuously and independently at a constant average rate. It is the continuous analogue of the geometric distribution (the number of Bernoulli trials needed to get the first success, e.g. heads), and it has the key property of being memoryless.
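Memorylessness means that \(P(T > s + t \mid T > s) = P(T > t)\): having already waited \(s\) tells us nothing about the remaining wait. A small sketch of that identity, using the survival function \(P(T > t) = e^{-\lambda t}\); the rate and the two durations are assumed values.

```python
import math

def exp_survival(t, rate):
    """P(T > t) for an exponential waiting time with the given rate."""
    return math.exp(-rate * t)

rate = 2.0        # assumed: 2 e-mails per hour on average
s, t = 0.5, 1.0   # already waited s hours; ask about waiting t more

p_wait_t = exp_survival(t, rate)
# Conditional probability of waiting t more hours, given s hours have passed.
p_conditional = exp_survival(s + t, rate) / exp_survival(s, rate)
print(p_wait_t, p_conditional)   # the two values agree up to rounding
```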


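Probability of having to wait \(1\) day to get \(5\) e-mails (the time to the \(n\)-th Poisson event). We add \(n\) independent inter-arrival times, each exponential with rate \(\lambda\). The fact that the gamma distribution is the distribution of a sum of independent exponential random variables (a convolution) can be proved by deriving the moment generating function (mgf) of the gamma and comparing it with the mgf of the sum.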
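A sketch of that mgf argument, writing \(X_1, \dots, X_n\) for the independent inter-arrival times and \(s\) for the mgf argument (both symbols are notation introduced here). For a single exponential variable with rate \(\lambda\):

\[
M_{X}(s) = \mathbb{E}\left[e^{sX}\right] = \int_0^\infty e^{sx}\, \lambda e^{-\lambda x}\, dx = \frac{\lambda}{\lambda - s}, \qquad s < \lambda .
\]

Because the \(X_i\) are independent, the mgf of their sum \(T_n = X_1 + \dots + X_n\) is the product of the individual mgfs:

\[
M_{T_n}(s) = \prod_{i=1}^{n} M_{X_i}(s) = \left(\frac{\lambda}{\lambda - s}\right)^{n}, \qquad s < \lambda .
\]

This is the mgf of a gamma distribution with shape \(n\) and rate \(\lambda\), whose density is

\[
f(t) = \frac{\lambda^{n}\, t^{\,n-1}\, e^{-\lambda t}}{(n-1)!}, \qquad t > 0 .
\]

Since the mgf exists on an open interval around \(0\), it determines the distribution, so \(T_n\) is gamma distributed.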

Parameters: \(n\) (also written \(a\), \(K\) or \(\alpha\)), the shape, and \(\lambda\) (also written \(\beta\) or \(1/\theta\)), the rate; its reciprocal \(\theta = 1/\lambda\) is the scale. It’s the continuous analogue of the negative binomial (i.e. a sum of geometric variables): the number of Bernoulli trials needed to reach \(n\) successes (e.g. \(4\) heads).
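For an integer shape \(n\), the \(n\)-th event has happened by time \(t\) exactly when at least \(n\) events fall in \([0, t]\), so \(P(T_n \le t) = 1 - \sum_{k=0}^{n-1} e^{-\lambda t}(\lambda t)^k / k!\). The sketch below compares that closed form with a simulation that adds up exponential inter-arrival times. The rate of \(3\) e-mails per day, the seed and the number of trials are assumptions; the quantity computed is the probability of waiting at most \(1\) day for the fifth e-mail.

```python
import math
import random

def erlang_cdf(t, n, rate):
    """P(T_n <= t): the n-th event of a Poisson process occurs by time t."""
    mu = rate * t
    # Complement of "fewer than n events in [0, t]".
    return 1.0 - sum(math.exp(-mu) * mu**k / math.factorial(k) for k in range(n))

rng = random.Random(1)        # fixed seed, chosen arbitrarily
n, rate, t = 5, 3.0, 1.0      # assumed: 5th e-mail, 3 e-mails per day, 1 day
trials = 20000

# Each trial sums n independent exponential gaps and checks whether the sum is <= t.
hits = sum(
    sum(rng.expovariate(rate) for _ in range(n)) <= t
    for _ in range(trials)
)
empirical = hits / trials
exact = erlang_cdf(t, n, rate)
print(exact, empirical)       # the simulated frequency should be close to the exact value
```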


The multinomial distribution is a generalization of the binomial. In the binomial, there are \(n\) independent trials or experiments, and we count the number of successes. In each trial there are only two possibilities, success or failure: each trial or experiment is a Bernoulli trial.

In a multinomial distribution there are also \(n\) experiments, but the outcome of each experiment is not S or F, but rather one of \(K\) possible categories. For instance, we survey \(n=15\) people, asking them whether they intend to vote Democrat, Republican or Independent, i.e. \(K = 3.\) Knowing the proportion of supporters for each option in the general population, we can calculate the probability of the event \((D=7, R=5, I=3).\)
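The probability of a count vector \((x_1, \dots, x_K)\) with \(x_1 + \dots + x_K = n\) is \(\frac{n!}{x_1! \cdots x_K!}\, p_1^{x_1} \cdots p_K^{x_K}\). A minimal sketch for the survey example follows; the population shares in `probs` are hypothetical, since the text does not state them.

```python
import math

def multinomial_pmf(counts, probs):
    """Probability of observing exactly `counts` in sum(counts) independent trials."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must sum to 1")
    # Multinomial coefficient n! / (x_1! * ... * x_K!), kept as an exact integer.
    coef = math.factorial(sum(counts))
    for c in counts:
        coef //= math.factorial(c)
    p = float(coef)
    for c, q in zip(counts, probs):
        p *= q**c
    return p

probs = (0.45, 0.35, 0.20)    # hypothetical shares for Democrat, Republican, Independent
p_event = multinomial_pmf((7, 5, 3), probs)
print(p_event)
```

Calling `multinomial_pmf((k, n - k), (p, 1 - p))` with two categories gives the binomial probability of \(k\) successes in \(n\) trials.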

If \(K=2\) we are back to \((0,1)\) binary outcomes with \(n\) experiments, which is the definition of the binomial distribution.

If \(n=1\) and \(K = 2,\) we have a Bernoulli experiment.

If \(n=1\) but we have more than \(2\) categories, we are dealing with a categorical distribution. The categorical distribution is the generalization of the Bernoulli distribution for a categorical random variable, i.e. for a discrete variable with more than two possible outcomes, such as the roll of a die, \(K=6\). On the other hand, the categorical distribution is a special case of the multinomial distribution, in that it gives the probabilities of potential outcomes of a single drawing, \((n=1),\) rather than multiple drawings.
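The relationship between the two distributions can be sketched by drawing single categorical outcomes and tallying them: one draw is a categorical sample, and the tally of \(n\) independent draws is one multinomial sample. The fair die, the seed and the helper name `draw_categorical` are illustrative assumptions.

```python
import random
from collections import Counter

def draw_categorical(probs, rng):
    """Return one category index in range(len(probs)), drawn with the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

rng = random.Random(2)                        # fixed seed, chosen arbitrarily
die = [1 / 6] * 6                             # fair die, K = 6 (faces indexed 0..5)

single_roll = draw_categorical(die, rng)      # categorical sample: n = 1

n = 6000
tally = Counter(draw_categorical(die, rng) for _ in range(n))
counts = [tally[face] for face in range(6)]   # one multinomial sample with n = 6000
print(single_roll, counts)
```

Each entry of `counts` should be near \(n/K = 1000\), and the entries always sum to \(n\).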

The parameters specifying the probabilities of each possible outcome are constrained only by the fact that each must be in the range 0 to 1, and all must sum to 1.
