The Poisson distribution is a popular model for the number of times an event occurs in an interval of time or space. A discrete random variable X is said to have a Poisson distribution with parameter μ > 0 if, for k = 0, 1, 2, …, the probability mass function of X is given by:

$$P(X = k) = \frac{\mu^{k} e^{-\mu}}{k!}$$

The Poisson distribution is an appropriate model if the following assumptions hold:

1. k is the number of times an event occurs in an interval, and k can take the values 0, 1, 2, ….
2. The occurrence of one event does not affect the probability that a second event will occur. That is, events occur independently.
3. The average rate at which events occur is constant.
4. Two events cannot occur at exactly the same instant; instead, in each very small sub-interval, exactly one event either occurs or does not occur. Alternatively, the actual probability distribution is binomial and the number of trials is much larger than the number of successes one is asking about.

If these conditions hold, then k is a Poisson random variable, and the distribution of k is a Poisson distribution.

Now let us answer the question that was asked at the beginning of the notebook. Mylie has been averaging 3 hits for every 10 times at bat. What is the probability that she will get exactly 2 hits in her next 5 times at bat? Her expected number of hits in 5 at-bats is μ = 5 × (3/10) = 1.5, and we want P(X = 2). Since the formula is:

$$P(X = k) = \frac{\mu^{k} e^{-\mu}}{k!}$$

we get:

$$P(X = 2) = \frac{1.5^{2} e^{-1.5}}{2!} \approx 0.251$$
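As a rough check of the hand calculation for Mylie's at-bats, the sketch below evaluates the Poisson formula using only the standard library, assuming μ = 5 × 0.3 = 1.5. The helper names `poisson_pmf` and `binomial_pmf` are made up for this illustration, and the binomial comparison is an addition that is not part of the original walkthrough; it is included only because assumption 4 mentions the binomial case.

```python
import math


def poisson_pmf(k: int, mu: float) -> float:
    """P(X = k) for a Poisson variable with mean mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for n independent trials with success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


n_at_bats = 5
hit_rate = 3 / 10
mu = n_at_bats * hit_rate  # expected hits in 5 at-bats

p_poisson = poisson_pmf(2, mu)
p_binomial = binomial_pmf(2, n_at_bats, hit_rate)

print(f"Poisson  P(X=2) = {p_poisson:.4f}")   # 0.2510
print(f"Binomial P(X=2) = {p_binomial:.4f}")  # 0.3087
```

With only 5 at-bats the two values differ noticeably. Holding the mean fixed, the gap shrinks as the number of trials grows, which is the situation the second half of assumption 4 describes.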

Let us use the scipy.stats.poisson.pmf function to further drive home the concept.

In [17]:

```python
from scipy.stats import poisson
import matplotlib.pyplot as plt
```

The probability mass function for poisson is:

poisson.pmf(k) = exp(-mu) * mu**k / k! for k >= 0.

poisson takes mu as its shape parameter (mu is the mean, or expected value, and also the variance).

The probability mass function above is defined in the “standardized” form. To shift the distribution, use the loc parameter. Specifically, poisson.pmf(k, mu, loc) is identically equivalent to poisson.pmf(k - loc, mu).
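The effect of loc can be checked numerically. The sketch below is a minimal illustration, assuming an arbitrary shift of 3 chosen only for this example; it compares the two calls named above and shows that the shifted distribution has no mass below loc.

```python
import numpy as np
from scipy.stats import poisson

mu = 1.5
loc = 3                 # hypothetical shift, chosen only for illustration
k = np.arange(0, 10)

shifted = poisson.pmf(k, mu, loc=loc)   # support now starts at k = loc
manual = poisson.pmf(k - loc, mu)       # same values, shifting k by hand

print(np.allclose(shifted, manual))
print(shifted[:loc])                    # probabilities below the shifted support
```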

In [18]:

```python
# Calculate the first four moments:
mu = 1.5
mean, var, skew, kurt = poisson.stats(mu, moments='mvsk')
print('Mean=%.3f,Variance=%.3f' % (mean, var))
```

Mean=1.500,Variance=1.500
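The cell above prints only the mean and variance. As a small sketch, the remaining two moments can be compared with the standard closed-form values for a Poisson distribution: skewness 1/√μ and excess kurtosis 1/μ.

```python
import math
from scipy.stats import poisson

mu = 1.5
mean, var, skew, kurt = poisson.stats(mu, moments='mvsk')

# Compare each moment with its closed-form value
print(float(mean), float(var))          # both equal mu
print(float(skew), 1 / math.sqrt(mu))   # skewness
print(float(kurt), 1 / mu)              # excess kurtosis
```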

In [19]:

```python
# pmf(k, mu, loc=0): probability mass function.
# Use it to calculate P(X = 2) for mu = 1.5:
p = poisson.pmf(2, 1.5)
p
```

Out[19]:

0.25102143016698353

We got the same answer as the hand calculation above. Let us display the probability mass function (pmf) for 0 <= k < 5:

In [20]:

```python
import numpy as np

fig, ax = plt.subplots(1, 1)
x = np.arange(0, 5)
mu = 1.5

# Plot the pmf as markers with translucent stems
ax.plot(x, poisson.pmf(x, mu), 'bo', ms=8, label='poisson pmf')
ax.vlines(x, 0, poisson.pmf(x, mu), colors='b', lw=5, alpha=0.5)

# Freeze the distribution and overlay the frozen pmf on the same axes
rv = poisson(mu)
ax.vlines(x, 0, rv.pmf(x), colors='k', linestyles='-', lw=1, label='frozen pmf')
ax.legend(loc='best', frameon=False)
plt.show()
```

In [21]:`x`

Out[21]:

array([0, 1, 2, 3, 4])

In [22]:

```python
# Check the accuracy of cdf and ppf:
prob = poisson.cdf(x, mu)
np.allclose(x, poisson.ppf(prob, mu))
```

Out[22]:

True
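The round trip above works because ppf is the inverse of cdf on the support. The following sketch, using the same mu = 1.5, illustrates two related facts: the cdf is the running sum of the pmf, and ppf(q) returns the smallest k whose cdf is at least q. The choice q = 0.5 (the median) is just an example.

```python
import numpy as np
from scipy.stats import poisson

mu = 1.5
k = np.arange(0, 5)

cdf = poisson.cdf(k, mu)                      # P(X <= k)
running_sum = np.cumsum(poisson.pmf(k, mu))   # pmf accumulated from k = 0
print(np.allclose(cdf, running_sum))

# ppf(q) is the smallest k with cdf(k) >= q; q = 0.5 gives the median
median = poisson.ppf(0.5, mu)
below = poisson.cdf(median - 1, mu)
at = poisson.cdf(median, mu)
print(median, below, at)
```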

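In [23]:

```python
# Generate random numbers and plot their relative frequencies.
# sb.distplot is deprecated in recent seaborn releases, so histplot is used here.
import seaborn as sb

r = poisson.rvs(mu, size=1000)
ax = sb.histplot(r, discrete=True, stat='probability', color='green')
ax.set(xlabel='X=No of Outcomes', ylabel='Probability')
```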

Out[23]:

[Text(0, 0.5, 'Probability'), Text(0.5, 0, 'X=No of Outcomes')]
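To see how closely random draws follow the theoretical pmf without relying on a plot, the sketch below tabulates empirical frequencies next to poisson.pmf. The seed and the sample size of 100,000 are assumptions made for reproducibility and are not taken from the original notebook.

```python
import numpy as np
from scipy.stats import poisson

mu = 1.5
rng = np.random.default_rng(0)   # fixed seed: an assumption for reproducibility
samples = poisson.rvs(mu, size=100_000, random_state=rng)

k = np.arange(0, 8)
empirical = np.array([(samples == value).mean() for value in k])
theoretical = poisson.pmf(k, mu)

for value, emp, theo in zip(k, empirical, theoretical):
    print(f"k={value}: empirical={emp:.4f}, pmf={theo:.4f}")

# For a Poisson variable the sample mean and variance should both be near mu
print("sample mean:", samples.mean(), "sample variance:", samples.var())
```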

## Methods

## Examples that violate the Poisson assumptions

## References

1. Haight, Frank A. (1967), Handbook of the Poisson Distribution, New York, NY, USA: John Wiley & Sons, ISBN 978-0-471-33932-8

2. Koehrsen, William (2019-01-20), The Poisson Distribution and Poisson Process Explained, Towards Data Science, retrieved 2019-09-19

3. SciPy stats