One of the shortcomings of the Poisson distribution is that its variance exactly equals its mean. It is common in practice for the variance of count data to be larger than the mean, so it’s natural to look for a distribution like the Poisson but with larger variance. We start with a Poisson random variable X with mean λ, but then we make λ itself random and suppose that λ comes from a gamma(α, β) distribution with shape α and scale β. Then the marginal distribution of X is a negative binomial distribution with parameters r = α and p = 1/(β + 1), where X counts the failures before the rth success and p is the probability of success on each trial.
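As a sanity check, the marginal distribution can be computed numerically and compared with the negative binomial probabilities. The sketch below assumes β is the scale parameter of the gamma distribution and that the negative binomial counts failures before the rth success. The function names, the integration cutoff, and the values α = 2.5, β = 1.5 are illustrative choices, not part of the original argument.

```python
import math

def poisson_pmf(k, lam):
    # P(X = k | lam) for a Poisson with mean lam, computed in log space
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def gamma_pdf(lam, alpha, beta):
    # Gamma density with shape alpha and scale beta (assumed parameterization)
    log_pdf = ((alpha - 1) * math.log(lam) - lam / beta
               - math.lgamma(alpha) - alpha * math.log(beta))
    return math.exp(log_pdf)

def mixture_pmf(k, alpha, beta, upper=100.0, n=20_000):
    # Midpoint-rule approximation to the integral over lam of
    # poisson_pmf(k, lam) * gamma_pdf(lam, alpha, beta).
    # The cutoff `upper` is assumed large enough that the tail is negligible.
    h = upper / n
    total = 0.0
    for i in range(n):
        lam = (i + 0.5) * h
        total += poisson_pmf(k, lam) * gamma_pdf(lam, alpha, beta)
    return h * total

def negbin_pmf(k, r, p):
    # P(X = k): k failures before the r-th success, success probability p.
    # Gamma functions allow non-integer r.
    log_coef = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coef + r * math.log(p) + k * math.log(1 - p))

alpha, beta = 2.5, 1.5  # illustrative values
for k in range(6):
    mixed = mixture_pmf(k, alpha, beta)
    exact = negbin_pmf(k, alpha, 1 / (beta + 1))
    print(k, mixed, exact)
```

For each k, the two printed columns should agree up to the error of the numerical integration.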
The previous post said that the negative binomial is useful because it has more variance than the Poisson. The construction above explains why: X inherits the Poisson variability for any given λ plus the extra variability of λ itself, so its variance exceeds its mean whenever λ is not constant.
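To put numbers on the extra variance, the law of total variance gives Var(X) = E[λ] + Var(λ), while the mean stays at E[X] = E[λ]. The sketch below checks this against the standard negative binomial mean and variance formulas, again assuming β is the gamma scale parameter and that X counts failures before the rth success. The function names and parameter values are illustrative.

```python
def negbin_mean_var(r, p):
    # Mean and variance when X counts failures before the r-th success
    mean = r * (1 - p) / p
    var = r * (1 - p) / p**2
    return mean, var

def gamma_mean_var(alpha, beta):
    # Mean and variance of a gamma with shape alpha and scale beta
    return alpha * beta, alpha * beta**2

alpha, beta = 2.5, 1.5  # illustrative values
nb_mean, nb_var = negbin_mean_var(alpha, 1 / (beta + 1))
lam_mean, lam_var = gamma_mean_var(alpha, beta)

# The mean matches E[lam]; the variance is E[lam] + Var(lam).
print(nb_mean, lam_mean)
print(nb_var, lam_mean + lam_var)
```

The gap between the variance and the mean is Var(λ) = αβ², so the overdispersion vanishes only as the gamma distribution collapses to a point.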