The Binomial Distribution

This mini-lecture gives you three things. First, it covers the Binomial distribution. Second, it introduces you to the incredible resource that is Khan Academy. Third, it gives you a few videos that may help with understanding the Binomial distribution.

Note that the Binomial distribution is an example of a discrete distribution. The Binomial distribution is covered in Section 5.3 in the text (Eighth Edition). The Binomial distribution will be very important in the future, because it is the distribution that underlies any statistical inference about a population proportion. So, you should be familiar with the distribution, if not a master of it.

The Binomial Distribution


A simple coin can be used to illustrate the Binomial distribution. Flip the fair coin n times. The number of heads will have a Binomial(n, p=0.500) distribution.

A Bernoulli experiment is a single trial with two possible outcomes. The probability of a success is p (also known as the “success probability”). The probability of a failure is q = 1 − p.

If we perform a Bernoulli experiment n times and measure only the number of successes, we have a Binomial experiment. Technically, there are four requirements for an experiment to be a Binomial experiment. Those requirements are listed in your text on Page 266. Here is another way of expressing them:

The number of trials, n, is fixed and known.
The outcome of one trial does not depend on the outcome of the others.
Each trial has two possible outcomes, success or failure.
The probability of a success is the same for each trial.

Note that different sources have a different number of requirements. Your textbook lists four requirements. Many others will list a fifth requirement: The random variable is the number of successes.
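To make the requirements concrete, here is a minimal simulation sketch in Python. It is not from the text; the function names `bernoulli_trial` and `binomial_experiment` and the seed are made up for illustration. It fixes n, uses the same p on every trial, keeps the trials independent, and reports only the number of successes.

```python
import random

def bernoulli_trial(p, rng):
    """One trial with two outcomes: 1 (success) with probability p, else 0 (failure)."""
    return 1 if rng.random() < p else 0

def binomial_experiment(n, p, rng):
    """Run a fixed number n of independent trials, each with the same p,
    and return only the number of successes."""
    return sum(bernoulli_trial(p, rng) for _ in range(n))

# Assumption: a fair coin flipped 100 times; the seed is arbitrary and
# is there only so that the run can be repeated.
rng = random.Random(2024)
heads = binomial_experiment(n=100, p=0.5, rng=rng)
print(heads)  # a whole number somewhere between 0 and 100
```

Running it again with a different seed will usually give a different count. That variation from run to run is what the Binomial distribution describes.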

Check the following to see if each is a Binomial experiment. Decide for each before reading the answer on the line beneath it.

I flip a coin 100 times and measure the number of heads flipped.
This is a Binomial experiment. It meets all four requirements.
I pass through five stoplights on the way home and measure the number of times I have to stop.
This is not a Binomial experiment. The stops at the lights are not independent events. If I stop at the first light, the probability of my having to stop at the next is reduced… unless the traffic engineer hates humanity, in which case it is higher. =)
The crime rate in a city of 100,000 people.
This is not a Binomial experiment. The number of trials is not known. A person can commit more than one crime in a given year. If we were measuring the number of people who commit a crime, then it would be a Binomial experiment.
The number of days this week that my business has more than 100 customers.
This is a Binomial experiment. It meets all four requirements.
The number of times it takes for me to flip 100 heads.
This is not a Binomial experiment. We do not know the number of trials (only the number of successes).
The number of stoplights I pass through before having to stop.
This is not a Binomial experiment. We do not know the number of trials (only the number of failures).
The recidivism rate in Oklahoma.
This is a Binomial experiment. The recidivism rate is measured as the number of people released who are returned to jail/prison.
The number of customers I have this week.
This is not a Binomial experiment. The number of trials is unknown… or the number of outcomes per trial is more than two. It depends on how you define “trial.”

Why this is Important

If we know we have a Binomial experiment, we know a lot about it. We know the probability of each possible outcome. We know what value is most likely. We know the average value (mean). Determining these things is a matter of using the correct formula (or the correct technology). The most important formulas are on pages 266 and 272. To practice them, click on the Project Scarlet link to the right.

Short Example

I have a fair coin. I flip that coin 10 times. What is the probability of getting exactly 7 heads in those 10 flips? Note that the description tells us $n = 10$ and $p = 0.500$.

Let us use Excel to answer this question. In a blank cell, type =BINOM.DIST(7,10,0.500,FALSE). Once you hit the Enter key, Excel tells you that the probability of getting exactly 7 heads is 0.1171875.
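The same probability can be checked by hand. This sketch assumes the standard Binomial probability formula, $P[X = k] = \binom{n}{k} p^{k} q^{n-k}$, which is presumably the one your text gives on Page 266 (that page is not reproduced here). With $n = 10$, $k = 7$, and $p = q = 0.5$:

$$
P[X = 7] = \binom{10}{7} (0.5)^{7} (0.5)^{3} = \frac{120}{1024} = 0.1171875
$$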

The Excel formula has four slots. The first is for the number in the probability statement. Here, since we were calculating P[X = 7], this number is 7. The second is n, the number of trials (coin flips). The third is p, the success probability (probability of a head on each flip). The fourth is FALSE if you are calculating an = probability and TRUE if you are calculating an ≤ probability. As this probability statement was P[X = 7], we used FALSE.
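If you do not have Excel handy, the four slots can be mimicked in a few lines of Python. This is only a sketch: `binom_dist` is a hypothetical stand-in written for this illustration, not an Excel or library function, and it assumes the standard Binomial probability formula.

```python
from math import comb

def binom_dist(k, n, p, cumulative):
    """Hypothetical stand-in for the four slots of Excel's BINOM.DIST.

    k          -- the number in the probability statement
    n          -- the number of trials
    p          -- the success probability on each trial
    cumulative -- False for P[X = k], True for P[X <= k]
    """
    def pmf(x):
        # Probability of exactly x successes in n trials
        return comb(n, x) * p**x * (1 - p)**(n - x)

    if cumulative:
        # Add up the probabilities of 0, 1, ..., k successes
        return sum(pmf(x) for x in range(k + 1))
    return pmf(k)

print(binom_dist(7, 10, 0.5, False))  # P[X = 7]  -> 0.1171875
print(binom_dist(2, 10, 0.5, True))   # P[X <= 2] -> 0.0546875
```

The point of the sketch is the fourth slot: TRUE simply adds up the = probabilities from 0 through the number in the first slot.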

Now, for practice, calculate these five probabilities:

What is the probability of getting exactly 3 heads?
P[H = 3] = 0.1171875. The Excel formula is =BINOM.DIST(3,10,0.5,FALSE)
What is the probability of getting exactly 1 head?
P[H = 1] = 0.009765625. The Excel formula is =BINOM.DIST(1,10,0.5,FALSE)
What is the probability of getting two or fewer heads?
P[H ≤ 2] = 0.0546875. The Excel formula is =BINOM.DIST(2,10,0.5,TRUE)
What is the probability of getting 7 or more heads?
P[H ≥ 7] = 1 − P[H ≤ 6] = 0.171875. The Excel formula is =1-BINOM.DIST(6,10,0.5,TRUE)
What is the probability of getting more than 6 heads?
P[H > 6] = 1 − P[H ≤ 6] = 0.171875. The Excel formula is =1-BINOM.DIST(6,10,0.5,TRUE)

Here are some hints. The following are the above written in symbols:

P[H = 3]
P[H = 1]
P[H ≤ 2]
P[H ≥ 7]
P[H > 6] = P[H ≥ 7]

That’s it! The computer did the calculations, as it needed to do.

Remember: Statistics is more about the interpretation than it is about the calculations. Get the computer to do the calculations for you. Spend your brain power on interpreting the results.

In a couple of the above examples, the inequality is either > or ≥. Since Excel's BINOM.DIST calculates only = and ≤ probabilities, we used the Rule of Complements (Page 229) to perform the calculations: P[H ≥ 7] = 1 − P[H ≤ 6]. You will see that rule quite frequently, so make sure you understand it.
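One way to convince yourself that the Rule of Complements works is to compute the same probability two ways. The Python sketch below does that for the coin example; the helper names `pmf` and `cdf` are made up for this illustration, and the standard Binomial probability formula is assumed.

```python
from math import comb

def pmf(k, n, p):
    """P[X = k] for a Binomial(n, p) random variable."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def cdf(k, n, p):
    """P[X <= k]: add up the probabilities of 0, 1, ..., k successes."""
    return sum(pmf(x, n, p) for x in range(k + 1))

n, p = 10, 0.5

# Rule of Complements: P[H >= 7] = 1 - P[H <= 6]
via_complement = 1 - cdf(6, n, p)

# Direct route: add up P[H = 7], P[H = 8], P[H = 9], P[H = 10]
direct_sum = sum(pmf(x, n, p) for x in range(7, n + 1))

print(via_complement, direct_sum)  # both are 0.171875
```

The direct route is fine for 10 flips, but the complement saves a great deal of adding when n is large.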

Uses of the Binomial

In my experience as a statistician, I have used the Binomial probability to model many processes. These processes range from presidential elections to traffic light placement. Once I knew I had a Binomial experiment, I could ignore a lot and focus on estimating the two parameters: n and p. Usually, n is easy to determine; it is the number of trials. Also usually, p is hard to determine. A lot of time goes into estimating the success probability, p.

Once I have determined n and p, I know everything about the process. For instance, I know that the expected number of successes is $np$ (see Page 272). I know that the variance in the number of successes is $npq$ and the standard deviation is $\sqrt{npq}$ (see Page 272). I know all of the probabilities of all possible outcomes and combinations of outcomes. There is a formula available for this (Page 266), a table (Table A.1), and several calculators (e.g., StatTrek). The difficulty is only in estimating p, which is beyond the scope of this part, but will be covered in Part III: Inference.
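As an illustration of how much n and p tell us, the following Python sketch computes the mean, variance, and standard deviation for the 10-flip coin example, then cross-checks them against the full list of probabilities. The variable names are arbitrary, and the standard Binomial probability formula is assumed.

```python
from math import comb, sqrt

n, p = 10, 0.5   # assumption: the 10-flip fair-coin example
q = 1 - p

# Shortcut formulas
mean = n * p              # expected number of successes, np
variance = n * p * q      # npq
sd = sqrt(variance)       # square root of npq

# Cross-check: build every probability P[X = k], then use the
# general definitions of the mean and the variance.
probs = [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]
mean_check = sum(k * pr for k, pr in enumerate(probs))
variance_check = sum((k - mean_check) ** 2 * pr for k, pr in enumerate(probs))

# The most likely number of successes is the k with the largest probability.
most_likely = max(range(n + 1), key=lambda k: probs[k])

print(mean, variance, round(sd, 4))   # 5.0 2.5 1.5811
print(mean_check, variance_check, most_likely)
```

For this example the shortcut formulas and the long way agree, and the most likely number of heads is 5.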

The Khan Academy


The logo for the Khan Academy, one of the better sources of education on the Internet.

The Internet is replete with non-profit organizations. Many are focused on bringing free education to the entire world. The Khan Academy is one such organization. On their “About” page, they state:

Khan Academy is an organization on a mission. We’re a not-for-profit with the goal of changing education for the better by providing a free world-class education for anyone anywhere.

All of the site’s resources are available to anyone. It doesn’t matter if you are a student, teacher, home-schooler, principal, adult returning to the classroom after 20 years, or a friendly alien just trying to get a leg up in earthly biology. Khan Academy’s materials and resources are available to you completely free of charge.

I suggest you bookmark their homepage and avail yourself of their videos. They tend to be quite excellent.

Useful Technology Videos

Usually, I will include some videos in these mini-lectures. These videos will tend to offer some additional audio/visual help in performing typical problems from the chapter. Sometimes, they are theoretical. Sometimes, they are technical in that they show you how to perform calculations in a variety of ways.

And so, here are some videos dealing with Binomial probabilities. Again, let the computer perform the calculations. You should spend your time thinking about Binomial distributions and probabilities. There is a reason statisticians went to the trouble of naming this distribution. We come across it frequently in our experiences.

Excel

These two videos show how to perform the Binomial probability calculations discussed above. They are specifically for Microsoft Excel.

In addition to these two videos, there is a large number of videos on YouTube for calculating Binomial probabilities in Excel. The following search link will take you to YouTube and provide you with a non-exhaustive list: Binomial probabilities in Excel.

That is it. In this mini-lecture, we looked at the Binomial distribution and what it can be used for. We saw how to perform probability calculations using Excel. The Binomial distribution is very helpful when trying to better understand proportions. In fact, when we get to Part III: Inference, we will rely heavily on all we know about the Binomial distribution to estimate the population proportion, p.