4.4 The Binomial Distribution

LEARNING OBJECTIVES

  • Recognize the binomial probability distribution and apply it appropriately.

There are four characteristics of a binomial experiment:

  1. There are a fixed number of trials. Think of trials as repetitions of an experiment.  The letter [latex]n[/latex] denotes the number of trials.
  2. There are only two possible outcomes, called “success” and “failure,” for each trial.  The letter [latex]p[/latex] denotes the probability of a success on any one trial and [latex]1-p[/latex] denotes the probability of a failure on one trial.
  3. The [latex]n[/latex] trials are independent and are repeated using identical conditions.
  4. For each individual trial, the probability of a success, [latex]p[/latex], and probability of a failure, [latex]1-p[/latex], remain the same.  Because the [latex]n[/latex] trials are independent, the outcome of one trial does not affect the outcome of another trial.

For example, randomly guessing at a true-false statistics question has only two outcomes.  If a success is guessing correctly, then a failure is guessing incorrectly.  Suppose Joe guesses correctly on any statistics true-false question with probability [latex]p=0.6[/latex].  Then, [latex]1-p=0.4[/latex].  This means that for every true-false statistics question Joe answers, his probability of success [latex]p=0.6[/latex] and his probability of failure [latex]1-p=0.4[/latex] remain the same.

The outcomes of a binomial experiment fit a binomial probability distribution.  The random variable [latex]X[/latex] is the number of successes obtained in the [latex]n[/latex] independent trials.  The mean of a binomial probability distribution is [latex]\displaystyle{\mu=n \times p}[/latex] and the standard deviation is [latex]\displaystyle{\sigma=\sqrt{n \times p \times (1-p)}}[/latex].
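As a minimal sketch, the mean and standard deviation formulas can be evaluated in a few lines of Python.  The function names and the choice of [latex]n=10[/latex] questions for Joe are assumptions made for illustration; only [latex]p=0.6[/latex] comes from the text.

```python
import math


def binomial_mean(n: int, p: float) -> float:
    """Mean of a binomial distribution: mu = n * p."""
    return n * p


def binomial_sd(n: int, p: float) -> float:
    """Standard deviation of a binomial distribution: sigma = sqrt(n * p * (1 - p))."""
    return math.sqrt(n * p * (1 - p))


# Assumed for illustration: Joe answers n = 10 true-false questions with p = 0.6.
n, p = 10, 0.6
mu = binomial_mean(n, p)
sigma = binomial_sd(n, p)
print(f"mean = {mu:.2f}, standard deviation = {sigma:.4f}")
```

Under these assumed values, Joe would be expected to answer 6 of the 10 questions correctly on average.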

Any experiment with the characteristics of a binomial experiment and where [latex]n=1[/latex] is called a Bernoulli Trial (named after Jacob Bernoulli who, in the late 1600s, studied them extensively).  A binomial experiment takes place when the number of successes is counted in one or more Bernoulli Trials.

EXAMPLE

At ABC College, the withdrawal rate from an elementary physics course is 30% for any given term.  This implies that, for any given term, 70% of the students stay in the class for the entire term.  A “success” could be defined as an individual who withdrew from the course.  The random variable [latex]X[/latex] is the number of students who withdraw from a randomly selected elementary physics class.

TRY IT

The state health board is concerned about the amount of fruit available in school lunches.  48% of schools in the state offer fruit in their lunches every day.  This implies that 52% do not.  What would a “success” be in this case?

 

Solution:
  • A success would be a school that offers fruit in their lunch every day.

EXAMPLE

Suppose you play a game that you can only either win or lose.  The probability that you win any game is 55% and the probability that you lose is 45%.  Each game you play is independent.  If you play the game 20 times, write the function that describes the probability that you win 15 of the 20 times.

Solution:

If you define [latex]X[/latex] as the number of wins, then [latex]X[/latex] takes on the values 0, 1, 2, 3, …, 20.  The probability of a success is [latex]p=0.55[/latex].  The probability of a failure is [latex]1-p=0.45[/latex].  The number of trials is [latex]n=20[/latex].  The probability question can be stated mathematically as [latex]\displaystyle{P(x=15)}[/latex].
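The text states the question as [latex]\displaystyle{P(x=15)}[/latex] without evaluating it.  One way to evaluate it is the standard binomial probability formula [latex]\displaystyle{P(x)=\binom{n}{x} \times p^{x} \times (1-p)^{n-x}}[/latex], which is not given in this section and is brought in here as an assumption.  The sketch below applies it with a hypothetical helper named binomial_pmf.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Values from the example: 20 games, 55% chance of winning each game.
prob_15_wins = binomial_pmf(15, 20, 0.55)
print(f"P(X = 15) = {prob_15_wins:.4f}")
```

The result is roughly 0.036, so winning exactly 15 of the 20 games is fairly unlikely even though each game favours a win.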

EXAMPLE

Approximately 70% of statistics students do their homework in time for it to be collected and graded.  Each student does homework independently.  In a statistics class of 50 students, what is the probability that at least 40 will do their homework on time?  Students are selected randomly.

  1. This is a binomial problem because there is only a success or a __________, there are a fixed number of trials, and the probability of a success is 0.70 for each trial.
  2. If we are interested in the number of students who do their homework on time, then how do we define [latex]X[/latex]?
  3. What values does [latex]x[/latex] take on?
  4. What is a “failure,” in words?
  5. What is the probability of “failure”?
  6. The words “at least” translate as what kind of inequality for the probability question [latex]\displaystyle{P(x....40)}[/latex]?

Solution:

  1. failure
  2. [latex]X[/latex] is the number of statistics students who do their homework on time.
  3. 0, 1, 2, …, 50
  4. Failure is defined as a student who does not complete his or her homework on time.
  5. [latex]1-p=0.30[/latex]
  6. “At least” means greater than or equal to ([latex]\geq[/latex]). The probability question is [latex]\displaystyle{P(x\geq40)}[/latex].
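The example sets up [latex]\displaystyle{P(x\geq40)}[/latex] but does not compute it.  As a rough sketch, the probability can be found by adding the probabilities of 40, 41, …, 50 successes, or equivalently by subtracting the probability of at most 39 successes from 1.  The helper binomial_pmf is a hypothetical name and uses the standard binomial formula, which this section does not state.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 50, 0.70  # 50 students, 70% do their homework on time

# "At least 40" covers x = 40, 41, ..., 50.
prob_at_least_40 = sum(binomial_pmf(x, n, p) for x in range(40, n + 1))

# Complement rule: P(x >= 40) = 1 - P(x <= 39).
prob_via_complement = 1 - sum(binomial_pmf(x, n, p) for x in range(0, 40))

print(f"direct sum:      {prob_at_least_40:.4f}")
print(f"complement rule: {prob_via_complement:.4f}")
```

Both routes give the same value, which is a useful check when translating phrases such as “at least” into inequalities.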

TRY IT

Sixty-five percent of people pass the state driver’s exam on the first try.  A group of 50 individuals who have taken the driver’s exam is randomly selected.  Why is this a binomial problem?

 

Solution:
  • There are only two outcomes on any exam (pass or fail).
  • There is a fixed number of trials ([latex]n=50[/latex]).
  • The probability of a pass ([latex]65\%[/latex]) is the same for each trial.
  • The trials are independent. (The fact that any one person passes or fails the exam does not affect whether or not any other person passes or fails.)

EXAMPLE

The following example illustrates a problem that is not binomial.  It violates the condition of independence.  ABC College has a student advisory committee made up of ten staff members and six students.  The committee wishes to choose a chairperson and a recorder. What is the probability that the chairperson and recorder are both students?

Solution:

The names of all committee members are put into a box and two names are drawn without replacement.  The first name drawn determines the chairperson and the second name the recorder.  There are two trials.  However, the trials are not independent because the outcome of the first trial affects the outcome of the second trial.  The probability of a student on the first draw is [latex]\displaystyle{\frac{6}{16}}[/latex].  If a student is drawn first, the probability of a student on the second draw is [latex]\displaystyle{\frac{5}{15}}[/latex]; if a staff member is drawn first, it is [latex]\displaystyle{\frac{6}{15}}[/latex].  The probability of drawing a student’s name on the second trial therefore depends on the result of the first trial, which violates the condition of independence.
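The question posed in this example can still be answered, just not with a binomial model.  The sketch below multiplies the two draw probabilities using exact fractions and, for contrast, shows what a binomial model that wrongly treats the draws as independent would give.  The variable names are illustrative.

```python
from fractions import Fraction

students, staff = 6, 10
total = students + staff

# Drawing without replacement: the second probability depends on the first draw.
p_first_student = Fraction(students, total)            # 6/16
p_second_student = Fraction(students - 1, total - 1)   # 5/15, given a student was drawn first
p_both_students = p_first_student * p_second_student

# What an (incorrect) independent-trials model would give for two successes.
p_both_if_independent = p_first_student**2

print(p_both_students)        # 1/8
print(p_both_if_independent)  # 9/64
```

The two answers differ (0.125 versus about 0.141), which shows why the independence condition matters before applying the binomial distribution.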

TRY IT

A lacrosse team is selecting a captain.  The names of all the seniors are put into a hat and the first three that are drawn will be the captains.  The names are not replaced once they are drawn (one person cannot be two captains).  You want to see if the captains all play the same position.  State whether or not this is binomial and state why.

 

Solution:

 

This is not binomial because the names are not replaced after each draw, which means the probability changes each time a name is drawn.  This violates the condition of independence.

Calculating Binomial Probabilities

CALCULATING BINOMIAL PROBABILITIES IN EXCEL

To calculate probabilities associated with binomial random variables in Excel, use the binom.dist(x,n,p,logic operator) function.

  • For x, enter the number of successes.
  • For n, enter the number of trials.
  • For p, enter the probability of success.
  • For the logic operator, enter false to find the probability of exactly x successes and enter true to find the probability of at most (less than or equal to) x successes.

The output from the binom.dist function is:

  • the probability of getting exactly x successes in n trials with a probability of success p when the logic operator is false.
  • the probability of at most x successes in n trials with a probability of success p when the logic operator is true.

Visit the Microsoft page for more information about the binom.dist function.

NOTE

Because we can only enter false or true into the logic operator, the binom.dist function can only directly calculate the probability of getting exactly x successes in n trials or getting at most x successes in n trials.  In order to calculate other binomial probabilities, such as fewer than x successes, more than x successes or at least x successes, we need to manipulate how we use the binom.dist function by changing what we enter into the binom.dist function, using the complement rule, or both.
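For readers working outside Excel, the same logic can be sketched in Python.  The function binom_dist below is a hypothetical stand-in that mirrors the four arguments described above; it is not Excel's implementation, and the values [latex]n=10[/latex], [latex]p=0.5[/latex] and [latex]x=4[/latex] are chosen only for illustration.

```python
from math import comb


def binom_dist(x: int, n: int, p: float, cumulative: bool) -> float:
    """Hypothetical Python analogue of Excel's binom.dist(x, n, p, logic operator)."""
    if cumulative:
        # "true": probability of at most x successes, P(X <= x)
        return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(0, x + 1))
    # "false": probability of exactly x successes, P(X = x)
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Illustrative values only (not from the text).
n, p, x = 10, 0.5, 4

exactly = binom_dist(x, n, p, False)         # P(X = x)
at_most = binom_dist(x, n, p, True)          # P(X <= x)
fewer_than = binom_dist(x - 1, n, p, True)   # P(X < x)  = P(X <= x - 1)
more_than = 1 - binom_dist(x, n, p, True)    # P(X > x)  = 1 - P(X <= x)
at_least = 1 - binom_dist(x - 1, n, p, True) # P(X >= x) = 1 - P(X <= x - 1)

for label, value in [("exactly", exactly), ("at most", at_most),
                     ("fewer than", fewer_than), ("more than", more_than),
                     ("at least", at_least)]:
    print(f"{label:>10}: {value:.4f}")
```

Only the first two cases use the function directly; the other three combine a shifted value of x with the complement rule, as the note describes.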

EXAMPLE

It has been stated that about 41% of adult workers have a high school diploma but do not pursue any further education.  Suppose 20 adult workers are randomly selected.

  1. How many adult workers in the sample do you expect to have a high school diploma but do not pursue any further education?
  2. What is the probability that exactly 8 of the workers in the sample have a high school diploma but do not pursue further education?
  3. What is the probability that at most 12 of the workers in the sample have a high school diploma but do not pursue further education?

Solution:

Let [latex]X[/latex] be the number of workers in the sample who have a high school diploma but do not pursue further education.  The number of trials is [latex]n=20[/latex] and the probability of success is [latex]p=0.41[/latex].

1. [latex]\displaystyle{\mu=n \times p=20 \times 0.41=8.2}[/latex].  On average, in any sample of 20 workers, 8.2 have a high school diploma but do not pursue further education.

2. We want to find [latex]\displaystyle{P(x=8)}[/latex].

Function    binom.dist    Answer
Field 1     8             0.1790
Field 2     20
Field 3     0.41
Field 4     false

The probability that exactly 8 of the workers in the sample have a high school diploma but do not pursue further education is [latex]17.9\%[/latex].

3. We want to find [latex]\displaystyle{P(x \leq 12)}[/latex].

Function    binom.dist    Answer
Field 1     12            0.9738
Field 2     20
Field 3     0.41
Field 4     true

The probability that at most 12 of the workers in the sample have a high school diploma but do not pursue further education is [latex]97.38\%[/latex].
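The three answers in this example can be cross-checked without Excel.  The following is a sketch only: binom_dist is a hypothetical Python helper that imitates the Excel function's arguments, built on the standard binomial formula.

```python
from math import comb


def binom_dist(x: int, n: int, p: float, cumulative: bool) -> float:
    """Hypothetical analogue of Excel's binom.dist: exact (False) or at-most (True)."""
    lowest = 0 if cumulative else x
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lowest, x + 1))


n, p = 20, 0.41  # 20 workers, 41% with a diploma and no further education

expected_workers = n * p                   # part 1: mean
p_exactly_8 = binom_dist(8, n, p, False)   # part 2: P(x = 8)
p_at_most_12 = binom_dist(12, n, p, True)  # part 3: P(x <= 12)

print(f"expected number: {expected_workers:.1f}")
print(f"P(x = 8)   = {p_exactly_8:.4f}")
print(f"P(x <= 12) = {p_at_most_12:.4f}")
```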

TRY IT

About 32% of students participate in a community volunteer program outside of school.  Suppose 30 students are selected at random.

  1. What is the expected number of students in the sample that participate in a community volunteer program?
  2. What is the probability that exactly 10 of the students in the sample participate in a community volunteer program?
  3. What is the probability that at most 14 of the students in the sample participate in a community volunteer program?
Solution:
  1. [latex]\displaystyle{\mu=n \times p=30 \times 0.32=9.6}[/latex]
  2. Function    binom.dist    Answer
     Field 1     10            0.1512
     Field 2     30
     Field 3     0.32
     Field 4     false
  3. Function    binom.dist    Answer
     Field 1     14            0.9695
     Field 2     30
     Field 3     0.32
     Field 4     true

EXAMPLE

In the 2013 Jerry’s Artarama art supplies catalog, there are 560 pages and 1.5% of the pages feature signature artists.  Suppose 100 pages are randomly selected from the catalog.

  1. What is the probability that fewer than 3 of the pages in the sample feature signature artists?
  2. What is the probability that more than 5 of the pages in the sample feature signature artists?
  3. What is the probability that at least 4 of the pages in the sample feature signature artists?
  4. What is the probability that between 2 and 6 of the pages in the sample feature signature artists?

Solution:

  1. We want to find [latex]\displaystyle{P(x \lt 3)}[/latex].  We cannot find this probability directly in Excel because the binom.dist function can only calculate [latex]=[/latex] or [latex]\leq[/latex] probabilities.  Because [latex]x[/latex] must be an integer (it is the number of pages), [latex]x \lt 3[/latex] is the same as [latex]x \leq 2[/latex] (this equivalence relies on [latex]x[/latex] taking only integer values and does not hold for a continuous variable).  So [latex]\displaystyle{P(x \lt 3)=P(x \leq 2)}[/latex] and [latex]\displaystyle{P(x \leq 2)}[/latex] is a probability we can calculate with the binom.dist function.
    Function    binom.dist    Answer
    Field 1     2             0.8098
    Field 2     100
    Field 3     0.015
    Field 4     true
  2.  We want to find [latex]\displaystyle{P(x \gt 5)}[/latex].   We cannot find this probability directly in Excel because the binom.dist function can only calculate [latex]=[/latex] or [latex]\leq[/latex] probabilities.  The complement of [latex]\gt[/latex] is [latex]\leq[/latex], so [latex]\displaystyle{P(x \gt 5)=1-P(x \leq 5)}[/latex] and [latex]\displaystyle{P(x \leq 5)}[/latex] is a probability we can calculate with the binom.dist function.
    Function    1-binom.dist    Answer
    Field 1     5               0.0041
    Field 2     100
    Field 3     0.015
    Field 4     true
  3. We want to find [latex]\displaystyle{P(x \geq 4)}[/latex].  We cannot find this probability directly in Excel because the binom.dist function can only calculate [latex]=[/latex] or [latex]\leq[/latex] probabilities.  The complement of [latex]\geq[/latex] is [latex]\lt[/latex], so [latex]\displaystyle{P(x \geq 4)=1-P(x  \lt 4)}[/latex].  Because [latex]x[/latex] must be an integer (it is the number of pages), [latex]x \lt 4[/latex] is the same as [latex]x \leq 3[/latex].  So [latex]\displaystyle{P(x \geq 4)=1-P(x  \lt 4)=1-P(x \leq 3)}[/latex] and [latex]\displaystyle{P(x \leq 3)}[/latex] is a probability we can calculate with the binom.dist function.
    Function    1-binom.dist    Answer
    Field 1     3               0.0642
    Field 2     100
    Field 3     0.015
    Field 4     true
  4.  We want to find [latex]\displaystyle{P(2 \leq x \leq 6)}[/latex].  We cannot find this probability directly in Excel because the binom.dist function can only calculate [latex]=[/latex] or [latex]\leq[/latex] probabilities.  But, [latex]\displaystyle{P(2 \leq x \leq 6)=P(x \leq 6)-P(x \leq 1)}[/latex].  So we can calculate [latex]\displaystyle{P(2 \leq x \leq 6)}[/latex] as the difference of two binom.dist functions.
    Function    binom.dist    -binom.dist    Answer
    Field 1     6             1              0.4426
    Field 2     100           100
    Field 3     0.015         0.015
    Field 4     true          true
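The four manipulations in this example (shifting x, taking a complement, and subtracting two cumulative probabilities) can be reproduced in Python as a check.  This is a sketch under the assumption that a small helper, here called binom_dist, behaves like the Excel function with the same four arguments.

```python
from math import comb


def binom_dist(x: int, n: int, p: float, cumulative: bool) -> float:
    """Hypothetical analogue of Excel's binom.dist: exact (False) or at-most (True)."""
    lowest = 0 if cumulative else x
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lowest, x + 1))


n, p = 100, 0.015  # 100 sampled pages, 1.5% feature signature artists

# 1. Fewer than 3: P(x < 3) = P(x <= 2)
fewer_than_3 = binom_dist(2, n, p, True)
# 2. More than 5: P(x > 5) = 1 - P(x <= 5)
more_than_5 = 1 - binom_dist(5, n, p, True)
# 3. At least 4: P(x >= 4) = 1 - P(x <= 3)
at_least_4 = 1 - binom_dist(3, n, p, True)
# 4. Between 2 and 6 inclusive: P(x <= 6) - P(x <= 1)
between_2_and_6 = binom_dist(6, n, p, True) - binom_dist(1, n, p, True)

print(f"P(x < 3)       = {fewer_than_3:.4f}")
print(f"P(x > 5)       = {more_than_5:.4f}")
print(f"P(x >= 4)      = {at_least_4:.4f}")
print(f"P(2 <= x <= 6) = {between_2_and_6:.4f}")
```

Each line corresponds to one numbered part of the solution, with the inequality rewritten so that only “at most” probabilities are needed.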

TRY IT

According to a Gallup poll, 60% of American adults prefer saving over spending.  Suppose 50 American adults are selected at random.

  1. What is the probability that at least 35 adults in the sample prefer saving over spending?
  2. What is the probability that fewer than 20 adults in the sample prefer saving over spending?
  3. What is the probability that between 15 and 25 adults in the sample prefer saving over spending?
  4. What is the probability that more than 30 adults prefer saving over spending?
Solution:
  1. Function    1-binom.dist    Answer
     Field 1     34              0.0955
     Field 2     50
     Field 3     0.6
     Field 4     true
  2. Function    binom.dist    Answer
     Field 1     19            0.0014
     Field 2     50
     Field 3     0.6
     Field 4     true
  3. Function    binom.dist    -binom.dist    Answer
     Field 1     25            14             0.0978
     Field 2     50            50
     Field 3     0.6           0.6
     Field 4     true          true
  4. Function    1-binom.dist    Answer
     Field 1     30              0.4465
     Field 2     50
     Field 3     0.6
     Field 4     true

TRY IT

During the 2013 regular NBA season, DeAndre Jordan of the Los Angeles Clippers had the highest field goal completion rate in the league.  DeAndre scored with 61.3% of his shots.  Suppose you choose a random sample of 80 shots taken by DeAndre during the 2013 season.

  1. What is the expected number of shots that scored points in a sample of 80 of DeAndre’s shots?
  2. What is the probability that DeAndre scored on exactly 60 of the 80 shots?
  3. What is the probability that DeAndre scored on more than 50 of the 80 shots?
  4. What is the probability that DeAndre scored on between 65 and 75 of the 80 shots?
Solution:
  1. [latex]\mu=n \times p=80 \times 0.613=49.04[/latex]
  2. Function    binom.dist    Answer
     Field 1     60            0.0036
     Field 2     80
     Field 3     0.613
     Field 4     false
  3. Function    1-binom.dist    Answer
     Field 1     50              0.3718
     Field 2     80
     Field 3     0.613
     Field 4     true
  4. Function    binom.dist    -binom.dist    Answer
     Field 1     75            64             0.0001
     Field 2     80            80
     Field 3     0.613         0.613
     Field 4     true          true

Watch this video: Binomial Probability in Excel by Joshua Emmanuel [6:59]


Concept Review

A statistical experiment can be classified as a binomial experiment if the following conditions are met:

  1. There are a fixed number of trials, [latex]n[/latex].
  2. There are only two possible outcomes, called “success” and “failure,” for each trial.  The letter [latex]p[/latex] denotes the probability of a success on one trial and [latex]1-p[/latex] denotes the probability of a failure on one trial.
  3. The [latex]n[/latex] trials are independent and are repeated using identical conditions.
  4. For each individual trial, the probability of a success, [latex]p[/latex], and probability of a failure, [latex]1-p[/latex], remain the same.

The outcomes of a binomial experiment fit a binomial probability distribution.  The random variable [latex]X[/latex] is the number of successes obtained in the [latex]n[/latex] independent trials.  The mean of a binomial distribution is [latex]\mu=n \times p[/latex] and the standard deviation is [latex]\sigma=\sqrt{n\times p \times (1-p)}[/latex].


Attribution

4.3 Binomial Distribution in Introductory Statistics by OpenStax is licensed under a Creative Commons Attribution 4.0 International License.

License

Introduction to Statistics Copyright © 2022 by Valerie Watts is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.