This page provides an example of how (and why) to perform the analysis of variance (ANOVA) procedure. It is organized around the scientific method to emphasize the structure that underlies research. This example follows Section 11.2.

The Research Question

[Sonic in Stillwater]

The first Sonic drive-in, Stillwater, OK. Photo courtesy Waymark. The sign is still there, but that Sonic (on Main) was renovated last year.

The research question frames your interest in broad terms. It ends in a question mark and should be interesting to someone. For this example, the students were completing an assignment for me. There was no interest.

Which drive-through joint in Stillwater, OK, is fastest at lunch time?

The Research Hypothesis

The research hypothesis is a proposed answer to the research question. From experience, we decided that there would be no difference in the average times. Translated into symbols, this is

μ_M = μ_S = μ_T

The population parameter is μ, the population mean service time. The equals sign means “equal to.” The three groups are the times at the three restaurants, hence the subscripts: M for McDonald’s, S for Sonic, and T for Taco Bell.

The Null Hypothesis

Remember that the research hypothesis is what the scientist cares about… the only thing. However, because of probability and the randomness of life, statisticians need two other hypotheses: the null and the alternative. For this course, the null hypothesis is always the same as the research hypothesis, but with the (in)equality changed to an equality. Thus, the null hypothesis here is

H_0 : μ_M = μ_S = μ_T

The Alternative Hypothesis

The alternative hypothesis is either the research hypothesis or its opposite. If there is an equals part to the research hypothesis, then the alternative hypothesis is the opposite of the research hypothesis. If there is no equals part, then the alternative hypothesis is the research hypothesis. Thus, for this example,

H_1 : μ_M ≠ μ_S or μ_M ≠ μ_T or μ_S ≠ μ_T

That is a rather difficult alternative hypothesis to write out in symbols. Note what it is saying, however: The alternative is that the three means are not all the same. In other words, at least one differs from the rest.

Planning

[Rosa Parks]

Rosa Parks grilling at McDonald’s (1980s). Photo courtesy the Library of Congress.

Now that we have our null hypothesis and a better understanding of the processes involved in creating the data, we can explicitly write our plan, which allows others to replicate our work.

Being a megalopolis, Stillwater has three fast food joints: McDonald’s, Sonic, and Taco Bell. The students wanted to know which of the three was fastest at filling orders. To determine this, they went through each drive-through four times. Each time, they ordered one item off the value menu, thinking that this would even out the preparation times. They went at 1:00 pm each day, which they figured would take care of the traffic variation.

Note that they worked to minimize the effect of extraneous sources of variation. This allows them to better estimate the effect of the restaurant. Thus, here is the plan:

  1. At 1pm, go to the fast food restaurant.
  2. Order one item off the value menu.
  3. Once the order is spoken, start the timer.
  4. When you receive the order at the window, stop the timer.
  5. Record the elapsed time.

The second aspect of planning is planning the analysis. Here, it is straightforward. We need to draw a conclusion about the means of more than two populations. The correct test for that is the analysis of variance (ANOVA) test.

Execute the Plan

The students collected all of the data. Here is the data table:

Table 1: Time to service in seconds for four trials each of three restaurants.
McDonalds   Sonic   Taco Bell
274         218     341
227         202     322
296         147     256
250         170     302
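
Before running any test, it is worth checking that the data were entered correctly. The following is a minimal Python sketch, not part of the students' original workflow (they used Excel), that stores Table 1 and computes the count, sum, average, and sample variance for each restaurant. The names `times` and `summary` are my own for illustration. The results should match the SUMMARY block of the Excel output in Table 2.

```python
from statistics import mean, variance

# Service times in seconds, copied from Table 1.
# The dictionary name and layout are illustrative assumptions.
times = {
    "McDonalds": [274, 227, 296, 250],
    "Sonic": [218, 202, 147, 170],
    "Taco Bell": [341, 322, 256, 302],
}

# Count, sum, sample mean, and sample variance (n - 1 denominator)
# for each restaurant.
summary = {
    name: {
        "count": len(obs),
        "sum": sum(obs),
        "average": mean(obs),
        "variance": variance(obs),
    }
    for name, obs in times.items()
}

for name, row in summary.items():
    print(
        f"{name:10s} {row['count']:2d} {row['sum']:5d} "
        f"{row['average']:8.2f} {row['variance']:10.4f}"
    )
```

If any of these numbers disagree with what Excel reports, the most likely culprit is a typo in the data entry.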

Analyze the Data

[Taco Bell]

Taco Bell in Cyprus. Photo courtesy Wikimedia Commons.

Now, with the data, we can test the null hypothesis. In Excel, we input the data in columns. Then, we can follow the menu trail: Data | Data Analysis | Anova: Single Factor. This is “Single Factor” ANOVA because we have one independent variable: the restaurant. In most places, this is called “one-way.”

Interpret the Results

Here is the output given by Excel:

Table 2: Automated ANOVA output from Excel.
Anova: Single Factor      
       
SUMMARY      
Groups Count Sum Average Variance   
McDonalds 4 1047 261.75 889.5833   
Sonic 4 737 184.25 1014.9167   
Taco Bell 4 1221 305.25 1331.5833   
       
       
ANOVA      
Source of Variation SS df MS F P-value F crit
Between Groups 30052.67 2 15026.33 13.93 0.0018 4.2565
Within Groups 9708.25 9 1078.69      
             
Total 39760.92 11        
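
To see where the numbers in the ANOVA block come from, here is a hedged Python sketch that rebuilds the sums of squares, degrees of freedom, mean squares, and F statistic directly from Table 1. It uses only the standard definitions of one-way ANOVA. The variable names are my own, and this is not how Excel computes things internally. The values should agree with Table 2 up to rounding in the last digit.

```python
# Service times in seconds, copied from Table 1.
groups = {
    "McDonalds": [274, 227, 296, 250],
    "Sonic": [218, 202, 147, 170],
    "Taco Bell": [341, 322, 256, 302],
}

all_obs = [x for obs in groups.values() for x in obs]
grand_mean = sum(all_obs) / len(all_obs)
group_means = {name: sum(obs) / len(obs) for name, obs in groups.items()}

# Between groups: how far each group mean sits from the grand mean,
# weighted by the group size.
ss_between = sum(
    len(obs) * (group_means[name] - grand_mean) ** 2
    for name, obs in groups.items()
)

# Within groups: how far each observation sits from its own group mean.
ss_within = sum(
    (x - group_means[name]) ** 2
    for name, obs in groups.items()
    for x in obs
)

df_between = len(groups) - 1            # number of groups minus 1
df_within = len(all_obs) - len(groups)  # observations minus number of groups

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(f"Between Groups  SS={ss_between:.2f}  df={df_between}  MS={ms_between:.2f}  F={f_stat:.2f}")
print(f"Within Groups   SS={ss_within:.2f}  df={df_within}  MS={ms_within:.2f}")
print(f"Total           SS={ss_between + ss_within:.2f}  df={df_between + df_within}")
```

The F statistic is simply the ratio of the two mean squares. A large F means the restaurants differ from each other by much more than the trials within a restaurant differ among themselves.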

According to Excel, the p-value is 0.0018 (the test statistic is F = 13.93). As the p-value is less than our significance level, α = 0.05, we reject the null hypothesis and conclude that there is a difference in the average serving time across these three restaurants. At least one has a significantly faster time than another. We do not know which, but we know the average serving times are not all the same.
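
The decision rule can also be checked in a few lines of Python. This sketch takes F and the degrees of freedom from Table 2 and leans on a convenient fact about the F distribution: when the numerator degrees of freedom equal 2, as they do with three groups, the upper-tail probability has the closed form (1 + 2F/df2)^(-df2/2). That shortcut is my addition, not something Excel reports, and it does not work for other numbers of groups. In general you would use a statistics library (for example, `scipy.stats.f.sf`). The function name below is hypothetical.

```python
# Values read from the ANOVA block of Table 2.
f_stat = 13.93
df_between = 2   # numerator degrees of freedom
df_within = 9    # denominator degrees of freedom
alpha = 0.05     # significance level used in the text


def f_upper_tail_df1_equals_2(f, df2):
    """P(F > f) for an F distribution with 2 and df2 degrees of freedom.

    Valid ONLY when the numerator degrees of freedom equal 2.
    """
    return (1 + 2 * f / df2) ** (-df2 / 2)


assert df_between == 2, "closed form below requires numerator df = 2"
p_value = f_upper_tail_df1_equals_2(f_stat, df_within)
reject_null = p_value < alpha

print(f"p-value = {p_value:.4f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

Equivalently, one can compare F to the critical value in the last column of Table 2: since 13.93 exceeds 4.2565, the null hypothesis is rejected at α = 0.05.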

The Discussion

And so, after all of that work, we have concluded that there is a difference. At least one population mean differs from the rest. Ultimately, this is an unsatisfactory answer. We know that at least one differs, but not which one. A better question is: Which is fastest? To answer that, we will need to perform post hoc tests.

And that is it. This example showed how to test if all means are the same. Here, we were able to determine that at least one of the three restaurants has a different average serving time from the others.