Part 3: Statistical Inference

In the first part of this course, we examined sample statistics. In the second part, we looked at probability distributions. This final part takes up the fundamental purpose of statistics: how to use a sample of data to draw conclusions about the population that gave us that data.

This part is the most extensive of the three parts. In the beginning, it defines two important concepts: the confidence interval and the p-value. From those definitions, this part introduces statistical procedures for many of the most basic data types and questions. Your job is to figure out a way of determining which procedure to use for a given question.

Learning Objectives

As you work your way through this final part of the course, make sure you are able to…

understand confidence intervals;
understand hypothesis tests;
determine which statistical procedure is appropriate for a given question; and
correctly perform that procedure.

Reading Assignments

In your textbook, make sure you carefully read and take notes on the following sections:

Chapter 8: 1–5, Appendix 8.1
Chapter 9: 1–5, Appendix 9.1
Chapter 10: 1–3, Appendix 10.1
Chapter 11: 1–4, Appendix 11.1
Chapter 12: 1–2, Appendix 12.1
Chapter 13: 1–7, Appendix 13.1

Remember that the chapter appendices give you an introduction to performing these calculations in Excel.

Supplemental Materials

Because this course part is so lengthy, I have broken the supplementary materials into sections based on the procedures examined.

One-Population Procedures

This first section deals with statistical procedures on a single population parameter, μ or p. These include the one-sample t-procedure (for the mean) and the one-sample z-procedure (for the proportion). This section also examines the meanings of confidence intervals and p-values. These two concepts are central to the understanding of statistical inference.

Chapters: 8 and 9

Confidence Intervals. In the previous modules, we concentrated only on calculating a sample statistic, that is, a statistic for the sample we collected. Confidence intervals allow us to take that sample and estimate population parameters.

Confidence Intervals for means, with σ known.

Confidence Intervals for means, with σ unknown.

Confidence Intervals for proportions.
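
As a rough illustration of the last item above, here is a minimal sketch of a large-sample z-interval for a population proportion. It is written in Python rather than the Excel used in the textbook appendices, and the function name and the counts (60 speeders among 200 drivers) are made up for illustration; they are not course data.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Large-sample z-interval for a population proportion p (hypothetical helper)."""
    p_hat = successes / n
    # Critical value z* that leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Margin of error: z* times the estimated standard error of p_hat.
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Made-up data: 60 of 200 observed drivers were speeding.
low, high = proportion_confidence_interval(60, 200)
print(f"95% confidence interval for p: ({low:.3f}, {high:.3f})")
```

The interval is the sample proportion plus or minus a margin of error. A higher confidence level gives a larger z* and therefore a wider interval.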

The p-value. In the previous module, we focused on the population parameter of interest, μ or p, and estimated it using a range of values that were reasonable for our definition of “reasonable.” That was the confidence interval. In this mini-lecture, we test a hypothesis about a population parameter of interest, μ or p, and determine if the hypothesized value is reasonable.

p-values for means, with σ known.

p-values for means, with σ unknown.

p-values for proportions.
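
To make the p-value idea concrete, the following sketch computes a two-sided p-value for a one-sample z-test of a proportion. Again, this is Python rather than Excel, and the function name, the counts, and the hypothesized value p0 = 0.25 are all assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Two-sided one-sample z-test of H0: p = p0 (hypothetical helper)."""
    p_hat = successes / n
    # Under H0 the standard error uses the hypothesized p0, not p_hat.
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Probability of a test statistic at least this extreme in either tail.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Made-up data: 60 of 200 drivers were speeding; test H0: p = 0.25.
z, p_value = proportion_z_test(60, 200, 0.25)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

A small p-value says the sample would be unusual if the hypothesized value were true. With these made-up numbers the p-value is above 0.05, so 0.25 remains a reasonable value for p.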

The GM Recall Example, Part II. In the first GM Recall Example, we saw how we could use probability theory to make decisions. All it took was knowing the distribution of the population. In this series of examples, we no longer know the distribution of the population. However, we are still able to estimate some important parameters: the population mean and the population proportion.

The School Zone Example. This example starts with a problem statement: What proportion of drivers speed through the school zone? From that research question, the entire statistical process is laid bare, from collecting data, to analyzing that data, to interpreting the results.

The School Zone Example, Part II. This example starts with a problem statement: What proportion of drivers speed through the school zone? From that research question, the entire statistical process is laid bare, from specifying a research hypothesis, to collecting data, to analyzing that data, to interpreting the results.

Two-Population Procedures

This second section deals with statistical procedures comparing two population parameters. These include the two-sample t-procedure (for comparing means) and the two-sample z-procedure (for comparing two proportions). This section relies on your understanding of confidence intervals and p-values. Review the previous section to refresh your memory.

Chapter: 10

Practice found at Project Scarlet:

procedures for two independent means, with σ unknown.

procedures for two paired-sample means, with σ unknown.

procedures for two proportions.

Tall Students. This example answers the age-old question of whether males are taller than females. It uses a two-population independent samples means procedure.

Teaching Statistics. This example answers the question of whether students improved their knowledge of statistics between the start and the end of a unit. This uses a two-population dependent samples means procedure.

Seat Belts and School Zones. This example walks you through the scientific method to determine if those wearing seat belts tended to obey speed limits more than those who did not. This uses two-population proportions procedures.
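
For a sense of what a two-population proportions procedure computes, here is a minimal Python sketch of a two-sample z-test using the pooled proportion. The function name and the counts are invented for illustration and are not the data from the Seat Belts and School Zones example.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 = p2 using the pooled proportion (hypothetical helper)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Under H0 both groups share one proportion, estimated by pooling the samples.
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Made-up data: 30 of 120 belted drivers sped; 24 of 60 unbelted drivers sped.
z, p_value = two_proportion_z_test(30, 120, 24, 60)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

The sign of z shows which group had the larger sample proportion; the p-value measures how surprising a difference of that size would be if the two population proportions were equal.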

ANOVA Procedures

This section deals with exploring and testing the equality of population means for more than two populations.

Chapter: 11

Practice found at Project Scarlet:

ANOVA.

Faster Food. This example goes through the scientific method for comparing the means of three populations. Those population means are service times for three fast-food restaurants in Stillwater, OK. The procedure used is analysis of variance.

Which is the Faster Food? This example completes the previous example. Previously, we discovered that at least one mean (service time) differed. We did not determine which was different and whether it was faster or slower. To answer these questions, we must perform post hoc tests. This example shows how to do this.
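
The analysis of variance test statistic can be sketched in a few lines. The Python below computes the one-way ANOVA F statistic by hand, assuming three small made-up samples of service times in minutes; it does not use the Faster Food data, the function name is hypothetical, and it stops short of the p-value and the post hoc tests.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic and degrees of freedom for one-way ANOVA (hypothetical helper)."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    # Between-group variation: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: how far observations sit from their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up service times (minutes) for three restaurants.
restaurant_a = [2.0, 3.0, 4.0]
restaurant_b = [4.0, 5.0, 6.0]
restaurant_c = [6.0, 7.0, 8.0]
f_stat, df_between, df_within = one_way_anova_f([restaurant_a, restaurant_b, restaurant_c])
print(f"F = {f_stat:.2f} on ({df_between}, {df_within}) degrees of freedom")
```

A large F means the group means differ by more than the variation within groups would suggest. To finish the test, compare F with the F distribution for those degrees of freedom, using a table or software.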

Chi-Square Procedures

This section deals with exploring and testing relationships between two (or more) categorical variables (chi-square test of independence), and with testing whether the distribution of one categorical variable matches a hypothesized distribution (chi-square goodness-of-fit test).

Chapter: 12

Practice found at Project Scarlet:

Chi-square Goodness-of-Fit Test.

A Fair Coin? This toy example uses the Chi-Square Goodness-of-Fit test to determine if a 2016 US penny is fair when we spin it instead of flipping it.

The Egyptian Referendum. This gives an example of using the Chi-Square Goodness-of-Fit test in real life… at least in real life for those of us who analyze elections.
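
As a small sketch of the goodness-of-fit calculation, the Python below tests whether a coin is fair. The counts (62 heads in 100 spins) are invented, not the results from the A Fair Coin? example, and the function names are hypothetical. The p-value shortcut is an assumption that holds only for two categories (1 degree of freedom), where the chi-square variable is a squared standard normal.

```python
from math import sqrt
from statistics import NormalDist

def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_one_df(chi2):
    """Upper-tail p-value for a chi-square statistic with exactly 1 degree of freedom.

    With 1 df, P(X > chi2) = 2 * (1 - Phi(sqrt(chi2))).
    This shortcut is NOT valid when there are more than two categories.
    """
    return 2 * (1 - NormalDist().cdf(sqrt(chi2)))

# Made-up data: 100 spins gave 62 heads and 38 tails; a fair coin expects 50 and 50.
observed = [62, 38]
expected = [50, 50]
chi2 = chi_square_statistic(observed, expected)
p_value = p_value_one_df(chi2)
print(f"chi-square = {chi2:.2f}, p-value = {p_value:.4f}")
```

With more than two categories the statistic is computed the same way, but the p-value must come from a chi-square table or software with the appropriate degrees of freedom.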

Regression Procedures

This section deals with exploring and testing relationships between two (or more) numeric variables.

Chapters: 13 and 14

Practice found at Project Scarlet:

calculating the sample correlation.

procedures for linear regression.

The Nobel Prize in Chocolate. This example explores the question that relates chocolate consumption and Nobel Prizes. Is there a relationship? If so, what may cause it?
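
To show what the sample correlation and the least-squares line measure, here is a minimal Python sketch that computes both by hand. The five (x, y) pairs are made up for illustration and are not the chocolate or Nobel Prize data; the function name is hypothetical.

```python
from math import sqrt
from statistics import mean

def correlation_and_line(x, y):
    """Sample correlation r plus least-squares slope and intercept (hypothetical helper)."""
    x_bar, y_bar = mean(x), mean(y)
    # Sums of squares and cross-products of deviations from the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    r = s_xy / sqrt(s_xx * s_yy)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return r, slope, intercept

# Made-up numeric data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
r, slope, intercept = correlation_and_line(x, y)
print(f"r = {r:.4f}; fitted line: y = {intercept:.2f} + {slope:.2f} x")
```

The correlation describes the strength and direction of the linear relationship, and the slope estimates how much y changes, on average, for a one-unit increase in x. Neither one, by itself, tells us what causes the relationship.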