Chi-Square Examples

Examples adapted from the following work:

Pagano, R. R. (2002). Understanding statistics in the behavioral sciences (6th ed.). Belmont, CA: Wadsworth.

There are two catches to be aware of when running chi-square tests in R:

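  1. Don't assume the null probabilities are equal. By default, chisq.test tests against equal probabilities in every category, so supply the p argument whenever the null hypothesis says otherwise. See the example below.
  2. Always check that the degrees of freedom are correct for the test. When running chisq.test on a table of data, an incorrectly built table can cause chisq.test to examine the wrong relationships.
    1. For a Goodness-of-Fit Test, $df = k - 1$, where $k$ is the number of categories.
    2. For a Test of Independence, $df = (r - 1)(c - 1)$, where $r$ and $c$ are the numbers of rows and columns.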
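As a minimal sketch of the degrees-of-freedom check, the df that chisq.test actually used can be read from the `parameter` element of its result and compared with the formulas above. The counts below are made up purely for illustration and do not come from either textbook.

```r
# Goodness of fit: k = 3 categories, so df should be k - 1 = 2.
# (Illustrative counts only.)
gof_counts <- c(10, 20, 30)
gof <- chisq.test(gof_counts)
unname(gof$parameter) == length(gof_counts) - 1

# Independence: a 2 x 3 table, so df should be (2 - 1) * (3 - 1) = 2.
# (Illustrative counts only.)
tab <- matrix(c(12, 18, 25, 15, 20, 10), nrow = 2)
ind <- chisq.test(tab)
unname(ind$parameter) == (nrow(tab) - 1) * (ncol(tab) - 1)
```

If either comparison is FALSE, the data were probably not passed to chisq.test in the shape intended.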

Chi-Square Single Variable (Goodness of Fit)

Based on pages 422-423 of the above book.

A researcher believes the ethnic composition of a city has changed since data were last collected. At that time, the breakdown was:

  • 53% Norwegian
  • 32% Swedish
  • 8% Irish
  • 5% Hispanic
  • 2% Italian

New data is collected from a random sample of 750 inhabitants of the city. The results are:

Norwegian  Swedish  Irish  Hispanic  Italian
      399      193     63        82       13

Use chisq.test to test the observed frequencies (the new counts) against the percentages from the previous data (the null probabilities).

$H_0:$ The ethnic population has not changed in composition.

new_count  <- c(399, 193, 63, 82, 13)
null_probs <- c(0.53, 0.32, 0.08, 0.05, 0.02)
chisq.test(new_count, p = null_probs)

The result: $X^2 = 62.433, df = 4, p < 0.001$. Since $p < 0.05$, we reject $H_0$.
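To see where this statistic comes from, the calculation can be reproduced by hand. The sketch below assumes only base R; the names `expected`, `x_squared`, `df`, and `p_value` are chosen here for illustration.

```r
new_count  <- c(399, 193, 63, 82, 13)
null_probs <- c(0.53, 0.32, 0.08, 0.05, 0.02)

# Expected counts under H0: sample size times each null probability.
expected <- sum(new_count) * null_probs

# Pearson statistic: sum over categories of (observed - expected)^2 / expected.
x_squared <- sum((new_count - expected)^2 / expected)

# Upper-tail p-value from the chi-square distribution with df = k - 1.
df <- length(new_count) - 1
p_value <- pchisq(x_squared, df = df, lower.tail = FALSE)

round(x_squared, 3)
df
p_value < 0.001
```

The largest contribution comes from the Hispanic category, where 82 inhabitants were observed against 37.5 expected.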

Example 2

The example above shows when and how to adjust the expected probabilities in R. By default, chisq.test assumes that the probabilities are equal across all categories of observations. Consider the following scenario (Mendenhall, Beaver, & Beaver, 2006, p. 597):

                Door
                Green  Red  Blue
Observed Count     20   39    31

Without prior knowledge, the default null hypothesis is:

$H_0: p_1 = p_2 = p_3 = \frac{1}{3}$

R assumes the default $H_0$ too, and the code would be:

count <- c(20, 39, 31)
chisq.test(count)

The result: $X^2 = 6.0667, df = 2, p = 0.04815$. Since $p < 0.05$, we reject $H_0$.

We can also state the vector of probabilities explicitly, as in the previous example, even when the probabilities are equal. The following gives the same result:

chisq.test(count, p = c(1/3, 1/3, 1/3))

An alternative way to write the above:

chisq.test(count, p = rep(1/3, 3))
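A quick way to confirm that the default and explicit calls agree is to compare their results directly. The sketch below also uses the `rescale.p` argument of chisq.test, which rescales `p` to sum to 1 so that unnormalized weights can be passed; the object names are illustrative.

```r
count <- c(20, 39, 31)

default_fit  <- chisq.test(count)
explicit_fit <- chisq.test(count, p = rep(1/3, 3))

# Same statistic and p-value either way.
all.equal(default_fit$statistic, explicit_fit$statistic)
all.equal(default_fit$p.value, explicit_fit$p.value)

# With rescale.p = TRUE, equal weights of 1 are rescaled to 1/3 each.
weights_fit <- chisq.test(count, p = c(1, 1, 1), rescale.p = TRUE)
all.equal(default_fit$p.value, weights_fit$p.value)
```

Without `rescale.p = TRUE`, chisq.test stops with an error when `p` does not sum to 1.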

Test of independence between two variables

Based on page 424 of the above book. Here the table has to be built correctly in R for the analysis to work.

$H_0:$ Political affiliation and attitude toward some bill are independent.

                 Attitude
                 For  Undecided  Against  Row Marginal
Republican        68         22      110           200
Democrat          92         18       90           200
Column Marginal  160         40      200           400

First, let's build the table:

party <- as.table(rbind(c(68, 22, 110), c(92, 18, 90)))
dimnames(party) <- list(affiliation = c("Republican", "Democrat"),
                        attitude = c("For", "Undecided", "Against"))

Let's examine the table:

party
            attitude
affiliation   For Undecided Against
  Republican  68        22     110
  Democrat    92        18      90
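One way to check that the table was built correctly is to compare its row and column totals with the marginals in the source table. The sketch below also illustrates, as an assumed example of a badly built table, what happens if the marginal totals are included in the data passed to chisq.test: the degrees of freedom no longer match $(r - 1)(c - 1)$ for the real table.

```r
party <- as.table(rbind(c(68, 22, 110), c(92, 18, 90)))
dimnames(party) <- list(affiliation = c("Republican", "Democrat"),
                        attitude = c("For", "Undecided", "Against"))

# Row and column totals should match the marginals in the source table.
margin.table(party, 1)   # row totals: 200 and 200
margin.table(party, 2)   # column totals: 160, 40 and 200

# A wrongly built table that includes the marginal row and column.
with_totals <- addmargins(party)
dim(with_totals)

# The test now sees a 3 x 4 table, so df = (3 - 1) * (4 - 1) = 6 rather than 2.
chisq.test(with_totals)$parameter
```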

Running chisq.test(party) gives:

$X^2 = 6, df = 2, p = 0.04979$. Since $p < 0.05$, we reject $H_0$.
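The statistic for the test of independence can also be reproduced by hand. In this sketch the expected count for each cell is (row total × column total) / grand total; the table is rebuilt with named rows and columns so the block runs on its own, and the variable names are illustrative.

```r
party <- rbind(Republican = c(For = 68, Undecided = 22, Against = 110),
               Democrat   = c(For = 92, Undecided = 18, Against = 90))

# Expected counts under independence: outer product of the margins
# divided by the grand total.
expected <- outer(rowSums(party), colSums(party)) / sum(party)
expected

# Pearson statistic and its upper-tail p-value with df = (r - 1)(c - 1).
x_squared <- sum((party - expected)^2 / expected)
df <- (nrow(party) - 1) * (ncol(party) - 1)
p_value <- pchisq(x_squared, df = df, lower.tail = FALSE)

x_squared
df
round(p_value, 5)
```

The same expected counts are available from the `expected` element of the object returned by chisq.test.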

Example 2

From Mendenhall, Beaver, & Beaver (p. 602), we have the following contingency table:

                 Shift
Type of Defects    1    2    3  Total
1                 15   26   33     74
2                 21   31   17     69
3                 45   34   49    128
4                 13    5   20     38
Total             94   96  119    309

$H_0:$ Type of Defects and Shift are independent.

We can build the table in R:

defects <- as.table(rbind(c(15, 26, 33),
                          c(21, 31, 17),
                          c(45, 34, 49),
                          c(13, 5, 20)))
dimnames(defects) <- list(type = c("A", "B", "C", "D"),
                          shift = c("1", "2", "3"))

Running chisq.test(defects) gives:

$X^2 = 19.178, df = 6, p = 0.003873$. Since $p < 0.05$, we reject $H_0$.
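Beyond the overall result, the object returned by chisq.test can be inspected further. The sketch below, which is an addition rather than part of the textbook example, checks that the smallest expected count is at least 5 (a common rule of thumb for the chi-square approximation) and prints the standardized residuals to show which cells depart most from independence.

```r
defects <- as.table(rbind(c(15, 26, 33),
                          c(21, 31, 17),
                          c(45, 34, 49),
                          c(13, 5, 20)))
dimnames(defects) <- list(type = c("A", "B", "C", "D"),
                          shift = c("1", "2", "3"))

result <- chisq.test(defects)

# Degrees of freedom: (4 - 1) * (3 - 1) = 6.
result$parameter

# Smallest expected count; the usual rule of thumb asks for at least 5.
min(result$expected)

# Standardized residuals: large absolute values mark the cells that
# contribute most to the departure from independence.
round(result$stdres, 2)
```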

References

Mendenhall, W., Beaver, R. J., & Beaver, B. M. (2006). Introduction to probability and statistics (12th ed.). Australia: Thomson Brooks/Cole.

r/chi-square.txt · Last modified: 2017/02/22 10:51 by seanburns