Some experiments have a binary or discrete outcome, for example the number of animals with and without an attribute such as a tumour. The data below show the number of animals of genotypes AA and AB that developed tumours following treatment with a carcinogen. A smaller proportion of AA animals developed tumours (27.8% versus 47.8%), but the question is whether this difference could simply be due to chance sampling variation.
Genotype | No tumour | Tumour | Percent tumours |
AA | 13 | 5 | 27.8 |
AB | 12 | 11 | 47.8 |
Tables of this type can be analysed in several ways. Where there are just two groups, the significance of any difference can be assessed using the normal approximation to the binomial distribution. Alternatively, the independence of rows and columns can be tested using a chi-squared test. This latter method can also be used when there are several groups and several outcomes. However, neither test is accurate when the numbers in any cell are too low. In the case of the chi-squared test, the “expected” number in any cell should not be less than five. The “expected number” (under the null hypothesis of no differences) is the product of the row total and the column total for that cell, divided by the grand total. For example, the expected number of tumours in the second cell of the first row above (AA animals with tumours, where there are five observed) is ((5+11) × (13+5)) / (13+5+12+11) = 288/41 = 7.02.
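The expected-count rule and the two tests described above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation. It assumes the Pearson chi-squared statistic without a continuity correction and a pooled-variance z-test for the two proportions. The function names are made up for this example, and the p-value shortcut using `math.erfc` is only valid for a 2 x 2 table (one degree of freedom).

```python
import math

# Observed counts from the table above, as [tumour, no tumour] per genotype.
observed = [[5, 13],   # AA
            [11, 12]]  # AB


def expected_counts(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]


def chi_squared_2x2(table):
    """Pearson chi-squared statistic (no continuity correction, an assumption)
    and its p-value for 1 degree of freedom."""
    expected = expected_counts(table)
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    # With 1 df, P(chi2 > x) equals erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


def two_proportion_z(table):
    """Normal approximation to the binomial: pooled two-proportion z-test."""
    (x1, rest1), (x2, rest2) = table
    n1, n2 = x1 + rest1, x2 + rest2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    # Two-sided p-value from the standard normal distribution.
    return z, math.erfc(abs(z) / math.sqrt(2))


expected = expected_counts(observed)
chi2, p_chi2 = chi_squared_2x2(observed)
z, p_z = two_proportion_z(observed)

print("expected AA tumours:", round(expected[0][0], 2))
print("smallest expected count:", round(min(min(r) for r in expected), 2))
print("chi-squared:", round(chi2, 2), "p:", round(p_chi2, 2))
print("z:", round(z, 2), "p:", round(p_z, 2))
```

For these data every expected count is above five, so the chi-squared test is usable. The statistic is about 1.71 with a p-value of roughly 0.19, so the difference between genotypes could plausibly be due to chance. For a 2 x 2 table the square of z equals the uncorrected chi-squared statistic, so the two approaches give the same p-value.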
Where the expected numbers in some cells fall below five, an alternative is to use Fisher’s exact test, which is accurate even with small expected values in some cells (not discussed here).