The Infinite Monkey Rule
I always learn things from listening to the radio. On 'The Infinite Monkey Cage', I heard Brian Cox rattle off a 'rule of thumb', which goes something like:
A number plus or minus the square root of the sample size is consistent with random sampling error.
https://www.bbc.co.uk/programmes/b04yfsst (listen from 20 minutes in)
I hadn't heard that one, but deploying those key research tools, the back of an envelope and a pencil, I could see where it comes from.
I hope I am allowed to paraphrase Brian's rule:
If the excess (or deficit) number of cases meeting a criterion is more than the square root of the sample size, it's statistically significant.
The way I'd normally look for a statistically significant difference in a proportion is to calculate the standard error of proportion (SEP) and use twice that (strictly, 1.96 times) to find the 95% confidence interval (CI). For a percentage of 50% (proportion = 0.5) in a sample of 100, the SEP is 5%, so twice the SEP is 10% (9.8% if we use 1.96). Suppose we toss a coin a hundred times: we'd expect 50 heads and 50 tails, but we would accept 10 percentage points either way, 40%-60%. Anything outside that range suggests something unusual has happened, such as a biased coin or a biased recording method, but it may just be an unusual run of results.
| Quantity | Value |
|---|---|
| Proportion p | 50% |
| Sample size n | 100 |
| SEP | 5% |
| 2 x SEP | 10% |
| 95% CI (p ± 2 x SEP) | 40% to 60% |
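The coin-toss figures can be checked with a few lines of Python. This is only a minimal sketch: the function names `sep` and `ci95` are made up for illustration, and the interval uses the simple normal approximation described above rather than any exact method.

```python
import math

def sep(p, n):
    """Standard error of a proportion p observed in a sample of size n."""
    return math.sqrt(p * (1 - p) / n)

def ci95(p, n, z=1.96):
    """Approximate 95% confidence interval: p plus or minus z times the SEP.

    z defaults to 1.96; pass z=2.0 for the back-of-envelope version.
    """
    margin = z * sep(p, n)
    return p - margin, p + margin

# The coin-toss example: p = 0.5, n = 100
print(f"SEP = {sep(0.5, 100):.3f}")

low, high = ci95(0.5, 100, z=2.0)
print(f"2 x SEP interval: {low:.2f} to {high:.2f}")

low, high = ci95(0.5, 100)
print(f"1.96 x SEP interval: {low:.3f} to {high:.3f}")
```

Using 2 rather than 1.96 makes the interval very slightly wider, which is why the envelope version lands on the round figures of 40% and 60%.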
Anyhow, to get to the 'square root of the sample size', we can derive it from the formula:
- For a proportion p seen in a sample n, SEP = root(p(1-p)/n)

- The SEP is highest for p=0.5 (p(1-p)=0.25) and gets smaller as p moves towards 0 or 1 (for example, p(1-p)=0.09 at p=0.1).
- For a sample n, the widest 95% CI is therefore plus or minus 2*root(0.25/n) = 2*0.5/root(n) = 1/root(n).
- So, if the excess proportion is expressed as a fraction m/n, where m is the excess number of cases, it needs to be at least 1/root(n).
- We can multiply both sides by n to get m >= n/root(n) = root(n), which is the size of Brian Cox's thumb.
So, for a sample size of 100, we're looking for a difference larger than 10, the square root of 100; for a sample of 50, a difference greater than about 7 (the square root of 50 is roughly 7.07).
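As a rough illustration, the rule of thumb can be written as a one-line test. The helper names `thumb_threshold` and `exceeds_rule` are hypothetical, and the sketch assumes we already know the expected count (for example, 50 heads from 100 tosses of a fair coin).

```python
import math

def thumb_threshold(n):
    """Rule-of-thumb threshold: the square root of the sample size."""
    return math.sqrt(n)

def exceeds_rule(observed, expected, n):
    """True if the excess or deficit in cases is bigger than root(n)."""
    return abs(observed - expected) > thumb_threshold(n)

# Thresholds for a few sample sizes
for n in (50, 100, 400):
    print(n, round(thumb_threshold(n), 2))

# 100 coin tosses, 50 heads expected
print(exceeds_rule(62, 50, 100))  # excess of 12, threshold 10
print(exceeds_rule(58, 50, 100))  # excess of 8, threshold 10
```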
This rule uses the 'worst case' of a proportion at or near 50%, when the 95% CI is at its widest, so sometimes a smaller difference will be statistically significant.
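To see how much slack the worst case leaves, here is a small sketch comparing the rule of thumb with the 2 x SEP margin converted into a number of cases, for a few proportions. The function name `margin_in_cases` is my own, and it uses the same normal approximation as the derivation above.

```python
import math

def margin_in_cases(p, n, z=2.0):
    """Half-width of the approximate 95% CI, expressed as a count of cases.

    This is n times z times the SEP, which simplifies to z * root(n * p * (1 - p)).
    """
    return z * math.sqrt(n * p * (1 - p))

n = 100
print(f"rule of thumb: {math.sqrt(n):.1f} cases")
for p in (0.5, 0.3, 0.1):
    print(f"p = {p}: {margin_in_cases(p, n):.1f} cases")
```

For a sample of 100, the margin at p=0.1 comes to about 6 cases rather than 10, so a difference of 7 or 8 would fail the rule of thumb but still fall outside the approximate 95% CI.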