- We switch to Zoom next week. Look for the link on the Meetups page. I will also post it in Slack.
March 11, 2020
Two scientists want to know if a certain drug is effective against high blood pressure. The first scientist wants to give the drug to 1,000 people with high blood pressure and see how many of them experience lower blood pressure levels. The second scientist wants to give the drug to 500 people with high blood pressure, and not give the drug to another 500 people with high blood pressure, and see how many in both groups experience lower blood pressure levels. Which is the better way to test this drug?
The GSS asks the same question. Below is the distribution of responses from the 2010 survey:
Response | n |
---|---|
All 1000 get the drug | 99 |
500 get the drug 500 don’t | 571 |
Total | 670 |
Parameter of interest: Proportion of all Americans who have good intuition about experimental design.
\[p(population\; proportion)\]
Point estimate: Proportion of sampled Americans who have good intuition about experimental design.
\[\hat{p}(sample\; proportion)\]
What percent of all Americans have good intuition about experimental design (i.e., would answer “500 get the drug 500 don’t”)?
Using a confidence interval \[point\; estimate \pm ME\]
We know that ME = critical value × standard error of the point estimate. \[SE_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}\]
Sample proportions will be nearly normally distributed with mean equal to the population proportion, p, and standard error equal to \(\sqrt{\frac{p(1-p)}{n}}\).
\[\hat { p } \sim N\left( mean=p,SE=\sqrt { \frac { p(1-p) }{ n } } \right) \]
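This is true when two conditions are met: the observations are independent, and the sample contains at least 10 successes and at least 10 failures.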
Given: \(n = 670\), \(\hat{p} = 0.85\).
Conditions:
Independence: The sample is random, and 670 < 10% of all Americans, therefore we can assume that one respondent’s response is independent of another.
Success-failure: 571 people answered correctly (successes) and 99 answered incorrectly (failures); both counts are greater than 10.
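The claim that sample proportions are nearly normal with standard error \(\sqrt{p(1-p)/n}\) can be checked with a small simulation. The sketch below is only an illustration: it assumes a population proportion of 0.85 (in practice \(p\) is unknown), and the function name `simulate_sample_proportions`, the seed, and the number of repetitions are arbitrary choices rather than anything from the notes.

```python
import math
import random


def simulate_sample_proportions(p, n, reps, seed=0):
    """Draw `reps` random samples of size `n`; return each sample's proportion of successes."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]


# Assumed values for illustration only: the true p is normally unknown.
p, n = 0.85, 670
p_hats = simulate_sample_proportions(p, n, reps=2000)

# Center and spread of the simulated sampling distribution.
mean_p_hat = sum(p_hats) / len(p_hats)
sd_p_hat = math.sqrt(sum((x - mean_p_hat) ** 2 for x in p_hats) / (len(p_hats) - 1))

# Standard error predicted by the formula sqrt(p(1-p)/n).
theoretical_se = math.sqrt(p * (1 - p) / n)

print(f"mean of simulated p-hats: {mean_p_hat:.4f} (assumed p = {p})")
print(f"SD of simulated p-hats:   {sd_p_hat:.4f} (formula SE = {theoretical_se:.4f})")
```

The simulated mean should land close to the assumed \(p\), and the simulated standard deviation close to the formula's standard error.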
Given: \(n = 670\), \(\hat{p} = 0.85\).
\[0.85 \pm 1.96 \sqrt{\frac{0.85 \times 0.15}{670}} = \left(0.82,\; 0.88\right)\]
We are 95% confident that the true proportion of Americans who have good intuition about experimental design is between 82% and 88%.
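The interval above can be reproduced in a few lines. This is a minimal sketch, not part of the original notes; the helper name `proportion_ci` is hypothetical, and it uses the rounded \(\hat{p} = 0.85\) so that the result matches the hand calculation.

```python
import math


def proportion_ci(p_hat, n, z_star=1.96):
    """Confidence interval for one proportion: p_hat +/- z_star * sqrt(p_hat(1 - p_hat)/n)."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin_of_error = z_star * se
    return p_hat - margin_of_error, p_hat + margin_of_error


# Rounded sample proportion from the notes (571 / 670 is about 0.85).
lower, upper = proportion_ci(0.85, 670)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Using the unrounded proportion 571/670 instead of 0.85 shifts the bounds slightly, so small differences in the last digit are expected.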
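Suppose you want a 3% margin of error at the 95% confidence level. How many people would you have to survey?
Use \(\hat{p} = 0.5\), which maximizes \(\hat{p}(1 - \hat{p})\) and therefore gives the most conservative (largest) required sample size.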
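\[0.03 = 1.96 \times \sqrt{\frac{0.5 \times 0.5}{n}}\]
\[0.03^2 = 1.96^2 \times \frac{0.5 \times 0.5}{n}\]
\[n = \frac{1.96^2 \times 0.5 \times 0.5}{0.03^2} \approx 1067.1\]
Rounding up to the next whole person gives \(n = 1,068\).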
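The same algebra can be wrapped in a small function. This is a sketch under the notes' assumptions (95% confidence, so a critical value of 1.96, and \(\hat{p} = 0.5\)); the name `required_sample_size` is hypothetical.

```python
import math


def required_sample_size(margin_of_error, p=0.5, z_star=1.96):
    """Smallest n for which z_star * sqrt(p(1-p)/n) does not exceed the margin of error."""
    n = (z_star ** 2) * p * (1 - p) / margin_of_error ** 2
    # Round up: a fractional respondent is not possible, and rounding down
    # would give a margin of error slightly above the target.
    return math.ceil(n)


n_needed = required_sample_size(0.03)
print(n_needed)
```

Rounding up rather than to the nearest integer ensures the achieved margin of error stays at or below the target.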
Scientists predict that global warming may have big effects on the polar regions within the next 100 years. One of the possible effects is that the northern ice cap may completely melt. Would this bother you a great deal, some, a little, or not at all if it actually happened?
Response | GSS | Duke |
---|---|---|
A great deal | 454 | 69 |
Some | 124 | 40 |
A little | 52 | 4 |
Not at all | 50 | 2 |
Total | 680 | 105 |
Parameter of interest: Difference between the proportions of all Duke students and all Americans who would be bothered a great deal by the northern ice cap completely melting.
\[p_{Duke} - p_{US}\]
Point estimate: Difference between the proportions of sampled Duke students and sampled Americans who would be bothered a great deal by the northern ice cap completely melting.
\[\hat{p}_{Duke} - \hat{p}_{US}\]
Standard error of the difference between two sample proportions
\[SE_{\hat{p}_1 - \hat{p}_2} = \sqrt{ \frac{p_1\left(1 - p_1\right)}{n_1} + \frac{p_2\left(1 - p_2\right)}{n_2} }\]
Conditions:
Construct a 95% confidence interval for the difference between the proportions of Duke students and Americans who would be bothered a great deal by the melting of the northern ice cap (\(p_{Duke} - p_{US}\)).
Data | Duke | US |
---|---|---|
A great deal | 69 | 454 |
Not a great deal | 36 | 226 |
Total | 105 | 680 |
\(\hat{p}\) | 0.657 | 0.668 |
\[ \left(\hat{p}_{Duke} - \hat{p}_{US}\right) \pm z^\star \times \sqrt{ \frac{\hat{p}_{Duke}\left(1 - \hat{p}_{Duke}\right)}{n_{Duke}} + \frac{\hat{p}_{US}\left(1 - \hat{p}_{US}\right)}{n_{US}} } \]
\[(0.657 - 0.668) \pm 1.96 \times \sqrt{\frac{0.657 \times 0.343}{105} + \frac{0.668 \times 0.332}{680}} = \left(-0.108,\; 0.086\right)\]
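The two-sample interval can also be computed directly from the counts. The following is a minimal sketch; `two_proportion_ci` is a hypothetical helper name, and it works with unrounded sample proportions, so the bounds may differ from the hand calculation in the third decimal place.

```python
import math


def two_proportion_ci(x1, n1, x2, n2, z_star=1.96):
    """Confidence interval for p1 - p2 using the sample proportions in the standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se


# Counts answering "A great deal": Duke 69 of 105, US (GSS) 454 of 680.
lower, upper = two_proportion_ci(69, 105, 454, 680)
print(f"95% CI for p_Duke - p_US: ({lower:.3f}, {upper:.3f})")
```

Because the interval contains 0, these data do not show a clear difference between the two proportions.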
In 2009, Zacariah Labby (U of Chicago) repeated Weldon’s experiment using a homemade dice-throwing, pip-counting machine. http://www.youtube.com/watch?v=95EErdouO2w
The table below shows the observed and expected counts from Labby’s experiment.
Outcome | Observed | Expected |
---|---|---|
1 | 53,222 | 52,612 |
2 | 52,118 | 52,612 |
3 | 52,465 | 52,612 |
4 | 52,338 | 52,612 |
5 | 52,244 | 52,612 |
6 | 53,285 | 52,612 |
Total | 315,672 | 315,672 |
Do these data provide convincing evidence of an inconsistency between the observed and expected counts?
\(H_0\): There is no inconsistency between the observed and the expected counts. The observed counts follow the same distribution as the expected counts.
\(H_A\): There is an inconsistency between the observed and the expected counts. The observed counts do not follow the same distribution as the expected counts. There is a bias in which side comes up on the roll of a die.
The test statistics used so far have had the general form:
\[\frac{\text{point estimate} - \text{null value}}{\text{SE of point estimate}}\]
When dealing with counts and investigating how far the observed counts are from the expected counts, we use a new test statistic called the chi-square (\(\chi^2\)) statistic.
\[\chi^2 = \sum_{i = 1}^k \frac{(O_i - E_i)^2}{E_i} \qquad \text{where $k$ = total number of cells}\]
Outcome | Observed | Expected | \(\frac{(O - E)^2}{E}\) |
---|---|---|---|
1 | 53,222 | 52,612 | \(\frac{(53,222 - 52,612)^2}{52,612} = 7.07\) |
2 | 52,118 | 52,612 | \(\frac{(52,118 - 52,612)^2}{52,612} = 4.64\) |
3 | 52,465 | 52,612 | \(\frac{(52,465 - 52,612)^2}{52,612} = 0.41\) |
4 | 52,338 | 52,612 | \(\frac{(52,338 - 52,612)^2}{52,612} = 1.43\) |
5 | 52,244 | 52,612 | \(\frac{(52,244 - 52,612)^2}{52,612} = 2.57\) |
6 | 53,285 | 52,612 | \(\frac{(53,285 - 52,612)^2}{52,612} = 8.61\) |
Total | 315,672 | 315,672 | 24.73 |
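The table's arithmetic can be reproduced with a short script. This is a sketch rather than part of the original notes; it assumes a fair die under the null hypothesis, so every expected count is one sixth of the total number of rolls.

```python
# Observed counts for faces 1 through 6 from Labby's experiment.
observed = [53222, 52118, 52465, 52338, 52244, 53285]

# Under the null hypothesis of a fair die, each face is expected equally often.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

# Each cell contributes (O - E)^2 / E to the chi-square statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

for face, term in enumerate(contributions, start=1):
    print(f"face {face}: {term:.2f}")
print(f"chi-square statistic: {chi_square:.2f}")
```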
Squaring the difference between the observed and the expected outcome does two things:
To determine whether the \(\chi^2\) statistic we calculated is unusually high, we first need to describe its distribution.
When conducting a goodness of fit test to evaluate how well the observed data follow an expected distribution, the degrees of freedom are calculated as the number of cells (\(k\)) minus 1.
\[df = k - 1\]
For dice outcomes, \(k = 6\), therefore \(df = 6 - 1 = 5\)
p-value = \(P(\chi^2_{df = 5} > 24.73)\), which is less than 0.001
The p-value for a chi-square test is defined as the tail area above the calculated test statistic.
This is because the test statistic is always positive, and a higher test statistic means a stronger deviation from the null hypothesis.
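The upper-tail area can be computed numerically. In practice one would likely call a statistics library (for example, `scipy.stats.chi2.sf`). The sketch below instead uses only the standard library, so it runs anywhere. The function name `chi2_sf` is hypothetical. It relies on the recurrence \(Q(x; k+2) = Q(x; k) + \frac{(x/2)^{k/2} e^{-x/2}}{\Gamma(k/2 + 1)}\) for the chi-square upper-tail probability \(Q\), starting from the known closed forms for 1 and 2 degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail area P(chi-square with `df` degrees of freedom > x), for integer df >= 1."""
    half_x = x / 2
    if df % 2 == 1:
        tail, k = math.erfc(math.sqrt(half_x)), 1   # closed form for df = 1
    else:
        tail, k = math.exp(-half_x), 2              # closed form for df = 2
    # Step up two degrees of freedom at a time until reaching df.
    while k < df:
        tail += half_x ** (k / 2) * math.exp(-half_x) / math.gamma(k / 2 + 1)
        k += 2
    return tail


# Test statistic taken from the table total above, with df = k - 1 = 5.
p_value = chi2_sf(24.73, df=5)
print(f"p-value: {p_value:.6f}")
```

The result is well below 0.001, in line with the conclusion stated above.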