Brian Gilp wrote: Alright Ian, I almost have you over to my side.
Have you been playing with those naughty Sith again, Brian...
Brian Gilp wrote: Now consider this. Using your extreme example of coin flipping, which I agree with and which is the basis for my argument about variance impacts over small sample sizes, there is absolutely no reason why that could not happen on the first occasion as opposed to the 878th. And if it did happen on the first occasion, it should take a significant number of future occasions before the results start to resemble what would be expected for a random distribution.
A good example. There is indeed no less chance of the 10 heads occurring on the 1st round than on the 878th, true. However, the chance of it occurring in any particular round is very small indeed. If by chance this occurred on the 1st round, then yes, an analysis of the results after 5 or 10 rounds may well throw up a significant result. This cuts right to the core of statistical tests (and also touches on the crucial word 'significance'). They never prove anything 100%. What they do is give you (for example) no, weak, strong, or very strong evidence when a theory is tested. There can always be extreme results (such as getting 10 heads first up with an unbiased coin). In the coin toss example, the extremely unlikely event of 10 heads is the bit a statistician always needs to caveat.
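For anyone who wants to see the numbers, here's a rough Python sketch of that single round (using scipy, purely illustrative, not anything Steve or I actually ran):

from scipy.stats import binomtest

# Chance that a fair coin gives 10 heads in a single round of 10 tosses
print(0.5 ** 10)  # about 0.001, i.e. roughly 1 in 1000

# How surprising would 10-from-10 be if the coin really were fair?
print(binomtest(10, n=10, p=0.5).pvalue)  # tiny, so the test would cry "biased" even though this coin is fair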
e.g. by saying "There is very strong evidence to suggest the coin is biased", rather than "The coin is biased". The coin toss is a special situation where we know it's not biased. What if I said I'd spun a coin 20 times and it came down tails 16 times? Is this a rogue outlier (an unlikely event)? As it happens, it depends on the coin. Some coins are indeed biased and will average as much as 7 tails to 3 heads. It's a classic stats experiment and can be great fun to run, as people can be shocked at the bias (I blame the big heads of politicians and royalty).
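The same sort of sketch for the 16-tails-from-20-spins case (again Python/scipy, illustrative only):

from scipy.stats import binomtest

# 16 tails in 20 spins: is this consistent with a 50/50 coin?
print(binomtest(16, n=20, p=0.5).pvalue)  # roughly 0.01, i.e. strong (though not conclusive) evidence of bias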
If the choices are random and based solely on the wine, then over time (maybe quite a long period) the choices will indeed tend to level out. However, none of us knows, hence the need to collect data and test it. Steve has done the maths and found that had 1 person tasted through these 19 events and got A:9 B:2 C:2 D:2 E:2 & F:2, the test would have shown very strong evidence of bias. These tests are well established and (caveat: certain conditions must be met, one of which is not knowing the results in advance!) rigorous. It may be surprising (it usually is), but there can be huge power in what seems a small test. Throw in the actual 50 x 19 data points and it becomes likely that there would be overwhelming (but never 100%) evidence either way.
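I don't know exactly which test Steve used, but a quick simulation shows the idea (Python again, illustrative only): deal out 19 random picks among 6 wines many times over and see how often luck alone produces a wine with 9 or more of the picks.

import numpy as np

# 19 tastings, 6 wines, one favourite picked per tasting.
# Under "no real preference" each wine has a 1-in-6 chance every time.
rng = np.random.default_rng(0)
n_sims = 100_000
hits = 0
for _ in range(n_sims):
    picks = rng.integers(0, 6, size=19)              # 19 random choices among 6 wines
    if np.bincount(picks, minlength=6).max() >= 9:   # most-picked wine got 9+ of the 19
        hits += 1
print(hits / n_sims)  # a small fraction: luck alone rarely looks this lopsided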
Brian Gilp wrote: If, as I propose, the original 20 tasting events are just an extreme example, like your 10 heads in a row, it should not be expected that the results would normalize in only 10 more tastings, and the bias would continue to exist for some time. I did not intend to suggest that the results would normalize in the next 34 tastings, but that the likelihood of any wine not being chosen over many tastings is greater than one would assume, and that it would not be shocking to find a period of tastings in the future where A is not the group favorite.
To repeat something I wrote before, just to be clear: I am not attempting to state that there is no bias in the results. I am stating that, because these are not controlled experiments and the sample size is small, the impact of extreme random events cannot be eliminated as the cause.
No matter what the sample size, extreme random events are always possible. In the coin toss example, it is feasible that the 1st round would give 10 heads, followed by exactly the same in the 2nd round. Possible and incredibly unlikely, but it's always worth bearing in mind, and that's why a statistician never claims to prove a theory, only that his/her evidence and analysis suggest it's highly likely to be true.
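For scale, the arithmetic for that double event (fair coin assumed):

print(0.5 ** 20)  # two back-to-back rounds of 10 heads: about 1 in a million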
regards
Ian