Brian Gilp wrote: Alright Ian, I almost have you over to my side.
Have you been playing with those naughty Sith again, Brian...
Brian Gilp wrote: Now consider this. Using your extreme example of coin flipping, which I agree with and which is the basis for my argument about variance impacts over small sample sizes, there is absolutely no reason why that could not happen on the first occasion as opposed to the 878th. And if it did happen on the first occasion, it should take a significant number of future occasions before the results start to resemble what would be expected for a random distribution.
A good example. There is indeed no less chance of the 10 heads occurring on the 1st round than on the 878th, true. However, the chance of it occurring in any particular round is very small indeed. If by chance this occurred on the 1st round, then yes, an analysis of the results after 5 or 10 rounds may well throw up a significant result. This cuts right to the core of statistical tests (and also touches on the crucial word 'significance'). They never prove anything 100%. What they do is give you (for example) no, weak, strong, or very strong evidence when a theory is tested. There can always be extreme results (such as getting 10 heads first up with an unbiased coin). In the coin toss example, the extremely unlikely event of 10 heads is the bit a statistician always needs to caveat.
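For anyone who wants to see the numbers, here's a rough Python sketch of that single round (using scipy, purely illustrative, not anything Steve or I actually ran):

from scipy.stats import binomtest

# Chance that a fair coin gives 10 heads in a single round of 10 tosses
print(0.5 ** 10)  # about 0.001, i.e. roughly 1 in 1000

# How surprising would 10-from-10 be if the coin really were fair?
print(binomtest(10, n=10, p=0.5).pvalue)  # tiny, so the test would cry "biased" even though this coin is fair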
e.g. by saying "There is very strong evidence to suggest the coin is biased", rather than "The coin is biased". The coin toss is a special situation where we know it's not biased. What if I said I'd spun a coin 20 times and it came down tails 16 times? Is this a rogue outlier (an unlikely event)? As it happens, it depends on the coin. Some coins are indeed biased and will average as much as 7 tails to 3 heads. It's a classic stats experiment and can be great fun to run, as people can be shocked at the bias (I blame the big heads of politicians and royalty).
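The same sort of sketch for the 16-tails-from-20-spins case (again Python/scipy, illustrative only):

from scipy.stats import binomtest

# 16 tails in 20 spins: is this consistent with a 50/50 coin?
print(binomtest(16, n=20, p=0.5).pvalue)  # roughly 0.01, i.e. strong (though not conclusive) evidence of bias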
If the choices are random and based solely on the wine, then over time (maybe quite a long period) the choices will indeed tend to level out. However, none of us knows, hence the need to collect data and test it. Steve has done the maths and found that had 1 person tasted through these 19 events and got A:9 B:2 C:2 D:2 E:2 & F:2, the test would have shown very strong evidence of bias. These tests are well established and (caveat: certain conditions must be met, one of which is not knowing the results in advance!) rigorous. It may be surprising (it usually is), but there can be huge power in what seems a small test. Throw in the actual 50 x 19 data points and it becomes likely that there would be overwhelming (but never 100%) evidence either way.
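I don't know exactly which test Steve used, but a quick simulation shows the idea (Python again, illustrative only): deal out 19 random picks among 6 wines many times over and see how often luck alone produces a wine with 9 or more of the picks.

import numpy as np

# 19 tastings, 6 wines, one favourite picked per tasting.
# Under "no real preference" each wine has a 1-in-6 chance every time.
rng = np.random.default_rng(0)
n_sims = 100_000
hits = 0
for _ in range(n_sims):
    picks = rng.integers(0, 6, size=19)              # 19 random choices among 6 wines
    if np.bincount(picks, minlength=6).max() >= 9:   # most-picked wine got 9+ of the 19
        hits += 1
print(hits / n_sims)  # a small fraction: luck alone rarely looks this lopsided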
Brian Gilp wrote: If, as I propose, the original 20 tasting events are just an extreme example, like your 10 heads in a row, it should not be expected that the results would normalize in only 10 more tastings, and the bias would continue to exist for some time. I did not intend to suggest that the results would normalize in the next 34 tastings, but that the likelihood of any wine not being chosen over many tastings is greater than one would assume, and that it would not be shocking to find a period of tastings in the future where A is not the group favorite.
To repeat something I wrote before, just to be clear: I am not attempting to state that there is no bias in the results. I am stating that, because these are not controlled experiments and the sample size is small, the impact of extreme random events cannot be eliminated as the cause.
No matter what the sample size, extreme random events are always possible. In the coin toss example, it is feasible that the 1st round would give 10 heads, followed by exactly the same in the 2nd round. Possible and incredibly unlikely, but it's always worth bearing in mind, and that's why a statistician never claims to prove a theory, only that his/her evidence and analysis suggest it's highly likely to be true.
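For scale, the arithmetic for that double event (fair coin assumed):

print(0.5 ** 20)  # two back-to-back rounds of 10 heads: about 1 in a million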
regards
Ian