asterroc ([personal profile] asterroc) wrote 2009-06-01 09:02 pm

Small number statistics

I am looking for a reliable source to explain to a non-mathematician why drawing conclusions from a sample size of 3 is ridiculous.

[identity profile] rubicat.livejournal.com 2009-06-02 01:50 am (UTC)(link)
... you actually need a source to back that fact up?

[identity profile] zandperl.livejournal.com 2009-06-02 02:21 am (UTC)(link)
Considering that people are making hiring decisions based upon which boxes three individuals checked, yes.

[identity profile] gemini6ice.livejournal.com 2009-06-02 05:46 am (UTC)(link)
I would give the example:

There are 100 people in a room you cannot see into. Each is either male or female, but you don't know who is which. You have one of them write his or her gender on a slip of paper and pass it under the door. If it says "male," do you think that means that almost all the people in the room are male?

(They will likely agree it does not.)

Okay, what if you ask any two of them to write their gender on slips of paper and put them under the door and both say "male"? Is that enough to indicate that most people in the room are male?

(yes/no)

Continue until your student says yes. (I'm assuming you'll get a "yes" at 3 here.)

Reveal that the room was 50/50. There was roughly a 25% chance that any three would all have had the same gender (2 of the 8 equally likely outcomes). Show all eight M-M-F-style combinations.
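
For anyone who wants the eight combinations written out, here is a minimal Python sketch. It assumes each slip is an independent 50/50 draw, which is where the 25% figure comes from. The second calculation is an added assumption of mine, not part of the example above: it treats the room as exactly 50 men and 50 women and draws three people without replacement, which comes out slightly lower.

```python
from itertools import product
from math import comb

# All eight equally likely outcomes for three slips, assuming each slip
# is independently "M" or "F" with probability 1/2.
outcomes = ["-".join(slips) for slips in product("MF", repeat=3)]
unanimous = [o for o in outcomes if o in ("M-M-M", "F-F-F")]
p_coin_flip = len(unanimous) / len(outcomes)

# Finite-room version (assumed makeup: exactly 50 men and 50 women),
# drawing 3 of the 100 people without replacement.
p_exact_room = 2 * comb(50, 3) / comb(100, 3)

print(outcomes)
print(f"coin-flip model: {p_coin_flip:.2%}")
print(f"100-person room: {p_exact_room:.2%}")
```

Either way, about one time in four a perfectly balanced room hands you three matching slips.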

[identity profile] sirroxton.livejournal.com 2009-06-02 01:20 pm (UTC)(link)
A proper analysis requires an acceptable defect rate, how willing you are to miss a worse defect rate, and how willing you are to get false positives. I can probably dig up an online sample size calculator if you're interested.

But if you want something simple you can convey: even if the defect rate were as high as 15%(!), the chance of finding at least one defect in three checks is a meager:
1 - 0.85*0.85*0.85 ≈ 39%
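
The same arithmetic as a small Python sketch, in case it helps to try other numbers. It assumes every check is independent and hits a defect with the same probability. The function names and the 95% target are hypothetical choices for illustration, not anything standard.

```python
import math

def chance_of_catching(defect_rate: float, checks: int) -> float:
    """Probability that at least one of `checks` independent checks hits a defect."""
    return 1 - (1 - defect_rate) ** checks

def checks_needed(defect_rate: float, target: float) -> int:
    """Smallest number of checks whose catch probability reaches `target`."""
    return math.ceil(math.log(1 - target) / math.log(1 - defect_rate))

# Catch probability at a 15% defect rate for a few sample sizes.
for n in (3, 10, 20):
    print(n, round(chance_of_catching(0.15, n), 3))

# Checks needed to catch a 15% defect rate 95% of the time (assumed target).
print(checks_needed(0.15, 0.95))
```

Under those assumptions, three checks are far short of what it takes to catch even a 15% defect rate reliably.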

[identity profile] sirroxton.livejournal.com 2009-06-02 04:16 pm (UTC)(link)
Of course, if you're trying to identify newly introduced flaws in the manufacturing process itself, checking 3 boxes is perfectly reasonable.

[identity profile] hitchhiker.livejournal.com 2009-06-02 02:35 pm (UTC)(link)
nowadays, you can write a small simulation to prove your point. (i guess in the old days you could do something with cards too :))
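
A rough sketch of what such a simulation could look like in Python, borrowing the 100-person room from the comment above. The 50/50 makeup, the trial count, the seed, and the function name are all assumptions picked for illustration.

```python
import random

def estimate_unanimous(trials: int = 100_000, seed: int = 0) -> float:
    """Estimate how often 3 people drawn from a 50/50 room all match."""
    rng = random.Random(seed)          # fixed seed so reruns agree
    room = ["M"] * 50 + ["F"] * 50     # assumed room makeup
    hits = 0
    for _ in range(trials):
        slips = rng.sample(room, 3)    # three distinct people, no repeats
        if len(set(slips)) == 1:       # all three slips say the same thing
            hits += 1
    return hits / trials

estimate = estimate_unanimous()
print(f"all three slips matched in {estimate:.1%} of trials")
```

The estimate should land near one in four, so a unanimous sample of three says very little about the room.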