Counterintuitive Statistics
Uncommon Sense
Everyone relies on common sense and intuition to make decisions in life. Usually our intuition about a situation serves us well, allowing us to reach decisions quickly in complicated situations without spending a great deal of time analyzing every aspect in detail. Unfortunately, there are times when our intuition is badly mistaken. In fact, there are many situations where almost anyone's intuition would lead them to the wrong conclusion. Fortunately, we have the branch of mathematics known as statistics to help us in those tricky situations. This article explores some of those tricky situations, examines why your intuition will probably lead you to the wrong conclusion, and explains how to reach the correct conclusion.
The Monty Hall Problem
In all of statistics, no problem is more famous or counterintuitive than the Monty Hall problem. Although the statistics behind the problem are almost painfully simple, the answer is so counterintuitive that many people stubbornly insist that it can't possibly be right, even when confronted with mathematical proof. The problem is as follows:
Suppose you are a contestant on a game show in which you are confronted with three doors. Behind one door is a fabulous prize, while behind the other two doors there is nothing. Your task is to pick the door with the prize. The host allows you to pick a door. After you choose a door, but before you get to see what's behind it, the host, who knows where the prize is, will open one of the other two doors that doesn't have a prize behind it, eliminating that door from the situation. You are then given the option to either stick with your initial choice of door, or switch to the other remaining door. What are your odds of winning the prize if you choose to stay with your current door? What are your odds of winning if you choose to switch to the other door?
Most people intuitively think that the odds of winning the prize are 1/2 regardless of whether you switch; after all, there are only two doors and one of them has the prize. But in fact if you stay with your initial door you will only have a 1/3 chance of winning. By switching doors, your chances of winning the prize increase to 2/3. Thus switching doors is always the best choice.
When you choose your initial door, there is a 1/3 chance that you picked the door with the prize. That doesn't change just because the host eliminates one of the losing doors: the host knows where the prize is and can always open a losing door, so his doing so tells you nothing new about your own door. Even after one door is eliminated and you have only two doors to choose from, there is still only a 1/3 chance that you initially picked the correct door. Thus the odds that the other remaining door has the prize are 2/3.
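For readers who remain unconvinced, the game can be simulated. The following is a minimal sketch, assuming the host always opens a losing door other than the contestant's pick and chooses at random when two such doors are available. The function names play and win_rate, the trial count, and the random seed are illustrative choices, not part of the original problem.

```python
import random

def play(switch, rng):
    """Play one round; return True if the contestant wins the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the prize.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the one door that is neither the original pick nor the opened door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

def win_rate(switch, trials=100_000, seed=0):
    """Estimate the fraction of rounds won under a fixed stay/switch strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stay:   {stay_rate:.3f}")
print(f"switch: {switch_rate:.3f}")
```

With this many trials, the two printed rates should land close to 1/3 and 2/3 respectively.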
The Testing Problem
Suppose there is a rare disease that randomly affects 1 out of every 100,000 people. Although you know it isn't likely that you would have the disease, you want to be sure. Fortunately there is a test available for the disease that is accurate 99.9% of the time. In other words, there is a 1 in 1000 chance that the test will falsely tell a healthy person that they have the disease, and a 1 in 1000 chance that it will falsely tell a sick person that they do not have the disease. You decide to take the test, and unfortunately the result comes back positive - indicating that you have the disease. What are the odds that the test was correct and you have the disease? What are the odds that the test was wrong and you don't have the disease?
Most people would intuitively assume that since the test is correct 99.9% of the time, there is a 99.9% chance that they have the disease and only a 0.1% chance that the result was a false positive. Fortunately for you, there is actually about a 99% chance that the result was a false positive - even though the test is accurate 99.9% of the time.
To understand why, you must remember that there are two ways you could get a positive result on the test. The first way is for you to have the disease, which is a 1 in 100,000 chance. The other way is for the test to give a false positive result, which is a 1 in 1000 chance. Thus if the test gives a positive result, it is roughly 100 times more likely to be a false positive than a true positive - even though the test is 99.9% accurate.
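The comparison above can be made exact with Bayes' rule. The sketch below uses only the figures given in the scenario; the variable names are illustrative.

```python
from fractions import Fraction

prevalence = Fraction(1, 100_000)   # P(disease)
false_positive = Fraction(1, 1000)  # P(positive | healthy)
false_negative = Fraction(1, 1000)  # P(negative | sick)

# Two ways to get a positive result: sick and correctly flagged,
# or healthy and incorrectly flagged.
true_positive_mass = prevalence * (1 - false_negative)
false_positive_mass = (1 - prevalence) * false_positive

p_sick_given_positive = true_positive_mass / (true_positive_mass + false_positive_mass)
p_false_alarm = 1 - p_sick_given_positive

print(f"P(sick | positive)    = {float(p_sick_given_positive):.4f}")
print(f"P(healthy | positive) = {float(p_false_alarm):.4f}")
```

The chance of being sick after one positive result comes out just under 1%, which is where the "about 99% false positive" figure comes from.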
This bit of counterintuitive statistics has very important implications for real-life laboratories that test for diseases, especially when the disease that they are testing for is rare. Such labs need to repeat a test to make sure that a result isn't a false positive, even when the test being used is very accurate. If the test in the above scenario were performed twice and came back positive both times, there would still be about a 9% chance that you don't really have the disease (assuming the errors in repeated tests are independent of one another), since the odds of two false positives in a row are 1 in a million while the odds of actually having the disease are 1 in 100,000, a ratio of about 1 to 10. If the test were performed three times and came back positive each time, there would be a 99.99% chance that the positive result was accurate.
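The effect of repeating the test can be checked the same way. This is a sketch under one important assumption: the errors of repeated tests are independent, which may not hold for a real laboratory test. The function name p_sick_after is hypothetical.

```python
from fractions import Fraction

def p_sick_after(n_positives,
                 prevalence=Fraction(1, 100_000),
                 error=Fraction(1, 1000)):
    """P(sick | n positive results in a row), assuming independent test errors."""
    # A sick person tests positive each time with probability 1 - error.
    sick = prevalence * (1 - error) ** n_positives
    # A healthy person tests positive each time with probability error.
    healthy = (1 - prevalence) * error ** n_positives
    return sick / (sick + healthy)

results = {n: float(p_sick_after(n)) for n in (1, 2, 3)}
for n, p in results.items():
    print(f"{n} positive result(s): P(sick) = {p:.4f}")
```

Under that assumption, two positives leave roughly a 9% chance of being healthy, and three positives push the chance of being sick to about 99.99%.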
The Boy or Girl Problem
Suppose you run into a friend whom you have not seen in many years. While discussing what's been going on in your lives since you last met, your friend says that she has given birth to non-identical twins. You ask your friend "Is one of the twins a boy?" and she answers "Yes". You then ask "Is the other twin a girl?" What are the odds that your friend will answer yes, the other twin is a girl? What are the odds that she will answer no, and tell you that the other twin is also a boy? Most people would assume that since the odds of each child being a boy or girl are 1/2, there would be a 1/2 chance that the other child would also be a boy. In fact, there is a 2/3 chance that the other child is a girl.
Once again, to understand this counterintuitive answer one must look at what possibilities could lead to each answer, and how likely each possibility is. Let's label the two twins A and B. There are 4 equally likely possibilities:
- A is a boy and B is a boy (probability 1/4)
- A is a boy and B is a girl (probability 1/4)
- A is a girl and B is a boy (probability 1/4)
- A is a girl and B is a girl (probability 1/4)
Since the mother has already told us that one child is a boy, we know that the 4th option isn't correct. That leaves us with possibilities 1-3, each of which is equally likely. But only one of those possibilities (number 1) results in both children being boys, while two (2 and 3) result in one child being a boy and the other a girl. Thus there is a 2/3 chance that the mother will answer "Yes, the other twin is a girl."
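The counting argument above can be reproduced by listing the four cases and discarding the one the mother ruled out. This is a minimal sketch, assuming boys and girls are equally likely and the twins' sexes are independent; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All equally likely (A, B) combinations: BB, BG, GB, GG.
families = list(product("BG", repeat=2))

# "Is one of the twins a boy?" answered yes: keep families with at least one boy.
at_least_one_boy = [f for f in families if "B" in f]

# Of those, the families where the other twin is a girl.
boy_and_girl = [f for f in at_least_one_boy if "G" in f]

p_other_is_girl = Fraction(len(boy_and_girl), len(at_least_one_boy))
print(p_other_is_girl)  # 2/3
```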
Interestingly, if you change your question from "Is one of the twins a boy?" to "Was the first-born twin a boy?" the odds of the other twin being a girl drop down to 1/2. This is because asking about a specific twin (rather than both twins at the same time) allows us to eliminate two options rather than just one, and the two remaining options are equally likely.
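The same enumeration shows why asking about a specific twin changes the answer. In this sketch, the first element of each pair is taken to be the first-born twin, which is a labeling assumption made only for illustration.

```python
from fractions import Fraction
from itertools import product

# Each pair is (first-born, second-born); all four are equally likely.
families = list(product("BG", repeat=2))

# "Was the first-born twin a boy?" answered yes: two families are eliminated.
first_is_boy = [f for f in families if f[0] == "B"]

# Of the two that remain, the families where the second-born is a girl.
second_is_girl = [f for f in first_is_boy if f[1] == "G"]

p_second_is_girl = Fraction(len(second_is_girl), len(first_is_boy))
print(p_second_is_girl)  # 1/2
```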