# Bayes' Theorem - often used in disease calculations.

## What is Bayesian reasoning?

Bayesian reasoning (conditional probability) is an application of statistics which allows the experimenter to take existing results into consideration. For some reason, applying the theorem is *counter-intuitive*. In some situations it is essential, and in some situations it is not applicable. Statistical reasoning that does not bring in prior knowledge in this way is usually called *frequentist*.

Let's assume that **S** goes to the doctor to be tested for a disease. The doctor knows from history that 1 in 100 people who match **S** in age and other characteristics develop the disease. The doctor tells **S**, without examination, that there is a 1 in 100 chance that he has the disease. **S** is not happy with this examination and requests a test. The doctor knows that there is no test for this mystery disease, and tosses a coin. "Heads," he says. "You have a 1 in 100 chance of having this disease."

The patient is a little disturbed. "You already said that! What kind of test is throwing a coin? What if it came down tails?"

The doctor states that it would mean a 1 in 100 chance of having the disease.

**S** is not happy. "I want you to do a real test! Something with big machines and needles and radioactive stuff."

The doctor agrees, although he knows that there is no test for the disease; he does it only to placate the patient.

After the test, the doctor tells the patient, "You have a 1 in 100 chance of having the disease."

Of course the patient is now very annoyed, but the doctor explains that there is no test for this disease, and whether he flips a coin or uses a very expensive and painful test, the result has a 50% false-positive rate and a 50% false-negative rate.

Disgusted, the patient leaves and seeks a second and third opinion. All the doctors run various tests, but all give the same answer.

Finally **S** finds an experimental test which claims to be right 60% of the time if the disease is present, and wrong 48% of the time when the disease is not present.

He takes the test, and is told that it came out negative.

How likely is he to actually have the disease?

## Let's think this through.

The test is not a treatment. Nor does it cause the disease, so it has not changed his chances of having the disease. However, the final test does tell him something about how likely he is to be one of the "1 in 100" who have the disease.

### How do we work this out?

The statistic "1 in 100 has the disease" is prior knowledge. It is something that is assumed to be correct based on a collection of previous experiments. The outcome of the test should modify that statistic specifically for the patient in question. If a test comes out positive and is 100% accurate, then clearly this modifies the prior knowledge from "1 in 100" to "certain". But if the test comes out negative, and is incorrect 50% of the time, then it is not certain that the patient is free of the disease. In this case, the prior knowledge was that "99 in 100" do not have the disease. How does the inaccurate negative result modify the prior knowledge?

## Some more reasoning

Since tests are flawed in almost every case, the outcome of the test is not a direct probability. All it tells us is how to modify our prior knowledge. However, if a test is extraordinarily accurate, then the prior knowledge has far less influence on the answer.

In the case of the "1 in 100" chance of having a disease, a test is ineffective if it cannot discriminate between having the disease and not having it. Such a test would not modify the prior knowledge. When the test actually has some value, its result can be combined with the prior knowledge to adjust it *within the context of that particular patient*.
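One way to see why such a test changes nothing is a short sketch using Bayes' theorem, the formula applied later in this article. Suppose a test result R (a symbol introduced here only for illustration) is equally likely whether or not the disease D is present, so that P(R|D) = P(R|~D) = q. Then q cancels out:

$$
P(D \mid R) = \frac{q\,P(D)}{q\,P(D) + q\,P({\sim}D)} = \frac{P(D)}{P(D) + P({\sim}D)} = P(D)
$$

The coin toss is the case q = 0.5: whatever the coin shows, the patient is left with the same 1 in 100.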

## How good is this test?

Let's apply Bayes' Theorem.

If we have two events, where having the disease is D and a positive result from a test for it is T, then we can represent not having the disease as (~D). The symbol ~ is often used to negate the truth of a statement. Given a binary event B, like 0 and 1 or True and False, B=(~~B). It's like saying, "I ain't not goin' down the shops".

(~T) means a negative result of the test. We use P as shorthand for 'probability' (and sometimes P_{r} when it is clearer or a matter of style). The vertical bar | means 'given'.

P(D|~T) therefore reads "The probability of having the disease given a negative test result." This is what we would like to find.

Applying Bayes' theorem:

$$
P(D \mid {\sim}T) = \frac{P({\sim}T \mid D)\,P(D)}{P({\sim}T \mid D)\,P(D) + P({\sim}T \mid {\sim}D)\,P({\sim}D)}
$$

From the description of the test:

P(~T | D) = 1 - 0.60 = 0.40 (the test is negative although **S** has the disease; in other words, the test is wrong when the disease is present).

P(~T | ~D) = 1 - 0.48 = 0.52 (the test is negative and **S** is healthy; the test is wrong, that is positive, 48% of the time when the disease is absent).

P(D) = 0.01 (all prior knowledge says that 1 in 100 people with **S**'s profile have the disease).

We can deduce the following:

$$
P({\sim}D) = 1 - 0.01 = 0.99
$$

$$
P(D \mid {\sim}T) = \frac{(0.40)(0.01)}{(0.40)(0.01) + (0.52)(0.99)} = \frac{0.004}{0.5188} \approx 0.008 = 0.8\%
$$
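The same figure can be cross-checked by counting people instead of multiplying probabilities. The sketch below assumes a hypothetical cohort of 10,000 people matching **S**'s profile (the cohort size and all variable names are illustrative, not from the original) and reads the rates directly from the test's description: right 60% of the time when the disease is present, wrong 48% of the time when it is not.

```python
# Natural-frequency check of P(D | ~T) for the experimental test.
# The cohort size is an arbitrary assumption chosen to give whole numbers.
population = 10_000

diseased = population * 1 // 100      # prior: 1 in 100 have the disease
healthy = population - diseased

# Right 60% of the time when the disease is present,
# so 40% of diseased people get a (wrong) negative result.
diseased_negative = diseased * 40 // 100

# Wrong 48% of the time when the disease is absent,
# so 52% of healthy people get a (correct) negative result.
healthy_negative = healthy * 52 // 100

all_negative = diseased_negative + healthy_negative
p = diseased_negative / all_negative

print(f"{diseased_negative} of {all_negative} negative results "
      f"belong to diseased people: {p:.1%}")
```

Counting this way makes it easier to see where the small answer comes from: the few diseased people who test negative are swamped by the many healthy people who also test negative.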

The doctor tells his patient that he has *about* a 1 in 100 chance of having the disease. **S** is disgusted. "You doctors are all the same!"

Of course, the test is not very conclusive because the disease is rare, and the test is unreliable. Performing the test does not modify the prior knowledge much. Things are different when a test is found that has better accuracy.
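To compare how much different tests move the prior, here is a minimal sketch. The function name `posterior` and its parameters are assumptions made for this illustration; it is simply Bayes' theorem for a yes/no disease state, applied to the coin toss and to a negative result from the experimental test as described in the story.

```python
def posterior(prior, p_result_given_d, p_result_given_not_d):
    """Return P(D | result) using Bayes' theorem.

    prior                 -- P(D), the chance of disease before testing
    p_result_given_d      -- chance of seeing this result if diseased
    p_result_given_not_d  -- chance of seeing this result if healthy
    """
    numerator = p_result_given_d * prior
    evidence = numerator + p_result_given_not_d * (1.0 - prior)
    return numerator / evidence


PRIOR = 0.01  # 1 in 100, from the doctor's records

# Coin toss: heads is equally likely with or without the disease.
coin_flip = posterior(PRIOR, 0.5, 0.5)

# Experimental test, negative result:
# P(~T | D) = 1 - 0.60 and P(~T | ~D) = 1 - 0.48.
experimental_negative = posterior(PRIOR, 1 - 0.60, 1 - 0.48)

print(f"after coin toss:             {coin_flip:.2%}")
print(f"after negative experimental: {experimental_negative:.2%}")
```

The coin toss leaves the 1 in 100 exactly where it was, and the experimental test only nudges it slightly downward.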

## S finds a better test.

As time goes on, **S** finds a better test. This test is 93% accurate when the disease is present. It is 85% accurate when the disease is not present.

How likely is **S** to actually have the disease after he takes this test and it comes out positive?

What should he think if it comes out negative?

You can, if you wish, leave your answer and reasoning in the comments section.

## Links

- Bayes' Rule: a simple introduction to Bayes' rule from an article in the Economist (9/30/00).
- Math Forum - Ask Dr. Math: I am going to be taking a statistics course this semester and noticed a chapter called "Bayesian Statistics." What is the difference between this and "regular" statistics?

## Comments

[NOTE 5/8/12: THE 0.48 FIGURE IN THE DENOMINATOR IS PROBABLY WRONG. 0.48 IS THE PERCENT OF POSITIVE TESTS IN PEOPLE WITHOUT THE DISEASE (I.E., IN THAT GROUP, THE TEST IS WRONG 48% OF THE TIME); WHAT THE RIGHT SIDE OF THE DENOMINATOR CALLS FOR IS THE PERCENTAGE OF NEGATIVE TESTS IN THOSE WITHOUT THE DISEASE, WHICH WOULD BE 0.52. IN OTHER WORDS, THE DENOMINATOR SHOULD BE THE SUM OF THOSE WITH DISEASE WHO HAVE NEGATIVE TESTS PLUS THOSE WITHOUT DISEASE WHO HAVE NEGATIVE TESTS. BUT FROM WHAT IS PRESENTED ABOVE, 0.48 REPRESENTS THE PERCENTAGE OF THOSE WITHOUT DISEASE WHO HAVE POSITIVE TESTS. BECAUSE 0.48 AND 0.52 ARE CLOSE, IT WON'T CHANGE THE RESULT MUCH, BUT IT'S STILL INCORRECT. IT'S PROBABLY JUST A CLERICAL ERROR BUT PERHAPS THE AUTHOR CAN FIX IT.]

Bookmarked! Thanks!