Fancy your chances?

Nyarlathotep's picture

Actually, the answer is that there is about a 16% chance Joe has the disease.

Jeff Vella Leone's picture

Explain that to me: how could the test be so wrong if it is 95% accurate?

Jeff Vella Leone's picture

It cannot matter how many people are sick in the population, if you know the result, that he is sick, there are no probabilities involved except the probability of the accuracy of the test itself.

Jeff Vella Leone's picture

I don't know what you are smoking,

but if my doctor tells me that I am cancer free and the test is 95% accurate, I would fire the doctor if he said that, just because he picked me from 1% of the population, the chance of me being cancer free went down to 16%.

lol I mean seriously, this is not even funny anymore.

Travis Hedglin's picture

Last time my doctor took five tubes of blood. When I asked why, he said:

"We run a bunch of different tests, some multiple times, just to be absolutely certain."

Before that I never knew that they ran the tests more than once, but I suppose we can all see why now, lol.

Nyarlathotep's picture

Oh yeah, that makes sense. Especially if the tests are slightly different. Although you might find this disturbing: a survey of sorts was conducted on doctors where they were given a problem very similar to the one I posted and well, uhh... I guess you can read it for yourself. http://isites.harvard.edu/fs/docs/icb.topic988008.files/agoritsas2010.pdf

Jeff Vella Leone's picture

Your doctor explained your ignorance of something, which is a completely different thing than contradicting logic itself.

Nyarlathotep's picture

The problem is we have 2 competing ideas.

Idea 1 : It is likely Joe has the disease because the test is 95% accurate.
Idea 2 : It is likely Joe does not have the disease because the disease is rare.

Because these two ideas conflict it is going to be very difficult to estimate a reasonably good solution with just human intuition. Human intuition is notoriously bad for estimating probabilities to begin with. We must instead do the actual calculations.

F will represent having floobieitis.
T will represent a positive test result for floobieitis.

Probability of a positive test result, given you have floobieitis is P{T|F} = 0.95 ; this was given in the problem.
Probability of having floobieitis given you got a positive test result is P{F|T}; this is the answer to the problem (which we don't have yet).

The relationship between P{T|F} and P{F|T} is given by Bayes' theorem.
P{F|T} = (P{T|F}P{F})/(P{T|F}P{F}+P{T|F'}P{F'})

We know that P{T|F} = 0.95, given in problem

We know that P{F} = 0.01, given in problem

We know that the probability of a random person not having floobieitis is just 1 minus the probability of a random person having floobieitis, or: P{F'} = 1 – P{F} = 1 – 0.01 = 0.99

Taking "95% accurate" to mean the test is also right 95% of the time for people who do not have floobieitis, the probability of getting a positive test result without having floobieitis is 1 minus the probability of getting a negative test result without having floobieitis, or: P{T|F'} = 1 – P{T'|F'} = 1 – 0.95 = 0.05

Substitute those values into the above equation and we will know the right hand side, so we know the left hand side, or: P{F|T} = (0.95 × 0.01)/(0.95 × 0.01 + 0.05 × 0.99) = 0.0095/0.059 ≈ 0.1610 or about 16%.
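For anyone who wants to check the arithmetic, here is a minimal Python sketch of that substitution. The function name `posterior` and its parameter names are made up for illustration, and it assumes "95% accurate" means a 95% true positive rate together with a 5% false positive rate.

```python
def posterior(prior, p_pos_given_disease, p_pos_given_healthy):
    """Bayes' theorem: P{F|T} computed from P{F}, P{T|F} and P{T|F'}."""
    true_pos = p_pos_given_disease * prior            # P{T|F} P{F}
    false_pos = p_pos_given_healthy * (1.0 - prior)   # P{T|F'} P{F'}
    return true_pos / (true_pos + false_pos)


# Values from the problem: 1% prevalence, 95% accuracy (assumed symmetric).
p_f_given_t = posterior(prior=0.01, p_pos_given_disease=0.95, p_pos_given_healthy=0.05)
print(round(p_f_given_t, 4))  # 0.161
```

Raising `prior` to 0.5 in the same call gives 0.95, which is the only situation in which the test's accuracy and the answer coincide.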

The lesson of this example is that you shouldn't trust human intuition when it comes to probability, it will screw you over time and time again.

Travis Hedglin's picture

Nyarlathotep:

"The lesson of this example is that you shouldn't trust human intuition when it comes to probability, it will screw you over time and time again."

Hell, that is true of almost all advanced mathematics, she is cruel and nasty to our intuitions.

Jeff Vella Leone's picture

lol

you mixed up the question, that is the problem

Not using your head and relying on formulas alone is why you were wrong and why you described the question wrongly in the first place.

Idea 1 : It is likely Joe has the disease because the test is 95% accurate.
Correct
Idea 2 : It is likely Joe does not have the disease because the disease is rare.
Totally wrong, this is an argument of ignorance, those 2 things are unrelated. You cannot use the formula and throw in what you like.
You must use logic and trust human intuition to make the relation first, then use the formula.

Let me explain to you what kind of absurdity you just committed.
You are claiming that just because something is rare in the population (1%), the medical test is less reliable.
You cannot seriously believe this?
This is like saying that a woman is less likely to be pregnant because the other 99 people in the room are not, even if she has a huge belly and her medical report says she is pregnant at 95% accuracy.

Probability works only with choices/statistics not with detailed information like a medical record.

"Probability of having floobieitis given you got a positive test result is P{F|T}; this is the answer to the problem (which we don't have yet)."
yet you claimed that you had it; "administers his test and GETS a positive result."

This is not what you described; you clearly said that the result came back positive already (this made the population percentage irrelevant).
The question should have been: what is the probability of finding someone who the test will show has a disease? Otherwise it won't make sense.
(just like how my logic predicted :) )

However logic is needed to define what goes in the formula, here is where I realized that this formula does not apply for this situation.
Why?
1% of the population has the disease, that is a probability on its own. This is the answer to how many have the disease.

95% accuracy of the test is another probability: that the test will catch someone who has the disease.
This means that regardless of who is tested, 95 out of 100 people who are positive for the disease will be detected
And 5 out of 100 that do not have the disease will be detected.

So if there are just 100 people and only 1 has the disease, your formula predicts that there is a 16% chance that that one will be picked and shown positive, or that 16 people will show positive.
In either case it won't make sense at all, since then it won't be 95% accurate, would it?
It never made sense because you are using the formula for the wrong situation.

You are applying a formula which does not make sense for the current problem.

Nyarlathotep's picture

I wrote a simulator for you Jeff! Here are the results from the first 3 runs:

Number of people in population is 1000000
Number of people in population who actually have disease is 9917
Percentage of people in population who actually have disease is 0.9917%
Number of correct positives is 9426
Number of false positives is 49706
Ratio of correct positives to total positives is 0.159406074545
Percentage chance a positive result is correct 15.9406074545%
------------------------------
Number of people in population is 1000000
Number of people in population who actually have disease is 10055
Percentage of people in population who actually have disease is 1.0055%
Number of correct positives is 9544
Number of false positives is 49885
Ratio of correct positives to total positives is 0.160594995709
Percentage chance a positive result is correct 16.0594995709%
------------------------------
Number of people in population is 1000000
Number of people in population who actually have disease is 9929
Percentage of people in population who actually have disease is 0.9929%
Number of correct positives is 9434
Number of false positives is 49739
Ratio of correct positives to total positives is 0.15943082149
Percentage chance a positive result is correct 15.943082149%

You can examine the code and run it yourself, without any hard work, on codepad: http://codepad.org/pRH25OxC
Just make sure there is a check mark in "run code" and click submit on the form and the website will run the code and return the results to you (it takes a few seconds so be patient after you click submit, it has to iterate over a million item list a few times).
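For readers who cannot reach the codepad link, here is a rough Python sketch of what a simulator along these lines might look like. It is not the code posted at that link: the function name `simulate`, its parameters, and the seed are all assumptions made for illustration, and "95% accurate" is taken to mean the test reports the truth 95% of the time for sick and healthy people alike.

```python
import random


def simulate(population=1_000_000, prevalence=0.01, accuracy=0.95, seed=None):
    """Return (true_positives, false_positives) for one simulated population."""
    rng = random.Random(seed)
    true_pos = false_pos = 0
    for _ in range(population):
        has_disease = rng.random() < prevalence
        test_correct = rng.random() < accuracy
        # A correct test reports the truth; an incorrect one reports the opposite.
        tests_positive = has_disease if test_correct else not has_disease
        if tests_positive:
            if has_disease:
                true_pos += 1
            else:
                false_pos += 1
    return true_pos, false_pos


tp, fp = simulate(seed=1)
print("Number of correct positives is", tp)
print("Number of false positives is", fp)
print("Ratio of correct positives to total positives is", tp / (tp + fp))
```

The exact counts change from run to run, but with a million people the ratio should stay close to the 16% given by the calculation.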

Jeff Vella Leone's picture

As I said, you cannot relate the 2 things like that since they make no sense.

probability of eating crisp today 30%
probability of investment profit per month 5%

Work out what is the probability that when I eat a crisp I get a profit?

If you insert the numbers in the formula it will give a result, but it won't make any sense since they are not related.

When making a question one must make sure that the 2 inputs are related.

Nyarlathotep's picture

Jeff - “You cannot use the formula and throw in what you like.”

If you turn to page 16 of NASA's Probability and Statistics in Aerospace Engineering http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/19980045313.pdf You will find a very similar problem (with just slightly different numbers), solved using Bayes' theorem!

If you check out http://vassarstats.net/bayes.html#comp it gives an example of how to use Bayes' theorem, and guess what, the example it gives is the problem I gave in this thread (with slightly different numbers)!

You also haven't addressed how the simulator managed to come to the same conclusion as the theoretical approach. If Bayes' theorem is not applicable (or just plain wrong), how did it manage to do this? Are you suggesting I made a different error in each calculation and they just happen to converge on the same value?

Jeff Vella Leone's picture

Nyar stop changing my claim to try to support your badly designed question.

You said that the result came in as a positive already, which means that it excludes the population out of the situation.
It only leaves the probability of the test result alone, which was 95% accuracy.

that was a trick question that has nothing to do with probability.

http://vassarstats.net/bayes.html#comp
Here it is dealing with:
the probability of a positive test result [B], irrespective of whether the disease is present [A] or not present [~A]
or/and
the probability of a negative test result [~B], irrespective of whether the disease is present [A] or not present [~A]

which is not the question you asked.

+ I never said the formula was wrong; I said it was not related to your question.

Nyarlathotep's picture

Jeff, you stopped 1/2 way down the page... The very next box calculates P(A|B) which is the answer to the question. It says

P(A|B) = [P(B|A) x P(A)] / P(B)

Using the table you quoted:

P(B) = [P(B|A) x P(A)] + [P(B|~A) x P(~A)]

Substituting that you get:
P(A|B) = [P(B|A) x P(A)] / ([P(B|A) x P(A)] + [P(B|~A) x P(~A)])

Boy that sure looks familiar, I wonder where we have seen that before... oh wait... I know:
Nyar- "The relationship between P{T|F} and P{F|T} is given by Bayes' theorem.
P{F|T} = (P{T|F}P{F})/(P{T|F}P{F}+P{T|F'}P{F'})"

If you replace A with F, and B with T, they are identical.

I'd also like to know why the simulation agreed so well with my calculation if the calculation is wrong...

Jeff Vella Leone's picture

the question is different

you can keep ignoring this fact but you won't convince anybody that understands what we are talking about.

Nyarlathotep's picture

the question I asked "Jack picked Joe out of the population and administered the test to Joe. The result was positive. What is the probability that Joe has the disease"

From the website in question:
P(A|B) = the probability that the disease is present [A] if the test result is positive [B]

It is the same question Jeff.

Jeff Vella Leone's picture

"The result was positive."
means the result is chosen already, (you also confirmed this since i was in doubt and asked for confirmation)
There is no relevance to the 1% at all

"From the website in question:
P(A|B) = the probability that the disease is present [A] if the test result is positive [B]"
The probability is derived from a probability number and it is not dictated like your question where it has been chosen already.

basically:
here it is saying that out of the chance of getting the result as positive what is the chance that it actually is?
Answer given by formula

Your question is; when you have a positive result what is the chance that it is actually positive?
answer is the test result accuracy. 95% in this case.

Nyarlathotep's picture

Jeff - "There is no relevance to the 1% at all"

That is known as the prosecutor's fallacy: believing that P(F|T) = P(T|F). It is sometimes referred to as the false positive paradox:

http://en.wikipedia.org/wiki/False_positive_paradox - "The false positive paradox is a statistical result where false positive tests are more probable than true positive tests, occurring when the overall population has a low incidence of a condition and the incidence rate is lower than the false positive rate. The probability of a positive test result is determined not only by the accuracy of the test but BY THE CHARACTERISTICS OF THE SAMPLED POPULATION." (emphasis mine)

Travis Hedglin's picture

http://en.wikipedia.org/wiki/False_positive_paradox

I am reposting the link, it somehow got messed up for me.

Jeff Vella Leone's picture

"The probability of a positive test result"
You did not provide a probability but you claimed that the result was a positive from the question.
If you gave a probability that it will be positive then yes the formula would apply.

Nyarlathotep's picture

Jeff - "You did not provide a probability but you claimed that the result was a positive from the question. If you gave a probability that it will be positive then yes the formula would apply."

You are right in that I did ask exactly that. I asked what is the probability Joe has the disease GIVEN he was tested for the disease and the test was positive.

I calculated P{F|T} and claimed that was the answer. P{F|T} means: the probability F is present GIVEN the test was positive. That is the exact question I asked.

Also I wrote a simulation that shows that is the correct answer as well. Consider the first run:
The test generated 9,426 positive results that were correct (9,426 people tested positive for the disease, who actually had the disease).
The test generated 49,706 false positive results (49,706 people tested positive for the disease, who actually DIDN'T have the disease).
What is the probability that any given person who tested positive for the disease actually has the disease? Clearly that is just (number of correct positives)/(total number of positives) = 9,426/(9,426+49,706) ≈ 0.159406075 ≈ 16%. You have yet to address the simulation...
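The same division can be repeated for all three runs posted earlier. A small Python sketch, with the counts copied from those results (the variable names are just for illustration):

```python
# (correct positives, false positives) copied from the three simulator runs above
runs = [(9426, 49706), (9544, 49885), (9434, 49739)]

# Share of positive results that belong to people who really have the disease.
ratios = [tp / (tp + fp) for tp, fp in runs]
for ratio in ratios:
    print(f"{ratio:.4f}")
```

Every run lands within a fraction of a percentage point of the 16.1% that Bayes' theorem gives.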

I chose this problem for two reasons:
1) It is a very famous problem that I've encountered in at least 3 math/statistics classes, so I already knew the answer.
2) The violence this problem inflicts on our intuition/common sense.

Jeff Vella Leone's picture

You have not learned how to do the questions but only how to answer one with a formula.
I answered it correctly without the formula because logic is needed to analyse the question first.

You neglected the first probability of choosing a positive but jumped straight to what happens after the positive test.
"GIVEN he was tested for the disease and the test was positive."

This means that there is no choosing from the population, no chance of getting someone that tests negative.

"I calculated P{F|T} and claimed that was the answer. P{F|T} means: the probability F is present"
This means you are claiming that the new probability (P{F|T}) will not include the chances of getting a different answer (or P(F)) like a negative.
See where the problem lies?
You have excluded P(F) (or P(A) and P(~A)) from the equation in this manner without knowing it.
Go and ask your teacher about this and you will see that I am right.

And to show you how correct I was in the very first answer I gave which BTW was not what the question was asking but I guessed correctly what you could have asked.

"118/2000 = 59/1000

Which means that there is 59/1000 chance of the test showing that Joe has a disease, not that Joe actually has a disease."

59/1000 = 0.059= 5.9%
which is just the answer of P(B) if you put it in the formula.

What you wanted but described it incorrectly in the question is:
P(A|B)=the "PROBABILITY" that the disease is present [A] "IF" the test result is positive [B] (i.e., the probability that a positive test result will be a true positive)

"PROBABILITY" & "IF" the test result is positive.
If you claim that the test result IS positive not that it might be positive from a "PROBABILITY" then you have eliminated P(A) out of the equation.

http://i59.tinypic.com/2s1ohzq.jpg

I derived that conclusion from some assumptions which you failed to mention in the question, such as what is the:

P(B|A) = the probability that the test will yield a positive result [B] if the disease is present [A]
P(~B|A) = the probability that the test will yield a negative result [~B] if the disease is present [A]
P(B|~A) = the probability that the test will yield a positive result [B] if the disease is not present [~A]
P(~B|~A) = the probability that the test will yield a negative result [~B] if the disease is not present [~A]

I assumed:
P(B|A) = 0.95 the probability that the test will yield a positive result [B] if the disease is present [A]
P(~B|A) = 0.05 the probability that the test will yield a negative result [~B] if the disease is present [A]
P(B|~A) = 0.05 the probability that the test will yield a positive result [B] if the disease is not present [~A]
P(~B|~A) = 0.95 the probability that the test will yield a negative result [~B] if the disease is not present [~A]

These should have been clear in the question because 95% accurate could mean only for P(B|A) and P(B|~A)

Then I assumed you are asking for what is the P(B) given the scarce information you gave and gave you the answer for that.

If you wanted P(A|B) you must remove the condition of knowing that "he was tested for the disease and the test was positive."

Like it is now(with that condition), only the probability of P(B|A) and P(~B|~A) will count and the answer is 95%

Travis Hedglin's picture

Once again, into the breach.

Given exactly a set of 101 people, with an accuracy rating of 95%, and around 1% having the affliction there is likely to be:

5 false positives
1 true positive

Get it yet? The number of false positives far exceeds the number of true positives, making the probability of the positive being true less than the probability of it being false. By the way, did you know common drug testing only has around a 33% chance of a true positive?
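The head count above can be done exactly on a bigger group so that nothing needs rounding. A minimal Python sketch, assuming a hypothetical group of 10,000 people (a size picked only so the counts come out whole) and a test that is 95% right for both sick and healthy people:

```python
from fractions import Fraction

population = 10_000                 # illustrative size, not from the thread
prevalence = Fraction(1, 100)       # 1% have the affliction
accuracy = Fraction(95, 100)        # test is right 95% of the time

sick = population * prevalence                  # 100 people
healthy = population - sick                     # 9,900 people
true_positives = sick * accuracy                # 95 people
false_positives = healthy * (1 - accuracy)      # 495 people

# Share of all positive results that are true positives.
share_true = true_positives / (true_positives + false_positives)
print(true_positives, false_positives, float(share_true))
```

That is 95 true positives against 495 false ones, so roughly 5 false positives for every true one, and 95/590 is the same 16.1% as before.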

CyberLN's picture

:-) Depends on the assay being used. Some are far more reliable than others.

Nyarlathotep's picture

In both the screen shots you provided of you using the conditional probability calculator, you can see the answer 16.1% on it for P(A|B) which was the exact question I asked. So far we have:

my calculation gave about 16%

your use of the calculator gave 16%

the simulator gave 16%

the Harvard statistics professor I took the question from (Joe Blitzstein) gave the answer as 16%

Penn State's STA 4321 website gives the answer as 16%:
http://www.personal.psu.edu/asb17/old/sta4321/files/sta4321-7.pdf

Penn State's STAT 414 website gives the answer as 16%:
https://onlinecourses.science.psu.edu/stat414/book/export/html/12

This mathematician says it's 16%:
http://myreckonings.com/wordpress/2012/03/11/bayes-theorem-medical-diagn...

Conservapedia (eww) gives an excellent breakdown of the problem, again 16%:
http://www.conservapedia.com/Bayes'_theorem

Here is an app for Android that you can use to solve Bayes' theorem problems; the example problem is our problem! And you guessed it: 16%
https://play.google.com/store/apps/details?id=com.thenewboston.katherine...

Textbook of Diagnostic Microbiology says it is 16%
https://books.google.com/books?id=rdDsAwAAQBAJ&lpg=PA107&ots=k4gXKh0tje&...'%20theorem%20disease%2095%25%201%25%2016%25&pg=PA107#v=onepage&q=bayes'%20theorem%20disease%2095%25%201%25%2016%25&f=false

math website says answer is 16%
http://plus.maths.org/content/logic-drug-testing

Journal of Clinical Research Best Practices says it is 16%
https://firstclinical.com/journal/2011/1107_Biostatistics32.pdf

Pretty much everyone gets this problem wrong the first time Jeff. Rational people quickly realize why they were wrong when it is pointed out to them, and switch their answer to 16%. Clearly you are not one of those people. You are a crackpot, there is no other way to put it. I've suspected it for a long time; perhaps I was just in denial...

Jeff Vella Leone's picture

Well I tried to be reasonable and show you that your question was different than the questions of those you linked, but sometimes people need to learn from their own mistakes.

Know that you were wrong and anybody that has some basic common sense will see it.

End of discussion

Travis Hedglin's picture

Nyarlathotep(Name error corrected):

Given: 1% of the population has floobieitis. Jack has created a test for floobieitis that is 95% accurate. Jack picks Joe randomly out of the population and administers his test and gets a positive result. What is the probability Joe has floobieitis?

1% of population has X(floobieitis)
A test for X is 95% accurate
Joe is randomly tested
The result is positive
Probability question

http://www.personal.psu.edu/asb17/old/sta4321/files/sta4321-7.pdf
Example (Exercise 3.44):

A diagnostic test of a certain disease has 95% sensitivity and 95% specificity.
Only 1% of the population has the disease in question. If the diagnostic test
reports that a person chosen at random from the population tests positive, what
is the conditional probability that the person does, in fact, have the disease?

A test for X is 95% accurate
1% of the population has X
An unidentified individual is randomly tested
They test positive
Probability question

https://firstclinical.com/journal/2011/1107_Biostatistics32.pdf
Third Paragraph

Suppose a disease occurs in the general population with a probability of 0.01 (or 1%). The prior distribution is thus 1% sick and 99% healthy. In a routine blood panel, a patient tests positive for the disease. What is the probability the patient actually has the disease? The answer depends on the accuracy and sensitivity (true positive) of the test and also on the background (prior) probability of the disease. (See Table 1.) Let’s assume the sensitivity (true positive) of the test is 0.95 (or 95%), so the probability of a false negative is therefore 1.00-0.95 = 0.05 (or 5%). Let’s also assume the probability of a false positive test is 0.05 (or 5%), so the probability of a true negative is 1.00-0.05 = 0.95 (or 95%).

1% of the population has X
Random patient is tested
Test is positive
Probability question
Test is 95% accurate

I hate it when people try to make you say things you didn't say by only taking part of an argument and ignoring the qualifiers. Trying to say they didn't say something they obviously did is worse, you're right, you are done.

Travis Hedglin's picture

Nyarlathotep, if the test was run twice and the result was positive both times, how would we calculate the probability then?

Nyarlathotep's picture

Travis - "Nyarlathotep, if the test was run twice and the result was positive both times, how would we calculate the probability then?"

Interesting question. Allow me to change your question slightly:

It is known that 95% of people with floobieitis have the chemical fakeitium in their blood, and 5% of people without floobieitis have fakeitium in their blood. Jack's test is a 100% reliable test for fakeitium. Which makes Jack's test 95% accurate for floobieitis.

It is also known that 95% of people with floobieitis have doesnotexistium in their blood (and so on like above). Sue has a test for doesnotexistium, that leads to her test being 95% accurate for floobieitis.

We will also assume that false positives for doesnotexistium and fakeitium are independent (that is, that people who have 1 of those chemicals in their blood, who do not have the disease; are no more likely to have the other chemical than anyone else who does not have the disease). This is a subtle point, but very important.

Now we can calculate the probability of having the disease, given someone tested positive with both Jack's and Sue's tests.

Start with the same logic, I switch notation for the complement to make it easier to read:
P{F|T} = (P{T|F}P{F})/(P{T|F}P{F}+P{T|~F}P{~F})

Now we will replace T with JS:
P{F|JS} = (P{JS|F}P{F})/(P{JS|F}P{F}+P{JS|~F}P{~F})

JS represents the situation where Jack and Sue's test came back positive

P{JS|F} = probability both tests return positive, given you have the disease; BECAUSE WE ASSUMED THE TESTS ARE INDEPENDENT: 0.95 * 0.95 = 0.9025

P{F} = probability someone in the population has the disease, has not changed = 0.01

P{~F} = probability someone in the population does not have the disease, has not changed = 0.99

P{JS|~F} = probability both tests return positive, given you DON'T have the disease; BECAUSE WE ASSUMED THE TESTS ARE INDEPENDENT: 0.05 * 0.05 = 0.0025

Plug-n-chug those and you get P{F|JS} ≈ 0.785 ≈ 78.5%
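The plug-n-chug step can be written out as a short Python sketch. The helper name `posterior` is made up for illustration, and the code leans on the same independence assumption stated above. It also shows that feeding the answer after Jack's test back in as the prior for Sue's test gives the same number.

```python
def posterior(prior, p_pos_given_disease, p_pos_given_healthy):
    """Bayes' theorem: probability of disease given a positive result."""
    true_pos = p_pos_given_disease * prior
    false_pos = p_pos_given_healthy * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


# Both tests at once: P{JS|F} = 0.95 * 0.95 and P{JS|~F} = 0.05 * 0.05.
both = posterior(0.01, 0.95 * 0.95, 0.05 * 0.05)

# One test at a time: the result after Jack's test becomes the prior for Sue's.
after_jack = posterior(0.01, 0.95, 0.05)
after_sue = posterior(after_jack, 0.95, 0.05)

print(round(both, 3), round(after_sue, 3))  # 0.785 0.785
```

The two routes agree only because the tests were assumed independent given disease status; correlated false positives would make the second test less informative.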

Boy that second test helped a lot!

And full disclosure, I ran this through a simulator and got the same answer, because I didn't want to look stupid...
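A rough sketch of how such a two-test simulation could be written in Python follows. It is not the simulator mentioned above: the function name `simulate_two_tests`, its parameters, and the seed are assumptions for illustration, and each test is drawn independently given whether the person has the disease.

```python
import random


def simulate_two_tests(population=1_000_000, prevalence=0.01, accuracy=0.95, seed=None):
    """Count people who test positive on both tests, split by true disease status."""
    rng = random.Random(seed)
    sick_both_positive = healthy_both_positive = 0
    for _ in range(population):
        has_disease = rng.random() < prevalence
        # Chance of a positive result depends only on whether the person is sick.
        p_positive = accuracy if has_disease else 1.0 - accuracy
        jack_positive = rng.random() < p_positive
        sue_positive = rng.random() < p_positive   # independent second draw
        if jack_positive and sue_positive:
            if has_disease:
                sick_both_positive += 1
            else:
                healthy_both_positive += 1
    return sick_both_positive, healthy_both_positive


sick, healthy = simulate_two_tests(seed=2)
print("Share of double positives who have the disease:", sick / (sick + healthy))
```

Only about 11,500 people in a million are expected to test positive twice, so the result wobbles a little more between runs than the single-test simulation does, but it should sit near 78.5%.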
