[Ngasp-help] Fwd: nGASP manuscript
Dr. Tristan J. Fiedler
fiedler at fit.edu
Wed Jul 16 17:06:25 EDT 2008
Begin forwarded message:
From: Steven Salzberg <salzberg at umd.edu>
Date: July 16, 2008 3:54:04 PM EDT
To: "Dr. Tristan J. Fiedler" <fiedler at fit.edu>, alc at sanger.ac.uk, lstein at cshl.edu, Steven Salzberg <salzberg at umiacs.umd.edu>
Subject: Re: nGASP manuscript
Hi Tristan, Avril, and Lincoln,
Thanks for sending the manuscript. I've not had time to read it closely
yet, but I noticed one problem that I want to point out. This is an
error that many of us (including me) in the gene-finding community have
made before, but now that I know about it I want to avoid it in the
future.
The problem is our use of the term "specificity." The way you (we) have
used it in the manuscript follows the usage in the EGASP competition,
which also got it wrong. Our definition in EGASP was the percentage of a
gene finder's predictions that were correct; i.e.:
(# correct predictions)/(total # predictions)
However, a good friend and colleague of mine (a biostatistician) pointed
out that this measure should instead be called "precision." You can find
standard definitions of sensitivity and specificity in any text, and
also in Wikipedia:
http://en.wikipedia.org/wiki/Sensitivity_and_specificity
The proper definition of "specificity" is the ratio of true negatives to
all actual negatives (true negatives plus false positives). This isn't
really meaningful in our context, because we don't attempt to predict
non-gene regions. (Another way to look at this is that we aren't taking
a putative gene and saying yes/no.) In fact, we don't even have a good
way to say with certainty that a region isn't a gene, so we just look at
positive predictions.
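To make the distinction concrete, here is a minimal Python sketch of the three measures using their standard confusion-matrix definitions. The function names and the counts are illustrative assumptions only; they are not taken from the EGASP or nGASP evaluations.

```python
def precision(tp, fp):
    """Fraction of predictions that are correct: TP / (TP + FP).

    This is the quantity EGASP called "specificity".
    """
    return tp / (tp + fp)


def sensitivity(tp, fn):
    """Fraction of real genes that were predicted: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of real negatives correctly rejected: TN / (TN + FP).

    Requires a count of true negatives, which a gene-finding
    evaluation normally does not have.
    """
    return tn / (tn + fp)


# Hypothetical counts for one gene finder (assumed for illustration).
tp, fp, fn = 80, 20, 40

print(f"precision   = {precision(tp, fp):.3f}")
print(f"sensitivity = {sensitivity(tp, fn):.3f}")
# specificity(tn, fp) cannot be computed here: there is no tn.
```

Precision and sensitivity need only the positive predictions and the annotated genes, which is why they are the two measures that can actually be reported in this setting.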
The other term for what we're measuring is "positive predictive value"
(PPV):
http://en.wikipedia.org/wiki/Positive_predictive_value
although I like "precision" better. I think you'll agree that this is
what the EGASP competition was calling "specificity" - and it's been
used this way in previous papers too. But this definition is quite
confusing to statisticians, and I think we should revert to the standard
usage.
A simple global replace of "specificity" with "precision" will probably
fix the manuscript, though it would be best to check carefully. I hope
you'll agree.
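One way to do the careful check is to list every occurrence with its line number before replacing anything. The following is only a sketch: the helper name `find_term` and the sample text are hypothetical, and it assumes the manuscript is available as plain text.

```python
import re


def find_term(text, term="specificity"):
    """Return (line number, line) pairs for lines containing `term`.

    Matching is case-insensitive so that sentence-initial
    "Specificity" is also reported.
    """
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if pattern.search(line):
            hits.append((lineno, line.strip()))
    return hits


# Hypothetical manuscript excerpt (assumed for illustration).
sample = "Sensitivity was high.\nSpecificity was lower.\nNo match here."
for lineno, line in find_term(sample):
    print(lineno, line)
```

Each reported line can then be read in context to confirm that "precision" is the intended meaning before the term is changed.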
Steven
Steven L. Salzberg, Ph.D.
Horvitz Professor of Computer Science
Director, Center for Bioinformatics and Computational Biology
3125 Biomolecular Sciences Building
University of Maryland, College Park, MD 20742
Phone: 301-405-9611
Email: salzberg at umd.edu
Blog: http://genefinding.blogspot.com