[Ngasp-help] Fwd: nGASP manuscript

Dr. Tristan J. Fiedler fiedler at fit.edu
Wed Jul 16 17:06:25 EDT 2008



Begin forwarded message:

From: Steven Salzberg <salzberg at umd.edu>
Date: July 16, 2008 3:54:04 PM EDT
To: "Dr. Tristan J. Fiedler" <fiedler at fit.edu>, alc at sanger.ac.uk, lstein at cshl.edu, Steven Salzberg <salzberg at umiacs.umd.edu>
Subject: Re: nGASP manuscript

Hi Tristan, Avril, and Lincoln,
Thanks for sending the manuscript.  I haven't had time to read it closely
yet, but I noticed one problem that I want to point out.  This is an error
that many of us in the gene-finding community (including me) have made
before, but now that I know about it I want to avoid it in the future.
The problem is our use of the term "specificity."  The way you (we) have
used it in the manuscript follows the usage in the EGASP competition,
which also got it wrong.  Our definition in EGASP was the percentage of a
gene finder's predictions that were correct; i.e.:
  (# correct predictions)/(total # predictions)
However, a good friend and colleague of mine (a biostatistician) pointed
out that this measure should instead be called "precision."  You can find
standard definitions of sensitivity and specificity in any statistics
text, and also in Wikipedia:
http://en.wikipedia.org/wiki/Sensitivity_and_specificity
The proper definition of "specificity" is the ratio of true negatives to
all actual negatives, i.e. TN/(TN + FP).  This isn't really meaningful in
our context, because we don't attempt to predict non-gene regions.
(Another way to look at this is that we aren't taking a putative gene and
saying yes/no.)  In fact, we don't even have a good way to say with
certainty that a region isn't a gene, so we can't count true negatives
and we just look at positive predictions.
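
As a rough illustration of the distinction, here is a minimal sketch in
Python.  The function names and the counts are made up for the example;
they are not taken from the manuscript or from EGASP.  It only shows which
counts each measure needs:

```python
def precision(tp, fp):
    """Fraction of predictions that are correct: TP / (TP + FP).

    This is the quantity EGASP called "specificity"; it is also known
    as positive predictive value (PPV).
    """
    return tp / (tp + fp)


def sensitivity(tp, fn):
    """Fraction of real features that were predicted: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of real negatives called negative: TN / (TN + FP).

    Requires a count of true negatives, which gene finding does not
    normally provide.
    """
    return tn / (tn + fp)


# Hypothetical counts for one gene finder (illustrative only).
tp, fp, fn = 80, 20, 10

print("precision  :", precision(tp, fp))
print("sensitivity:", sensitivity(tp, fn))
# specificity() is not called here: there is no TN count to give it.
```

Note that precision and sensitivity use only TP, FP, and FN, which is why
they can be computed from a set of positive predictions and an annotation,
while the standard specificity cannot.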

The other term for what we're measuring is "positive predictive value"
(PPV):
  http://en.wikipedia.org/wiki/Positive_predictive_value
although I like "precision" better.  I think you'll agree that this is
what the EGASP competition was calling "specificity" - and it's been used
this way in previous papers too.  But this definition is quite confusing
to statisticians, and I think we should revert to the standard usage.

A simple global replace of "specificity" with "precision" will probably
fix the manuscript, though it would be best to check carefully.  I hope
you'll agree.

Steven

Steven L. Salzberg, Ph.D.
Horvitz Professor of Computer Science
Director, Center for Bioinformatics and Computational Biology
3125 Biomolecular Sciences Building
University of Maryland, College Park, MD 20742
Phone: 301-405-9611
Email: salzberg at umd.edu
Blog: http://genefinding.blogspot.com


