[Fwd: Redundant sequences !]

Pankaj Jaiswal pj37 at cornell.edu
Thu Dec 5 18:11:18 EST 2002


FYI
Pankaj

-------- Original Message --------
From: Tao Tao <tao at ncbi.nlm.nih.gov>
Subject: Redundant sequences !
To: pj37 at cornell.edu
CC: tao at ncbi.nlm.nih.gov

Hi,

None of the primary sequence databases is non-redundant,
so there will be redundant records.  Even if the sequences
are the same, the records may carry different annotations, with
additional information and references that may be useful.

The two entries are from different sources, one
from Swiss-Prot and the other from the translation
of a CDS annotation on an existing nucleotide record.

For BLAST, we do collapse sequences with 100% identity into
one record.  This practice is limited to BLAST databases
only.
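
As a rough illustration of what collapsing 100%-identical sequences
into one record could look like, here is a minimal Python sketch.  It
is not NCBI's implementation.  The function name collapse_identical,
the residue strings, and the third identifier are all made up for the
example; only the two accessions come from this thread, and the
residues shown are placeholders, not the real RFL sequence.

```python
def collapse_identical(records):
    """Group (identifier, sequence) pairs so each distinct sequence appears once.

    Returns a list of (identifiers, sequence) tuples in first-seen order.
    Only exact (100%) matches are merged; near-identical sequences stay separate.
    """
    merged = {}
    for identifier, sequence in records:
        key = sequence.upper()  # treat upper/lower case residues as the same
        merged.setdefault(key, []).append(identifier)
    return [(ids, seq) for seq, ids in merged.items()]


# Placeholder residues: NOT the real sequences behind these accessions.
records = [
    ("dbj|BAA21547.1", "MDPDAF"),
    ("sp|O24175", "MDPDAF"),
    ("hypothetical|X1", "MKKLLV"),  # invented third entry for contrast
]

collapsed = collapse_identical(records)
for ids, seq in collapsed:
    print(", ".join(ids), seq)
```

Each merged record keeps every source identifier, so the separate
annotations stay reachable from the single entry.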

Regards,

Tao Tao
NCBI User Service

------------- Begin Forwarded Message -------------

Date: Thu, 05 Dec 2002 14:51:08 -0500
From: Pankaj Jaiswal <pj37 at cornell.edu>
X-Accept-Language: en,Hindi
MIME-Version: 1.0
To: info at ncbi.nlm.nih.gov
CC: Gramene at gramene.org
Subject: Redundant sequences !
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: by amavisd-milter (http://amavis.org/)
X-Filter-Version: 1.8 (mail-blade4)
X-Spam-Status: No, hits=2.4 required=5.5 tests=PLING,FROM_ENDS_IN_NUMS 
version=2.01

Hi,

Why do I see two entries for the same thing? I guess GenBank needs to weed out
such entries, which are identical but treated as different because they come from
two different database sources, DDBJ and Swiss-Prot (SP). This confuses users
about which record is which, all the more so in this particular instance, where
the name/description is ambiguous.

1: BAA21547 
RFL [Oryza sativa]
gi|2274790|dbj|BAA21547.1|[2274790]

2: O24175 
Putative transcription factor FL (RFL)
gi|7227890|sp|O24175|FL_ORYSA[7227890]


Pankaj

******************************************
Pankaj Jaiswal, Ph.D.                                   
Postdoctoral Associate
Dept. of Plant Breeding                             
Cornell University                                   
Ithaca, NY-14853, USA   

Tel:+1-607-255-3103 / Fax:+1-607-255-6683
E mail: pj37 at cornell.edu
http://www.gramene.org   
******************************************


------------- End Forwarded Message -------------



------------- End Forwarded Message -------------


