[Gramene] rice protein sequences
Will Spooner
wspooner at cshl.edu
Wed Jun 17 16:50:00 EDT 2009
On 17 Jun 2009, at 18:14, Peifen Zhang wrote:
> Hi,
>
> I am trying to retrieve rice protein sequences at Gramene, i.e.
> LOC_Os01g01302. When I do a quick search for LOC_Os01g01302, it
> returned two proteins, Q0JRG4 and Q655K8, both referring to
> LOC_Os01g01302 (http://www.gramene.org/db/searches/quick_search?category=&search_for=LOC_Os01g01302&x=0&y=0
> ).
>
> My question is why there are two proteins and two sequences for
> LOC_Os01g01302? Is that because they are both best BLAST hits
> meeting a certain cut-off?
Hi Peifen,
Q0JRG4 and Q655K8 are both UniProtKB/TrEMBL proteins. TrEMBL entries
are annotated automatically from EMBL nucleotide sequences, and the
set is fairly redundant. The mappings from UniProtKB/TrEMBL proteins
to rice genes are based on best-in-genome alignment, and it is common
for multiple TrEMBL proteins to map to a single gene.
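To make the many-to-one relationship concrete, here is a minimal Python
sketch that groups protein accessions by the gene they map to. The
function name group_by_gene and the (accession, gene ID) pair layout are
assumptions made for illustration, not a Gramene export format. The two
pairs are the ones reported by the quick search above.

```python
from collections import defaultdict


def group_by_gene(pairs):
    """Collect every protein accession mapped to each gene ID.

    `pairs` is an iterable of (accession, gene_id) tuples; this layout
    is a hypothetical one chosen for the example.
    """
    by_gene = defaultdict(list)
    for accession, gene_id in pairs:
        by_gene[gene_id].append(accession)
    return dict(by_gene)


# The two TrEMBL accessions that the quick search returned for one gene.
mappings = [
    ("Q0JRG4", "LOC_Os01g01302"),
    ("Q655K8", "LOC_Os01g01302"),
]
grouped = group_by_gene(mappings)
print(grouped)
```

A gene with more than one accession in the result is the redundancy
case described above, not two distinct gene products.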
>
> Finally, is rice-proteins.fa.zip in ftp://ftp.gramene.org/pub/gramene/fasta/
> the correct file to retrieve rice protein sequences (for japonica
> group)?
This depends on what you are looking for. If you want the set of rice
proteins in Gramene that correspond to UniProtKB proteins, then that
is the best file. If you want the proteins that correspond to
predicted cDNAs on the rice genome (LOC_ IDs), then this file may
serve you better:
ftp://ftp.gramene.org/pub/gramene/release29/data/fasta/oryza_sativa_japonica/pep/Oryza_sativa_japonica.TIGR5.52.pep.all.fa.gz
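Once that file is downloaded, a short Python sketch along these lines
could pull out the records for one LOC_ ID. It assumes that the ID
appears somewhere in the FASTA header line; I have not checked the
header layout of this release, so treat the substring match as an
assumption. The function name fetch_records is made up for the example.

```python
import gzip


def fetch_records(fasta_gz_path, gene_id):
    """Return (header, sequence) pairs whose header mentions gene_id.

    Assumes a gzip-compressed FASTA file in which the gene ID occurs
    somewhere in the '>' header line (an unverified assumption).
    """
    hits = []
    header, chunks = None, []
    with gzip.open(fasta_gz_path, "rt") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                # A new record starts: keep the previous one if it matched.
                if header is not None and gene_id in header:
                    hits.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line.strip())
    # Handle the final record in the file.
    if header is not None and gene_id in header:
        hits.append((header, "".join(chunks)))
    return hits
```

For example, fetch_records("Oryza_sativa_japonica.TIGR5.52.pep.all.fa.gz",
"LOC_Os01g01302") would return every peptide record whose header
contains that ID, which may be more than one if the gene has several
predicted transcripts.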
All the best,
Will
>
> Thanks and regards,
> Peifen