[Gramene] Re: [refseq-admin] Fwd: GROUP: RefSeq and TPA - help please [NCBI tracking system #15337316]

Brian Smith-White smtwhite at ncbi.nlm.nih.gov
Wed Sep 26 11:42:32 EDT 2007


RT - Molly Craxton wrote:
> Original Cc line: molly at mrc-lmb.cam.ac.uk
> Original To line: plantgdb at iastate.edu, helpdesk at ensembl.org, wormbase-help at wormbase.org, webmaster at jgi.doe.gov, curator at arabidopsis.org, mhasebe at nibb.ac.jp, hgsc-help at bcm.edu
> 
> Dear all, I am sending this to every address I could find right now,  
> dealing with organisms I have looked at (I couldn't find an email  
> address for flybase).
> 
> This is just to let you know that I shall proceed with an attempt to
> annotate all of the genes in the paper (please see the full text link
> below) using TPA (either via EMBL/EBI or NCBI). I shall begin this in
> two weeks' time.
> 
> After successful TPA, I shall approach individual organism databases
> (e.g. TAIR) with particular corrections/annotations.
> 
> I am concerned about the issue of nomenclature and particularly whether  
> my suggested nomenclature for the plant genes will be suitable, so  
> would appreciate feedback on this topic as early as possible.
> 
> I hope that this effort will help to integrate a whole lot of disparate  
> efforts and move things forwards. If the TPA annotations are not easily  
> visible to (at least) all the relevant organism databases, then I am  
> dubious about the value of going to this effort (I am a bench scientist  
> with a solo, part-time interest in genes and evolution) - so I would  
> appreciate feedback about that.
> 
> Please contact me at molly at mrc-lmb.cam.ac.uk
> 
> Yours sincerely, Molly Craxton
> 
Dear Dr. Craxton,

I work with plant data at NCBI. There are two questions here: a) the 
correct nomenclature and b) how to create and/or modify the public 
information at NCBI.

For those plant research communities with their organism-specific 
databases, you should discuss the nomenclature with them. NCBI will 
defer to their decisions unless there is a conflict. TAIR has already 
responded for Arabidopsis. For Avena sativa, Hordeum vulgare, and 
Triticum aestivum contact curator at pw.usda.gov. For Zea mays contact 
MaizeGDB through Carolyn Lawrence at triffid at iastate.edu. For rice 
contact RAP-DB at rap-db at lab.nig.ac.jp or Gramene at 
gramene at gramene.org. For Solanaceae (tomato, potato, Nicotiana) contact 
SGN at sgn-feedback at sgn.cornell.edu. For grape contact 
Mark.R.Thomas at csiro.au. For Rosaceae (apple) contact GDR at 
dorrie at wsu.edu or aalbert at clemson.edu. For poplar contact the IPGC at 
ipgc at ornl.gov. For soybean contact SoyBase through David Grant at 
dgrant at iastate.edu. For Physcomitrella contact PHYSCObase through Dr. 
Hasebe at mhasebe at nibb.ac.jp.

For those plants without an organism-specific database you are the 
authority.

Modification of the information at NCBI has two components: a) organisms 
with material in Entrez Gene and b) organisms not in Entrez Gene.

For the organisms not in Entrez Gene, a TPA is the only way. A TPA can 
reference any sequence record at NCBI that is part of the international 
collaboration (Trace, core nucleotide, EST collections, and so on). A 
TPA may not reference a RefSeq accession. From additional file 2 these 
organisms are Physcomitrella patens, Tortula ruralis, Selaginella 
lepidophylla, Adiantum capillus-veneris, Ceratopteris richardii, Cycas 
rumphii, Ginkgo biloba, Welwitschia mirabilis, Cryptomeria japonica, 
Picea glauca, Pinus taeda, Pseudotsuga menziesii, Amborella trichopoda, 
Nuphar advena, Liriodendron tulipifera, Persea americana, Acorus 
americanus, Agrostis stolonifera, Allium cepa, Ananas comosus, Asparagus 
officinalis, Avena sativa, Brachypodium distachyon, Crocus sativus, 
Curcuma longa, Cynodon dactylon, Eragrostis tef, Festuca arundinacea, 
Leymus chinensis, Lilium longiflorum, Lolium temulentum, Lycoris 
longituba, Musa acuminata, Panicum virgatum, Pennisetum glaucum, 
Saccharum officinarum, Sorghum bicolor, Triticum aestivum, Zea mays, and 
Zingiber officinale. From the 6 figures these organisms are Malus, 
Nicotiana, Solanum tuberosum, Lactuca, Medicago, Gossypium, Populus, 
Aquilegia, and Citrus.
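
The restriction above, that a TPA may not reference a RefSeq accession, 
can be checked before submission. The following is only a sketch: it 
assumes that RefSeq accessions are recognizable by their two-letter 
prefix followed by an underscore (NM_, NP_, XM_, NC_, ...), which 
accessions from the international collaboration do not have. The 
function names and the example accessions are hypothetical and are not 
part of any NCBI tool.

```python
import re

# Assumption: a RefSeq accession is two letters, an underscore, then the
# identifier, optionally followed by a ".version" suffix (e.g. NM_000001.1).
REFSEQ_PATTERN = re.compile(r"^[A-Z]{2}_[A-Z0-9]+(\.\d+)?$")


def is_refseq_accession(accession: str) -> bool:
    """Return True if the accession has the RefSeq prefix format."""
    return bool(REFSEQ_PATTERN.match(accession.strip().upper()))


def usable_in_tpa(accessions):
    """Keep only the accessions a TPA could reference (non-RefSeq)."""
    return [acc for acc in accessions if not is_refseq_accession(acc)]


if __name__ == "__main__":
    # Hypothetical accessions, for illustration only.
    candidates = ["NM_000001.1", "AF123456.1", "XP_000002.1"]
    print(usable_in_tpa(candidates))
```

This only filters on the accession format; it does not confirm that a 
record exists or meets the other TPA submission requirements.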

For the organisms in Entrez Gene there are two groups: those with genome 
sequence (rice, Arabidopsis, grape) and those without (wheat, corn, 
tomato, soybean, barley). "Genome sequence" is used here in the more 
restrictive sense of assembled, annotated sequence covering all of 
the chromosomes. For the group with genome sequence, the dataflow is 
from the organism-specific database to NCBI, with NCBI adjusting its 
information to conform to the submission from the organism-specific 
database. To change the information at NCBI, contact the 
organism-specific databases. It may take longer for the information at 
NCBI to change this way, but all sites will remain synchronized.

For the organisms with records in Entrez Gene but no genome sequence 
there are two possibilities. Which one applies depends upon the sequence 
records referenced. A record in Entrez Gene requires at least one 
non-EST sequence record. For those proteins satisfying this, send me the 
necessary information and I can add the BMC reference and the protein 
classification to the record for the gene. The necessary information 
consists of the Entrez Gene record, the sequence record(s), and the 
protein classification. Otherwise a TPA record is the only mechanism to 
make the information in the paper available for computational examination.
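
The routing described above can be summarized as a small decision 
procedure. This is a sketch under stated assumptions, not an NCBI tool: 
the class, its field names, and the returned strings are hypothetical 
labels for the cases in this message.

```python
from dataclasses import dataclass


@dataclass
class GeneCase:
    """Hypothetical summary of one gene from the paper."""
    organism_in_entrez_gene: bool
    # "Genome sequence" in the restrictive sense used in this message:
    # assembled, annotated sequence covering all of the chromosomes.
    has_genome_sequence: bool = False
    # True if at least one referenced sequence record is not an EST.
    has_non_est_record: bool = False


def annotation_route(case: GeneCase) -> str:
    """Return the route this message suggests for the given case."""
    if not case.organism_in_entrez_gene:
        return "submit a TPA record"
    if case.has_genome_sequence:
        return "contact the organism-specific database"
    if case.has_non_est_record:
        return "send the details to NCBI staff"
    return "submit a TPA record"


if __name__ == "__main__":
    # Organism in Entrez Gene, no genome sequence, only EST evidence.
    print(annotation_route(GeneCase(organism_in_entrez_gene=True)))
```

Whether a given organism is in Entrez Gene or has genome sequence may 
change over time, so the inputs should be checked against the current 
databases rather than the groupings listed here.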

Brian Smith-White
Staff Scientist
National Center for Biotechnology Information
National Library of Medicine
National Institutes of Health
8600 Rockville Pike
Bldg 38 Room 6S614-L
Bethesda, MD 20894
Voice: 301.594-2274


>> ---------------------------- Original Message ----------------------------
>> Subject: [refseq-admin] Fwd: GROUP: RefSeq and TPA - help please [NCBI tracking system #15337316]
>> From:    "RT - Kim Pruitt" <rt at ncbi.nlm.nih.gov>
>> Date:    Tue, September 18, 2007 8:06 pm
>> To:      molly at mrc-lmb.cam.ac.uk
>> --------------------------------------------------------------------------
>>
> 
>> Dear Dr. Craxton,
>> I agree that GeneRIF is not the right mechanism to make connections to
>> your publication as many of the species you list are not included in
>> Entrez Gene. I think TPA may be your best approach but you will have to
>> check their submission requirements. There is some information about TPA
>> available online at http://www.ncbi.nlm.nih.gov/Genbank/TPA.html
>>
>> Sincerely,
>> Kim Pruitt
>> RefSeq
>>
>> molly at mrc-lmb.cam.ac.uk wrote (Fri, Sep 14 2007 05:24:29):
>>
>>> ======== MIME MESSAGE (resent unchanged to recipients)
>>> molly at mrc-lmb.cam.ac.uk
>>>
>>>   =============== PART 1
>>>   Hello, please see the below and the attached. I shall go ahead with
>>>   an attempt to do the annotation via TPA (much as the prospect seems
>>>   difficult in my current circumstances) but you might want to have a
>>>   look at these sequences as well. There is no functional data, so I
>>>   can't contribute anything via GeneRIF. I do know about that as I have
>>>   already contributed some GeneRIFs. I think I have read all the
>>>   relevant pages (such as those mentioned below) about data submission
>>>   etc. so now I just want to get on with it. I would like to end up
>>>   with a situation where someone who (for whatever reason) hits one of
>>>   these sequences (by a blast search for instance) can link to the
>>>   paper and to all of the associations described therein. Are the TPA
>>>   sequences visible in this way? (I've never hit any) If not, the
>>>   effort of doing the TPA submission may not be worthwhile.
>>>
>>>   Please get in touch if you can say or do anything to aid my progress
>>>   in this gene annotation project.
>>>
>>>   here's a link to the full text version of the paper
>>>   http://www.biomedcentral.com/1471-2164/8/259
>>>
>>>   May I draw your attention to pages 13-16 of the pdf - that's the part
>>>   which talks about annotation.
>>>
>>>   Thank you kindly, Molly Craxton
>>>
>>>   Begin forwarded message:
>>>
>>>> From: "Liu, Hanguan (NIH/NLM) [C]" <hliu at ncbi.nlm.nih.gov>
>>>> Date: 13 September 2007 19:31:19 BDT
>>>> To: <molly at mrc-lmb.cam.ac.uk>
>>>> Cc: "Liu, Hanguan (NIH/NLM) [C]" <hliu at ncbi.nlm.nih.gov>
>>>> Subject: Re: GROUP: RefSeq and TPA - help please
>>>>
>>>> Dear Colleague,
>>>>
>>>> You can submit your data to TPA. For TPA submission, see:
>>>> http://www.ncbi.nlm.nih.gov/Genbank/TPA.html
>>>>
>>>> You can also submit GeneRIFs (Gene References Into Function). For
>>>> example,
>>>> http://www.ncbi.nlm.nih.gov/projects/GeneRIF/GeneRIF.cgi?lid=395845&sym=JAK3&name=Janus+kinase+3+(a+protein+tyrosine+kinase,+leukocyte)&org=Gallus+gallus&url=%2Fentrez%2Fquery.fcgi%3Fdb%3Dgene%26cmd%3DRetrieve%26dopt%3Dfull_report%26list_uids%3D395845
>>>>
>>>> For RefSeq annotation, if you are interested, please contact the NCBI
>>>> RefSeq group at
>>>> refseq-admin at ncbi.nlm.nih.gov
>>>>
>>>>
>>>> See: http://www.ncbi.nlm.nih.gov/Genbank/tpafaq.html#refdiff
>>>>
>>>>
>>>>
>>>> Best regards,
>>>>
>>>> Hanguan Liu
>>>> NCBI User Services
>>>> ------------ Begin Forwarded Message -------------
>>>>
>>>> To: info at ncbi.nlm.nih.gov
>>>> From: Molly Craxton <molly at mrc-lmb.cam.ac.uk>
>>>> Subject: RefSeq and TPA - help please
>>>> Date: Thu, 13 Sep 2007 11:33:56 +0100
>>>>
>>>>  From Molly Craxton, MRC Laboratory of Molecular Biology, Hills Road,
>>>> Cambridge CB2 0QH U.K. molly at mrc-lmb.cam.ac.uk September 13th 2007
>>>>
>>>> Dear folks at NCBI, I hope you can set me in the right direction. I
>>>> attach my just published paper. The paper describes a number of gene
>>>> families which are widespread in eukaryotes. I would like the detailed
>>>> knowledge of these gene families which I have compiled and classified in
>>>> this paper, to be accessible to people searching the sequence databases.
>>>> I assume that this might be possible via TPA.
>>>> That seems to me, as a lone researcher working on this part-time, to be
>>>> possibly a huge endeavour - but it is one I am willing to undertake for
>>>> the sake of this data being usefully USED. Refseq is another
>>>> possibility, in which case I proffer this paper to the Refseq curators.
>>>> Every detail is included in the supplementary data files.
>>>> And what about other databases at NCBI, such as the Gene database?
>>>>
>>>> As I work in the UK, I have always submitted my sequences to EMBL/EBI,
>>>> but I much prefer using the facilities offered by NCBI. The thought of
>>>> trying to do this annotation attempt via EMBL/EBI is a very scary
>>>> prospect, so I really hope that you folks can guide me forwards here.
>>>>
>>>> As well as attempting to annotate the primary sequences at NCBI, via
>>>> Refseq or TPA, I also intend to approach all of the organism databases
>>>> with this work too, so that they can make good use of it. I thought I
>>>> would talk to you first though, to get an idea of how to prioritise this
>>>> job and how best to organise my time.
>>>>
>>>> I would enormously appreciate your help. I hope you can have a read
>>>> through the paper, and come back to me with your suggestions.
>>>>
>>>> Yours sincerely, Molly Craxton
> 
> The following summarizes the status of your request:
> Current status of the ticket #15337316 is:
>        Queue: refseq-admin
>         Area: 
>      Subject: Fwd: GROUP: RefSeq and TPA - help please
>        Owner: pruitt
>   Requestors: molly at mrc-lmb.cam.ac.uk
>       Status: resolved
> 
>  Transaction: Comments added by 'molly at mollycraxt.demon.co.uk' (15870904)
> 


