[Gmod-help] Re: [Gmod-gbrowse] multi-segmented feature looks fine at low power, but connecting lines between segments disappear when zoomed in
Scott Cain
cain.cshl at gmail.com
Thu Sep 18 17:53:51 EDT 2008
Hi Burcu,
I just ran the GenBank record for NW_047331 through the BioPerl
bp_genbank2gff3.pl script and got perfectly acceptable GFF3 (I'll
paste a few lines below); did you get your GFF3 from running a BioPerl
script or somewhere else?
Thanks,
Scott
Here's what the first several lines of the GFF3 looked like:
##gff-version 3
# sequence-region NW_047331 1 1994762
# conversion-by bp_genbank2gff3.pl
# organism Rattus norvegicus
# date 22-JUN-2006
# Note Rattus norvegicus chromosome 10 genomic contig, reference assembly (based on RGSC v3.4).
NW_047331 GenBank chromosome 1 1994762 . + . ID=NW_047331;Alias=10;Dbxref=taxon:10116;Note=Rattus norvegicus chromosome 10 genomic contig%2C reference assembly (based on RGSC v3.4).;chromosome=10;comment1=Bio::Annotation::Comment%3DHASH(0x8a564e4);date=22-JUN-2006;mol_type=genomic DNA;organism=Rattus norvegicus;strain=BN/SsNHsdMCW
NW_047331 GenBank gap 2014 3020 . + . ID=GenBank:gap:NW_047331:2014:3020;estimated_length=1007
NW_047331 GenBank gap 5534 5583 . + . ID=GenBank:gap:NW_047331:5534:5583;estimated_length=50
NW_047331 GenBank gene 49 13315 . + . ID=Bfar;Dbxref=GeneID:304709,RGD:1304791;Note=Derived by automated computational analysis using gene prediction method: BestRefseq. Supporting evidence includes similarity to: 1 mRNA;gene=Bfar
NW_047331 GenBank mRNA 49 13315 . + . ID=Bfar.t01;Parent=Bfar;Dbxref=GI:61557020,GeneID:304709,RGD:1304791;Note=Derived by automated computational analysis using gene prediction method: BestRefseq. Supporting evidence includes similarity to: 1 mRNA;exception=unclassified transcription discrepancy;gene=Bfar;product=bifunctional apoptosis regulator;transcript_id=NM_001013125.1
NW_047331 GenBank CDS 49 216 . + . ID=Bfar.p01;Parent=Bfar.t01;Dbxref=GI:61557021,GeneID:304709,RGD:1304791;gO_component=integral to plasma membrane%3B membrane fraction%3B ubiquitin ligase complex;gO_function=structural molecule activity%3B ubiquitin-protein ligase activity%3B zinc ion binding;gO_process=anti-apoptosis%3B protein ubiquitination;codon_start=1;exception=unclassified translation discrepancy;gene=Bfar;product=bifunctional apoptosis regulator (predicted);protein_id=NP_001013143.1
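
For anyone who wants to inspect the ID and Parent tags in output like the lines above, here is a minimal Python sketch that splits the ninth (attributes) column into tags and values. It assumes the usual GFF3 conventions: tag=value pairs separated by ';', multiple values separated by ',', and reserved characters percent-encoded (the %2C and %3B in the sample). The name parse_attributes is made up for this example and is not part of BioPerl or GBrowse.

```python
from urllib.parse import unquote


def parse_attributes(column9):
    """Split a GFF3 attributes column into a dict of tag -> list of values.

    Hypothetical helper for illustration; not a BioPerl or GBrowse function.
    """
    attributes = {}
    for pair in column9.strip().split(";"):
        if not pair:
            continue
        tag, _, value = pair.partition("=")
        # Split on ',' first, then decode escapes such as %2C and %3B,
        # so an encoded comma inside a value is not treated as a separator.
        attributes[unquote(tag)] = [unquote(v) for v in value.split(",")]
    return attributes


example = ("ID=Bfar.t01;Parent=Bfar;Dbxref=GI:61557020,GeneID:304709;"
           "Note=contig%2C reference assembly")
attrs = parse_attributes(example)
print(attrs["Parent"])
print(attrs["Note"])
```

Looking up attrs["ID"] and attrs["Parent"] for each mRNA and CDS line is a quick way to see whether the parent/child links are what you expect.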
On Thu, Sep 18, 2008 at 5:37 PM, Scott Cain <cain.cshl at gmail.com> wrote:
> Hi Burcu,
>
> I finally got around to looking at the sample data and config file you
> sent me a few days ago. There were a few problems I had to fix:
>
> 1. GFF3 and the Bio::DB::GFF adaptor don't always get along, and I
> think your GFF3 is one example of that happening. I switched to the
> Bio::DB::SeqFeature::Store adaptor.
>
> 2. With the switch to SeqFeature::Store, you don't need aggregators
> any more, so I switched the [EntrezGene] track to use gene:GenBank
> features and the gene glyph.
>
> 3. I changed all occurrences of 'iD' to 'ID' (I'm off to BioPerl next
> to see what caused this so I can make it stop).
>
> 4. I added a reference sequence line; you had a line like this:
>
> Chr10 GenBank region 1 1994762 . . . ID=NW_047331
>
> I changed it to this:
>
> Chr10 GenBank region 1 1994762 . . . ID=Chr10;Name=Chr10
>
> (of course, I suspect that rat chromosome 10 is bigger than that; the
> alternative would be to change column 1 to NW_047331 throughout the
> file.)
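
Fix 3 in the list above (changing 'iD' to 'ID') can be scripted rather than done by hand. A minimal Python sketch, assuming a tab-separated GFF3 file with nine columns per feature line; the name fix_id_tag is made up for this example. It only rewrites an attribute key that is exactly 'iD' and leaves comment lines and every other tag alone.

```python
import re

# Matches an 'iD' key at the start of the attributes column or right after ';'.
_BAD_ID = re.compile(r"(^|;)iD=")


def fix_id_tag(line):
    """Return the GFF3 line with an 'iD=' attribute key rewritten to 'ID='.

    Hypothetical helper for illustration; comment lines and lines that do not
    have nine tab-separated columns are returned unchanged.
    """
    if line.startswith("#"):
        return line
    columns = line.rstrip("\n").split("\t")
    if len(columns) != 9:
        return line
    columns[8] = _BAD_ID.sub(r"\1ID=", columns[8])
    newline = "\n" if line.endswith("\n") else ""
    return "\t".join(columns) + newline


broken = "Chr10\tGenBank\tgene\t49\t13315\t.\t+\t.\tiD=Bfar;gene=Bfar\n"
print(fix_id_tag(broken), end="")
```

Applying fix_id_tag to each line of the file and writing the result to a new file keeps the original around in case anything else needs checking.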
>
> Switching to SeqFeature::Store will require a few changes, but not too
> many; basically, the only things that would be affected are the tracks
> that currently use aggregators. Please let me know if you need any
> help with the transition.
>
> Scott
>
>
> On Mon, Sep 15, 2008 at 4:29 PM, Don Gilbert
> <gilbertd at cricket.bio.indiana.edu> wrote:
>>
>>
>> Burcu,
>>
>> It may be that you have to work through a few changes. The 'iD' problem was likely
>> part of it; your aggregator also needs to be updated with corrections for ID/Parent
>> tags.
>>
>>>> EntrezGene{CDS,exon/mRNA}
>>
>> This one should work when CDS,exon have Parent=mRNA.ID and mRNA has ID=
>> This is equivalent to the processed_transcript aggregator
>> Bio/DB/GFF/Aggregator/processed_transcript.pm
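
The condition described above (CDS and exon lines carry Parent=<mRNA ID>, and the mRNA line carries ID=) can be checked before loading the file. A rough Python sketch, assuming tab-separated GFF3 with nine columns; find_orphans and attribute_values are names invented for this example. It reports every Parent value that has no matching ID anywhere in the file, which is also what an 'iD' tag on the parent line would cause.

```python
def attribute_values(column9, tag):
    """Return the comma-separated values of one tag in a GFF3 attributes column."""
    for pair in column9.split(";"):
        key, _, value = pair.partition("=")
        if key == tag:
            return value.split(",")
    return []


def find_orphans(lines):
    """Return (line number, parent) pairs whose Parent matches no ID in the input.

    Hypothetical helper for illustration; line numbers are 1-based.
    """
    ids = set()
    children = []  # (line number, Parent value)
    for number, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9:
            continue
        ids.update(attribute_values(columns[8], "ID"))
        for parent in attribute_values(columns[8], "Parent"):
            children.append((number, parent))
    return [(number, parent) for number, parent in children if parent not in ids]


sample = [
    "##gff-version 3\n",
    "Chr10\tGenBank\tmRNA\t49\t13315\t.\t+\t.\tiD=Bfar.t01;gene=Bfar\n",
    "Chr10\tGenBank\tCDS\t49\t216\t.\t+\t.\tID=Bfar.p01;Parent=Bfar.t01\n",
]
for number, parent in find_orphans(sample):
    print(f"line {number}: Parent={parent} has no matching ID")
```

In the sample, the mRNA line uses 'iD', so the CDS on line 3 is reported as pointing at a parent that does not exist.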
>>
>> Aggregators are good when using Bio/DB/GFF databases; the Bio/DB/SeqFeature/Store
>> databases do not use aggregators.
>>
>> PS, one of the tools, likely bp_genbank2gff3, created those funky 'iD' tags,
>> for reasons of its own.
>>
>> - Don Gilbert
>>
--
------------------------------------------------------------------------
Scott Cain, Ph. D. cain.cshl at gmail.com
GMOD Coordinator (http://gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory