[Gmod-help] Re: [Gmod-gbrowse] multi-segmented feature looks fine at low power, but connecting lines between segments disappear when zoomed in

Thu Sep 18 17:53:51 EDT 2008

Hi Burcu,

I just ran the GenBank record for NW_047331 through the BioPerl
bp_genbank2gff3.pl script and got perfectly acceptable GFF3 (I'll
paste a few lines below); did you get your GFF3 from running a BioPerl
script or somewhere else?

Thanks,
Scott

Here's what the first several lines of the GFF3 looked like:
##gff-version 3
# sequence-region NW_047331 1 1994762
# conversion-by bp_genbank2gff3.pl
# organism Rattus norvegicus
# date 22-JUN-2006
# Note Rattus norvegicus chromosome 10 genomic contig, reference assembly (based on RGSC v3.4).
NW_047331       GenBank chromosome      1       1994762 .       +       .       ID=NW_047331;Alias=10;Dbxref=taxon:10116;Note=Rattus norvegicus chromosome 10 genomic contig%2C reference assembly (based on RGSC v3.4).;chromosome=10;comment1=Bio::Annotation::Comment%3DHASH(0x8a564e4);date=22-JUN-2006;mol_type=genomic DNA;organism=Rattus norvegicus;strain=BN/SsNHsdMCW
NW_047331       GenBank gap     2014    3020    .       +       .       ID=GenBank:gap:NW_047331:2014:3020;estimated_length=1007
NW_047331       GenBank gap     5534    5583    .       +       .       ID=GenBank:gap:NW_047331:5534:5583;estimated_length=50
NW_047331       GenBank gene    49      13315   .       +       .       ID=Bfar;Dbxref=GeneID:304709,RGD:1304791;Note=Derived by automated computational analysis using gene prediction method: BestRefseq. Supporting evidence includes similarity to: 1 mRNA;gene=Bfar
NW_047331       GenBank mRNA    49      13315   .       +       .       ID=Bfar.t01;Parent=Bfar;Dbxref=GI:61557020,GeneID:304709,RGD:1304791;Note=Derived by automated computational analysis using gene prediction method: BestRefseq. Supporting evidence includes similarity to: 1 mRNA;exception=unclassified transcription discrepancy;gene=Bfar;product=bifunctional apoptosis regulator;transcript_id=NM_001013125.1
NW_047331       GenBank CDS     49      216     .       +       .       ID=Bfar.p01;Parent=Bfar.t01;Dbxref=GI:61557021,GeneID:304709,RGD:1304791;gO_component=integral to plasma membrane%3B membrane fraction%3B ubiquitin ligase complex;gO_function=structural molecule activity%3B ubiquitin-protein ligase activity%3B zinc ion binding;gO_process=anti-apoptosis%3B protein ubiquitination;codon_start=1;exception=unclassified translation discrepancy;gene=Bfar;product=bifunctional apoptosis regulator (predicted);protein_id=NP_001013143.1
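
The two problems discussed in this thread (reserved tags written with the
wrong case, such as 'iD', and Parent= values that have no matching ID=) can be
checked mechanically before loading a file.  What follows is only a rough
sketch in Python, not part of BioPerl or GBrowse: the function names
parse_attributes and check_gff3 are made up for illustration, the list of
reserved tags is taken from the GFF3 specification, and the code assumes
tab-separated columns and does not decode URL escapes such as %2C.

```python
# Hypothetical helper, not part of BioPerl or GBrowse.
# Reserved attribute tags as listed in the GFF3 specification.
RESERVED_TAGS = ("ID", "Name", "Alias", "Parent", "Target", "Gap",
                 "Derives_from", "Note", "Dbxref", "Ontology_term",
                 "Is_circular")


def parse_attributes(column9):
    """Split a GFF3 attribute column into a list of (tag, value) pairs."""
    pairs = []
    for field in column9.strip().split(";"):
        if "=" in field:
            tag, value = field.split("=", 1)
            pairs.append((tag.strip(), value.strip()))
    return pairs


def check_gff3(lines):
    """Return (miscased, orphans) for an iterable of GFF3 lines.

    miscased: (line number, tag as written, reserved spelling) for every
              reserved tag whose case is wrong, e.g. 'iD' instead of 'ID'.
    orphans:  (line number, parent id) for every Parent= value that matches
              no ID= anywhere in the input.
    """
    reserved = {tag.lower(): tag for tag in RESERVED_TAGS}
    ids = set()
    parents = []
    miscased = []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and ## directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a feature line (assumes tab-separated columns)
        for tag, value in parse_attributes(cols[8]):
            canonical = reserved.get(tag.lower())
            if canonical is not None and tag != canonical:
                miscased.append((lineno, tag, canonical))
                # Treat the tag as if it were spelled correctly, so one
                # 'iD' does not also show up as a missing parent.
                tag = canonical
            if tag == "ID":
                ids.add(value)
            elif tag == "Parent":
                # Multiple parents are comma-separated in GFF3.
                parents.extend((lineno, p) for p in value.split(","))
    orphans = [(n, p) for n, p in parents if p not in ids]
    return miscased, orphans
```

Calling check_gff3(open("your_file.gff3")) (the file name is a placeholder)
returns two lists; both should be empty for a file like the sample above.
Only reserved tags are checked for case, so free-form tags such as
gO_component are left alone.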

On Thu, Sep 18, 2008 at 5:37 PM, Scott Cain <cain.cshl at gmail.com> wrote:
> Hi Burcu,
>
> I finally got around to looking at the sample data and config file you
> sent me a few days ago.  There were a few problems I had to fix:
>
> 1. GFF3 and the Bio::DB::GFF adaptor don't always get along, and I
> think your GFF3 is one example of that happening.  I switched to the
> Bio::DB::SeqFeature::Store adaptor.
>
> 2. With the switch to SeqFeature::Store, you don't need aggregators
> any more, so I switched the [EntrezGene] track to use gene:GenBank
> features and the gene glyph.
>
> 3. I changed all occurrences of 'iD' to 'ID' (I'm off to BioPerl next
> to see what caused this so I can make it stop).
>
> 4. I added a reference sequence line; you had a line like this:
>
> Chr10  GenBank region  1   1994762 .    .    .   ID=NW_047331
>
> I changed it to this:
>
> Chr10  GenBank region  1   1994762 .    .    .   ID=Chr10;Name=Chr10
>
> (of course, I suspect that rat chromosome 10 is bigger than that; the
> alternative would be to change column 1 to NW_047331 throughout the
> file.)
>
> Switching to SeqFeature::Store will require a few changes, but not too
> many; basically, the only things that would be affected are the tracks
> that currently use aggregators.  Please let me know if you need any
> help with the transition.
>
> Scott
>
>
> On Mon, Sep 15, 2008 at 4:29 PM, Don Gilbert
> <gilbertd at cricket.bio.indiana.edu> wrote:
>>
>>
>> Burcu,
>>
>> It may be that you have to work through a few changes.  The 'iD' problem was
>> likely part of it; your aggregator also needs to be updated with corrections
>> for ID/Parent tags.
>>
>>>> EntrezGene{CDS,exon/mRNA}
>>
>> This one should work when CDS,exon have Parent=mRNA.ID and mRNA has ID=
>> This is equivalent to the processed_transcript aggregator
>> Bio/DB/GFF/Aggregator/processed_transcript.pm
>>
>> Aggregators are good when using Bio/DB/GFF databases; the Bio/DB/SeqFeature/Store
>> databases do not use aggregators.
>>
>> PS, one of the tools, likely bp_genbank2gff3, created those funky 'iD' tags,
>> for reasons of its own.
>>
>> - Don Gilbert
>>
>
>
>
> --
> ------------------------------------------------------------------------
> Scott Cain, Ph. D. cain.cshl at gmail.com
> GMOD Coordinator (http://gmod.org/) 216-392-3087
> Cold Spring Harbor Laboratory
>