[Gmod-help] "Table 'meta' doesn't exist" error from bp_seqfeature_gff3.PLS

Bakir, Burcu BBakir at hmgc.mcw.edu
Wed Sep 24 11:45:52 EDT 2008


Hi Scott,

I successfully downloaded bioperl-live from SVN and made the necessary
settings. Then I ran genbank2gff3.PLS under
bioperl-live/scripts/Bio-DB-GFF as:
./genbank2gff3.PLS -o /rgd_home/3.0/TOOLS/Gbrowse/test/chromosome10/ rn_ref_chr10.gbk.gz

This time I got a GFF3 file more similar to what you had. I'll paste a
few lines of it at the end of this email. Then I tried to load it into
the database using the bp_seqfeature_gff3.PLS script under
bioperl-live/scripts/Bio-SeqFeature-Store as:

-bash-3.00$ ./bp_seqfeature_gff3.PLS -dsn "dbi:mysql:database=rgd_904_e;host=forte.hmgc.mcw.edu" -user ZZZ -pass MMM ../../../../3.0/TOOLS/Gbrowse/test/chromosome10/rn_ref_chr10.gbk.gz.gff3

Here is the error I get:

DBD::mysql::st execute failed: Table 'rgd_904_e.meta' doesn't exist at /rgd_home/bioperl-live/bioperl-live/Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1217.
-------------------- EXCEPTION --------------------
MSG: Table 'rgd_904_e.meta' doesn't exist
STACK Bio::DB::SeqFeature::Store::DBI::mysql::setting /rgd_home/bioperl-live/bioperl-live/Bio/DB/SeqFeature/Store/DBI/mysql.pm:1217
STACK Bio::DB::SeqFeature::Store::serializer /rgd_home/bioperl-live/bioperl-live/Bio/DB/SeqFeature/Store.pm:1507
STACK Bio::DB::SeqFeature::Store::default_settings /rgd_home/bioperl-live/bioperl-live/Bio/DB/SeqFeature/Store.pm:2066
STACK Bio::DB::SeqFeature::Store::DBI::mysql::default_settings /rgd_home/bioperl-live/bioperl-live/Bio/DB/SeqFeature/Store/DBI/mysql.pm:327
STACK Bio::DB::SeqFeature::Store::DBI::mysql::init /rgd_home/bioperl-live/bioperl-live/Bio/DB/SeqFeature/Store/DBI/mysql.pm:219
STACK Bio::DB::SeqFeature::Store::new /rgd_home/bioperl-live/bioperl-live/Bio/DB/SeqFeature/Store.pm:358
STACK toplevel ./bp_seqfeature_gff3.PLS:45

It complains that the table 'rgd_904_e.meta' does not exist. My database
has the tables fattribute, fattribute_to_feature, fdata, fdna, fgroup,
ftype, and fmeta.
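
The mismatch between the table the script asked for ('meta') and the
tables listed above can be checked mechanically. What follows is only a
minimal sketch: has_table is a made-up helper name, and the mysql
command shown in the comment is an assumption about how the table list
would be obtained on this server. The sketch itself just tests whether a
name occurs in a list of names.

```shell
#!/bin/sh
# Sketch only: has_table is a hypothetical helper, not part of BioPerl or MySQL.
# It reads table names from stdin, one per line, for example the output of
#   mysql -N -e 'SHOW TABLES' rgd_904_e      (assumed invocation)
# and succeeds if the name given as $1 is among them.
has_table() {
    # -x: whole-line match, -F: literal string, -q: no output
    grep -qxF -- "$1"
}

# Table names copied from the message above.
tables='fattribute
fattribute_to_feature
fdata
fdna
fgroup
ftype
fmeta'

# 'meta' is the table named in the error; 'fmeta' is the one that exists.
for wanted in meta fmeta; do
    if printf '%s\n' "$tables" | has_table "$wanted"; then
        echo "$wanted: present"
    else
        echo "$wanted: missing"
    fi
done
```

With the list above this reports 'meta' as missing and 'fmeta' as
present, which matches the error message.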

I also have another question: does running the bp_seqfeature_gff3.PLS
script on a database that has already been filled by the bp_load_gff.pl
script cause any trouble? The data for my previous tracks was loaded
via bp_load_gff.pl.

Thanks,

Burcu

##gff-version 3
# sequence-region NW_047337 1 1380475
# conversion-by bp_genbank2gff3.pl
# organism Rattus norvegicus
# date 22-JUN-2006
# Note Rattus norvegicus chromosome 10 genomic contig, reference assembly (based on RGSC v3.4).
Chr10   GenBank chromosome      83193107        84573581        .       +       .       ID=NW_047337;Alias=10;Dbxref=taxon:10116;Note=Rattus norvegicus chromosome 10 genomic contig%2C reference assembly (based on RGSC v3.4).;chromosome=10;comment1=Bio::Annotation::Comment%3DHASH(0x94e94c);date=22-JUN-2006;mol_type=genomic DNA;organism=Rattus norvegicus;strain=BN/SsNHsdMCW
Chr10   GenBank STS     83194370        83194502        .       +       .       ID=GenBank:STS:NW_047337:1264:1396;Dbxref=UniSTS:250658;standard_name=AI010027
Chr10   GenBank gene    83194326        83238284        .       -       .       ID=LOC619561;Dbxref=GeneID:619561,RGD:1562656;Note=Derived by automated computational analysis using gene prediction method: BestRefseq. Supporting evidence includes similarity to: 1 mRNA;gene=LOC619561
Chr10   GenBank mRNA    83194326        83238284        .       -       .       ID=LOC619561.t01;Parent=LOC619561;Dbxref=GI:77993367,GeneID:619561,RGD:1562656;Note=Derived by automated computational analysis using gene prediction method: BestRefseq. Supporting evidence includes similarity to: 1 mRNA;exception=unclassified transcription discrepancy;gene=LOC619561;product=hypothetical protein LOC619561;transcript_id=NM_001034951.1
Chr10   GenBank CDS     83195432        83195482        .       -       .       ID=LOC619561.p01;Parent=LOC619561.t01;Dbxref=GI:77993368,GeneID:619561,RGD:1562656;codon_start=1;gene=LOC619561;product=hypothetical protein LOC619561;protein_id=NP_001030123.1
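
The earlier output quoted further down in this thread carried 'iD='
tags where 'ID=' was expected, and those were corrected by hand. A
minimal sketch of that substitution is given here; fix_id_tags is a
made-up helper name, and the sketch assumes tab-separated GFF3 with the
attributes in column 9. It is not what bp_genbank2gff3 does internally.

```shell
#!/bin/sh
# Sketch only: fix_id_tags is a hypothetical helper, not a BioPerl script.
# It reads GFF3 on stdin, rewrites a miscapitalized 'iD=' tag in the
# attribute column (column 9) to 'ID=', and writes the result to stdout.
fix_id_tags() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/ { print; next }          # pass comment and directive lines through
         NF >= 9 {
             sub(/^iD=/, "ID=", $9)    # tag at the start of the attribute column
             gsub(/;iD=/, ";ID=", $9)  # tag following another attribute
         }
         { print }'
}

# Example: one feature line taken from the older output quoted in this thread.
printf 'NW_047331\tGenBank\tgap\t2014\t3020\t.\t+\t.\tiD=GenBank:gap:NW_047331:2014:3020;estimated_length=1007\n' |
    fix_id_tags
```

Lines that already use 'ID=' pass through unchanged, so the function
can be run over a whole file, for example
fix_id_tags < old.gff3 > fixed.gff3 (both file names are placeholders).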


-----Original Message-----
From: Scott Cain [mailto:cain.cshl at gmail.com] 
Sent: Monday, September 22, 2008 12:14 PM
To: Bakir, Burcu
Subject: Re: [Gmod-gbrowse] multi-segmented feature looks fine at low
power, but connecting lines between segments disappear when zoomed in

Hi Burcu,

Certainly you can install bioperl-live in a local folder; I'm
reasonably sure that the BioPerl website has directions for doing that.
The 1.69 release of GBrowse requires bioperl-live though, so if you are
planning on installing the new GBrowse, you'll want to address that as
well (it is possible to have a BioPerl in place just for GBrowse to use,
but it requires a little forethought and planning).
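
One common way to make a local checkout come first in @INC is the
PERL5LIB environment variable, whose directories perl searches before
its default library paths. The lines below are a minimal sketch, not
the directions from the BioPerl website: prepend_perl5lib is a made-up
helper name, and the path is the one that appears in the stack trace at
the top of this message.

```shell
#!/bin/sh
# Sketch only: prepend_perl5lib is a hypothetical helper.
# It puts the directory given as $1 at the front of PERL5LIB, keeping
# whatever was already there, so perl looks in that directory first.
prepend_perl5lib() {
    PERL5LIB="$1${PERL5LIB:+:$PERL5LIB}"
    export PERL5LIB
}

# Path taken from the stack trace earlier in this thread.
prepend_perl5lib /rgd_home/bioperl-live/bioperl-live

echo "PERL5LIB is now: $PERL5LIB"
```

Because the variable is set only in the current shell (or in a wrapper
script), other programs that rely on the previously installed BioPerl
are not affected.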

Scott


On Mon, Sep 22, 2008 at 1:11 PM, Bakir, Burcu <BBakir at hmgc.mcw.edu> wrote:
> Hi Scott,
>
> Or how about installing bioperl-live to a local folder (not as root)
> and making sure it comes first in the @INC variable? That way I can
> keep the other previously installed BioPerls. I don't know what other
> programs are using them, and I don't want to get rid of them
> suddenly.
>
> Thanks,
>
> Burcu
>
> -----Original Message-----
> From: Scott Cain [mailto:cain.cshl at gmail.com]
> Sent: Monday, September 22, 2008 11:19 AM
> To: Bakir, Burcu
> Cc: Don Gilbert; help at gmod.org; gmod-gbrowse at lists.sourceforge.net
> Subject: Re: [Gmod-gbrowse] multi-segmented feature looks fine at low
> power, but connecting lines between segments disappear when zoomed in
>
> Hi Burcu,
>
> What version of BioPerl are you using?  If you think you are using
> bioperl-live, is it possible that there is more than one BioPerl
> installed?
>
> Scott
>
> On Mon, Sep 22, 2008 at 12:08 PM, Bakir, Burcu <BBakir at hmgc.mcw.edu>
> wrote:
>> Hi Scott,
>>
>> Thanks for your explanations. I think something is still wrong with
>> my gff3 file. I ran the GenBank record for the rn_ref_chr1.gbk file
>> through the BioPerl bp_genbank2gff3.pl script as follows:
>>
>> bp_genbank2gff3.pl -y rn_ref_chr1.gbk
>>
>> where the --split (-y) option is documented to split the output into
>> separate GFF and fasta files for each GenBank record.
>>
>> My gff3 file differs from what you pasted here in the following ways:
>> your first uncommented line is for chromosome, while mine is region
>> and the next is contig. Your features have "ID", whereas mine have
>> "iD". Your CDS also has "ID", whereas mine has no ID or iD but has
>> Parent. I don't know why those are different from your gff3 output.
>> Did you run the script just for NW_047331? Is the difference that I'm
>> running the script on the whole chromosome (rn_ref_chr1.gbk)?
>>
>> Using rn_ref_chr1.gbk should be fine. The bp_genbank2gff3.pl
>> documentation states the following:
>> The input files are assumed to be gzipped GenBank flatfiles for refseq
>> contigs.  The files may contain multiple GenBank records.  Either a
>> single file or an entire directory can be processed.  By default, the
>> DNA sequence is embedded in the GFF but it can be saved into separate
>> fasta file with the --split(-y) option.
>>
>> Thanks,
>>
>> Burcu
>>
>> I'm pasting here first few lines of my NW_047331.gff3 file.
>>
>> ##gff-version 3
>> ##sequence-region NW_047331 1 1994762
>> ##source bp_genbank2gff3.pl
>> NW_047331       GenBank region  1       1994762 .       .       .       ID=NW_047331
>> NW_047331       GenBank contig  1       1994762 .       +       .       iD=GenBank:contig:NW_047331:1:1994762;mol_type=genomic DNA;db_xref=taxon:10116;strain=BN/SsNHsdMCW;chromosome=10;organism=Rattus norvegicus
>> NW_047331       GenBank gap     2014    3020    .       +       .       iD=GenBank:gap:NW_047331:2014:3020;estimated_length=1007
>> NW_047331       GenBank gene    49      13315   .       +       .       iD=Bfar;db_xref=GeneID:304709,RGD:1304791;gene=Bfar;note=Derived by automated computational analysis using gene prediction method: BestRefseq. Supporting evidence includes similarity to: 1 mRNA
>> NW_047331       GenBank mRNA    49      13315   .       +       .       iD=Bfar.t01;Parent=Bfar;gene=Bfar;note=Derived by automated computational analysis using gene prediction method: BestRefseq. Supporting evidence includes similarity to: 1 mRNA;db_xref=GI:61557020,GeneID:304709,RGD:1304791;exception=unclassified transcription discrepancy;product=bifunctional apoptosis regulator;transcript_id=NM_001013125.1
>> NW_047331       GenBank CDS     49      216     .       +       .       Parent=Bfar.t01;go_function=zinc ion binding [goid 0008270] [evidence IEA],structural molecule activity [goid 0005198] [evidence IEA],ubiquitin-protein ligase activity [goid 0004842] [evidence IEA];protein_id=NP_001013143.1;gene=Bfar;go_process=anti-apoptosis [goid 0006916] [evidence IEA],protein ubiquitination [goid 0016567] [evidence IEA];db_xref=GI:61557021,GeneID:304709,RGD:1304791;go_component=membrane fraction [goid 0005624] [evidence IEA],ubiquitin ligase complex [goid 0000151] [evidence IEA],integral to plasma membrane [goid 0005887] [evidence IEA];codon_start=1;exception=unclassified translation discrepancy;product=bifunctional apoptosis regulator (predicted)
>> NW_047331       GenBank exon    49      216     .       +       .       Parent=Bfar.t01;gene=Bfar
>>
>>
>> -----Original Message-----
>> From: Scott Cain [mailto:cain.cshl at gmail.com]
>> Sent: Thursday, September 18, 2008 4:54 PM
>> To: Don Gilbert
>> Cc: Bakir, Burcu; help at gmod.org; gmod-gbrowse at lists.sourceforge.net
>> Subject: Re: [Gmod-gbrowse] multi-segmented feature looks fine at low
>> power, but connecting lines between segments disappear when zoomed in
>>
>> Hi Burcu,
>>
>> I just ran the GenBank record for NW_047331 through the BioPerl
>> bp_genbank2gff3.pl script and got perfectly acceptable GFF3 (I'll
>> paste a few lines below); did you get your GFF3 from running a
>> BioPerl script or somewhere else?
>>
>> Thanks,
>> Scott
>>
>> Here's what the first several lines of the GFF3 looked like:
>> ##gff-version 3
>> # sequence-region NW_047331 1 1994762
>> # conversion-by bp_genbank2gff3.pl
>> # organism Rattus norvegicus
>> # date 22-JUN-2006
>> # Note Rattus norvegicus chromosome 10 genomic contig, reference assembly (based on RGSC v3.4).
>> NW_047331       GenBank chromosome      1       1994762 .       +       .       ID=NW_047331;Alias=10;Dbxref=taxon:10116;Note=Rattus norvegicus chromosome 10 genomic contig%2C reference assembly (based on RGSC v3.4).;chromosome=10;comment1=Bio::Annotation::Comment%3DHASH(0x8a564e4);date=22-JUN-2006;mol_type=genomic DNA;organism=Rattus norvegicus;strain=BN/SsNHsdMCW
>> NW_047331       GenBank gap     2014    3020    .       +       .       ID=GenBank:gap:NW_047331:2014:3020;estimated_length=1007
>> NW_047331       GenBank gap     5534    5583    .       +       .       ID=GenBank:gap:NW_047331:5534:5583;estimated_length=50
>> NW_047331       GenBank gene    49      13315   .       +       .       ID=Bfar;Dbxref=GeneID:304709,RGD:1304791;Note=Derived by automated computational analysis using gene prediction method: BestRefseq. Supporting evidence includes similarity to: 1 mRNA;gene=Bfar
>> NW_047331       GenBank mRNA    49      13315   .       +       .       ID=Bfar.t01;Parent=Bfar;Dbxref=GI:61557020,GeneID:304709,RGD:1304791;Note=Derived by automated computational analysis using gene prediction method: BestRefseq. Supporting evidence includes similarity to: 1 mRNA;exception=unclassified transcription discrepancy;gene=Bfar;product=bifunctional apoptosis regulator;transcript_id=NM_001013125.1
>> NW_047331       GenBank CDS     49      216     .       +       .       ID=Bfar.p01;Parent=Bfar.t01;Dbxref=GI:61557021,GeneID:304709,RGD:1304791;gO_component=integral to plasma membrane%3B membrane fraction%3B ubiquitin ligase complex;gO_function=structural molecule activity%3B ubiquitin-protein ligase activity%3B zinc ion binding;gO_process=anti-apoptosis%3B protein ubiquitination;codon_start=1;exception=unclassified translation discrepancy;gene=Bfar;product=bifunctional apoptosis regulator (predicted);protein_id=NP_001013143.1
>>
>>
>> On Thu, Sep 18, 2008 at 5:37 PM, Scott Cain <cain.cshl at gmail.com> wrote:
>>> Hi Burcu,
>>>
>>> I finally got around to looking at the sample data and config file
>>> you sent me a few days ago.  There were a few problems I had to fix:
>>>
>>> 1. GFF3 and the Bio::DB::GFF adaptor don't always get along, and I
>>> think your GFF3 is one example of that happening.  I switched to the
>>> Bio::DB::SeqFeature::Store adaptor.
>>>
>>> 2. With the switch to SeqFeature::Store, you don't need aggregators
>>> any more, so I switched the [EntrezGene] track to use gene:GenBank
>>> features and the gene glyph.
>>>
>>> 3. I changed all occurrences of 'iD' to 'ID' (I'm off to BioPerl
>>> next to see what caused this so I can make it stop).
>>>
>>> 4. I added a reference sequence line; you had a line like this:
>>>
>>> Chr10  GenBank region  1   1994762 .    .    .   ID=NW_047331
>>>
>>> I changed it to this:
>>>
>>> Chr10  GenBank region  1   1994762 .    .    .   ID=Chr10;Name=Chr10
>>>
>>> (of course, I suspect that rat chromosome 10 is bigger than that;
>>> the alternative would be to change column 1 to NW_047331 throughout
>>> the file.)
>>>
>>> Switching to SeqFeature::Store will require a few changes, but not
>>> too many; basically, the only things that would be affected are the
>>> tracks that currently use aggregators.  Please let me know if you
>>> need any help with the transition.
>>>
>>> Scott
>>>
>>>
>>> On Mon, Sep 15, 2008 at 4:29 PM, Don Gilbert
>>> <gilbertd at cricket.bio.indiana.edu> wrote:
>>>>
>>>>
>>>> Burcu,
>>>>
>>>> It may be you have to work thru a few changes.  The 'iD' problem
>>>> likely was part of it, your aggregator also needs to be updated
>>>> with corrections for ID/Parent tags.
>>>>
>>>>>> EntrezGene{CDS,exon/mRNA}
>>>>
>>>> This one should work when CDS,exon have Parent=mRNA.ID and mRNA has
>>>> ID=
>>>> This is equivalent to the processed_transcript aggregator
>>>> Bio/DB/GFF/Aggregator/processed_transcript.pm
>>>>
>>>> Aggregators are good when using Bio/DB/GFF databases; the
>>>> Bio/DB/SeqFeature/Store databases do not use aggregators.
>>>>
>>>> PS, one of the tools, likely bp_genbank2gff3, created those funky
>>>> 'iD' tags, for reasons of its own.
>>>>
>>>> - Don Gilbert
>>>>
>>>
>>>
>>>
>>> --
>>> ------------------------------------------------------------------------
>>> Scott Cain, Ph. D. cain.cshl at gmail.com
>>> GMOD Coordinator (http://gmod.org/) 216-392-3087
>>> Cold Spring Harbor Laboratory
>>>



