[Gmod-help] gmod_bulk_load_gff3.pl

todd.moughamer at syngenta.com todd.moughamer at syngenta.com
Wed Mar 5 10:16:46 EST 2008


Hi Scott,

We are using version 1.68 of GBrowse.

No errors are produced in the apache logs when I try to load the
page configured for Chado.

Todd

-----Original Message-----
From: Scott Cain [mailto:cain.cshl at gmail.com] 
Sent: Wednesday, March 05, 2008 9:25 AM
To: Moughamer Todd USRE
Cc: help at gmod.org
Subject: RE: [Gmod-help] gmod_bulk_load_gff3.pl

Hi Todd,

By default, the Chado adaptor for GBrowse assumes that there is only one
organism.  If there is more than one organism in the database, there is
a db_args option to specify which organism to use for GBrowse.

What version of GBrowse are you using?  Are there any messages in your
apache error_log file?

Scott

On Tue, 2008-03-04 at 18:27 -0500, todd.moughamer at syngenta.com wrote:
> Scott,
> 
> Thanks I was able to load successfully!!
> 
> I have one remaining issue. I am trying to connect GBrowse and Chado. 
> I set up my configuration and there are no errors that I can see. 
> However, I can't seem to get anything to render in GBrowse. Nothing 
> comes up on the search. The sample GFF uses the 'chromosome' type and 
> that is specified as the reference in the conf. One thing that I 
> noticed is that there is no place in the conf file to specify which 
> organism you are working with. So I'm wondering if I am missing 
> something.
> 
> Todd
> 
> 
> [GENERAL]
> description =  test implementation of chado5
> db_adaptor    = Bio::DB::Das::Chado
> database      = dbi:PgPP:dbname=chadotest;host=localhost;port=5432
> user          = mccall 
> pass          = <pwd> 
> 
> -----Original Message-----
> From: Scott Cain [mailto:cain.cshl at gmail.com]
> Sent: Tuesday, March 04, 2008 10:28 AM
> To: Moughamer Todd USRE
> Cc: help at gmod.org
> Subject: RE: [Gmod-help] gmod_bulk_load_gff3.pl
> 
> Hi Todd,
> 
> The feature property vocabulary is separate.  It is a collection of 
> common annotation terms, like Note, non_canonical_start_codon, problem
> and status.  It can be loaded by running 'make ontologies' and 
> selecting option 4.  After loading it, you should be good to go--at 
> the worst, you'll need to add '--recreate_cache' to the GFF load 
> command line to flush out incomplete information in a loader helping
> table.
> 
> Scott
> 
> On Tue, 2008-03-04 at 08:44 -0500, todd.moughamer at syngenta.com wrote:
> > Hi Scott,
> > 
> > Thanks. Is the feature property vocabulary part of the sequence 
> > ontology or is it separate? As I said we installed only the SO and 
> > Relationship Ontology. If not would you recommend clearing out the 
> > database before re-running make ontologies?
> > 
> > Best,
> > 
> > Todd
> > 
> > -----Original Message-----
> > From: Scott Cain [mailto:cain.cshl at gmail.com]
> > Sent: Friday, February 29, 2008 11:24 PM
> > To: Moughamer Todd USRE
> > Cc: help at gmod.org
> > Subject: RE: [Gmod-help] gmod_bulk_load_gff3.pl
> > 
> > Hi Todd,
> > 
> > I just reproduced the problem you are seeing by not loading the 
> > feature property controlled vocabulary.  For most cvterms, the 
> > loader checks to make sure it is present before writing back to the
> > database.
> > The GFF annotation 'Note' was being treated as a special case and I 
> > forgot to add the check that it existed.  I've updated the loader to
> > give a useful message and stop when it finds that Note doesn't
> > exist.
> > 
> > Sorry for the hassle.
> > Scott
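
A minimal sketch of the kind of pre-load check described above; it is not the loader's actual code. It uses Python's sqlite3 with two stand-in tables shaped like Chado's cv and cvterm tables (only the columns needed here). The function names, the demo data, and the cv name 'feature_property' are assumptions made for illustration. A real Chado instance runs on PostgreSQL, where the query placeholder syntax depends on the driver.

```python
import sqlite3


def find_cvterm_id(conn, cv_name, term_name):
    """Return the cvterm_id for term_name within cv_name, or None if absent."""
    row = conn.execute(
        "SELECT cvterm.cvterm_id FROM cvterm "
        "JOIN cv ON cv.cv_id = cvterm.cv_id "
        "WHERE cv.name = ? AND cvterm.name = ?",
        (cv_name, term_name),
    ).fetchone()
    return row[0] if row else None


def require_cvterm(conn, cv_name, term_name):
    """Stop with a readable message instead of writing an empty type_id."""
    cvterm_id = find_cvterm_id(conn, cv_name, term_name)
    if cvterm_id is None:
        raise LookupError(
            f"cvterm '{term_name}' not found in cv '{cv_name}'; "
            "load that vocabulary before loading GFF3"
        )
    return cvterm_id


def make_demo_db():
    """Build an in-memory stand-in holding only a 'sequence' vocabulary."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cv (cv_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE cvterm (
            cvterm_id INTEGER PRIMARY KEY,
            cv_id INTEGER REFERENCES cv (cv_id),
            name TEXT
        );
        INSERT INTO cv (cv_id, name) VALUES (1, 'sequence');
        INSERT INTO cvterm (cvterm_id, cv_id, name) VALUES (10, 1, 'gene');
        """
    )
    return conn


if __name__ == "__main__":
    demo = make_demo_db()
    try:
        # 'feature_property' is an assumed cv name for the demo.
        require_cvterm(demo, "feature_property", "Note")
    except LookupError as err:
        print(err)
```

Without such a check, a failed lookup can end up as an empty string in the type_id column of the bulk-load data, which is consistent with the 'invalid input syntax for integer: ""' error quoted further down the thread.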
> > 
> > 
> > On Fri, 2008-02-29 at 11:50 -0500, todd.moughamer at syngenta.com wrote:
> > > Hi Scott,
> > > 
> > > I talked to the unix admin who did that part of the install. He 
> > > ran the makes and only had a problem with make ontologies...I 
> > > believe it was a missing library, and eventually he got it to run 
> > > with no reported problems.
> > > He installed the relationship and sequence ontologies.
> > > 
> > > Todd
> > > 
> > > -----Original Message-----
> > > From: Scott Cain [mailto:cain.cshl at gmail.com]
> > > Sent: Friday, February 29, 2008 10:10 AM
> > > To: Moughamer Todd USRE
> > > Cc: help at gmod.org
> > > Subject: RE: [Gmod-help] gmod_bulk_load_gff3.pl
> > > 
> > > Hi Todd,
> > > 
> > > When you created the database, did you use the 'make' based 
> > > procedure that is outlined in the install document?  That is, did
> > > you do this:
> > > 
> > >   make load_schema
> > >   make prepdb
> > >   make ontologies
> > > 
> > > and when you loaded ontologies, did you load all of relation, 
> > > sequence, gene and feature property?  My best guess for what is 
> > > going wrong is that the feature property controlled vocabulary is 
> > > missing or that something from the prepdb inserts is missing 
> > > (though I doubt the latter is the problem--I don't think you would 
> > > have made it this far if that were the case).  I'll try loading 
> > > the SGD GFF file now to see if I run into any problems.
> > > 
> > > Scott
> > > 
> > > On Fri, 2008-02-29 at 10:00 -0500, todd.moughamer at syngenta.com wrote:
> > > > Scott,
> > > > 
> > > > No resolution yet. Just in case, I ran the gff file through 
> > > > dos2unix and that didn't help. I saw in a posting about a 
> > > > similar error that it might have something to do with 
> > > > auto-incrementing of IDs. My next step would be to wipe out the 
> > > > database and reload the ontology dump...unless you have other 
> > > > suggestions.
> > > > 
> > > > Thanks,
> > > > 
> > > > Todd
> > > > 
> > > > -----Original Message-----
> > > > From: Scott Cain [mailto:cain.cshl at gmail.com]
> > > > Sent: Friday, February 29, 2008 9:09 AM
> > > > To: Moughamer Todd USRE
> > > > Cc: help at gmod.org
> > > > Subject: RE: [Gmod-help] gmod_bulk_load_gff3.pl
> > > > 
> > > > Hi Todd,
> > > > 
> > > > I'm trying to catch up on my email after being gone for a few 
> > > > days--did you get this resolved?
> > > > 
> > > > Scott
> > > > 
> > > > On Fri, 2008-02-22 at 13:01 -0500, todd.moughamer at syngenta.com wrote:
> > > > > Scott,
> > > > > 
> > > > > The BioPerl live update fixed the Can't locate object method 
> > > > > "database" problem (Thanks!). The loading progresses much 
> > > > > further but now errors out with the message below. Here I am 
> > > > > using the first 100 lines of the sample yeast GFF file, which 
> > > > > did not produce errors in the GFF3 validator:
> > > > > 
> > > > > Preparing data for inserting into the chadotest database (This may take a while ...)
> > > > > Loading data into feature table ...
> > > > > Loading data into featureloc table ...
> > > > > Skipping feature_relationship table since the load file is empty...
> > > > > Loading data into featureprop table ...
> > > > > DBD::Pg::db pg_endcopy failed: ERROR:  invalid input syntax for integer: ""
> > > > > CONTEXT:  COPY featureprop, line 1, column type_id: ""
> > > > > 
> > > > > ------------- EXCEPTION: Bio::Root::Exception -------------
> > > > > MSG: calling endcopy for featureprop failed:
> > > > > STACK: Error::throw
> > > > > STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357
> > > > > STACK: Bio::GMOD::DB::Adapter::copy_from_stdin /usr/lib/perl5/site_perl/5.8.8/Bio/GMOD/DB/Adapter.pm:2723
> > > > > STACK: Bio::GMOD::DB::Adapter::load_data /usr/lib/perl5/site_perl/5.8.8/Bio/GMOD/DB/Adapter.pm:2644
> > > > > STACK: /usr/bin/gmod_bulk_load_gff3.pl:912
> > > > > -----------------------------------------------------------
> > > > > Issuing rollback() for database handle being DESTROY'd without explicit disconnect().
> > > > > 
> > > > > Thanks,
> > > > > 
> > > > > Todd
> > > > > 
> > > > > -----Original Message-----
> > > > > From: Scott Cain [mailto:cain.cshl at gmail.com]
> > > > > Sent: Wednesday, February 20, 2008 11:45 AM
> > > > > To: Moughamer Todd USRE
> > > > > Cc: hlapp at duke.edu; help at gmod.org
> > > > > Subject: RE: [Gmod-help] gmod_bulk_load_gff3.pl
> > > > > 
> > > > > Hi Todd,
> > > > > 
> > > > > The ##gff-version error it reported won't be a problem; the 
> > > > > loader is quite forgiving about that.  The invalid type 
> > > > > problems could potentially be a problem though.  Here's the 
> > > > > thing: Chado uses SO (so yes, you should be using so.obo) for 
> > > > > feature types, while the current GFF3 spec requires SOFA (I've 
> > > > > been advocating for changing the spec and it probably will 
> > > > > change in the near future).
> > > > > 
> > > > > So, if the validator is complaining about those terms because 
> > > > > they aren't in SOFA, but they are in SO, that's no problem.  
> > > > > But if they aren't in SO either (if, for instance, they've been 
> > > > > obsoleted), then you'll have to fix the file.  When I am 
> > > > > confronted with something like that, I just do a global search 
> > > > > and replace on the term to swap in the nearest term in SO.
> > > > > 
> > > > > Scott
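
A minimal sketch of the search-and-replace step described above, restricted to the type column (column 3) of a GFF3 file so that the same string elsewhere on a line is left alone. The function name, the set of known types, and the replacement mapping are all hypothetical; in practice the known set would come from so.obo and the mapping would be chosen by hand.

```python
def remap_gff3_types(gff3_text, known_types, replacements):
    """Swap unknown feature types for chosen replacements.

    Returns the rewritten text and a sorted list of types that were
    neither known nor covered by the replacement mapping.
    """
    out_lines = []
    unresolved = set()
    for line in gff3_text.splitlines():
        # Directives, comments and blank lines pass through untouched.
        if not line or line.startswith("#"):
            out_lines.append(line)
            continue
        cols = line.split("\t")
        # Feature lines have exactly nine tab-separated columns; anything
        # else (for example FASTA sequence lines) is left as it is.
        if len(cols) != 9:
            out_lines.append(line)
            continue
        feature_type = cols[2]
        if feature_type not in known_types:
            if feature_type in replacements:
                cols[2] = replacements[feature_type]
            else:
                unresolved.add(feature_type)
        out_lines.append("\t".join(cols))
    return "\n".join(out_lines) + "\n", sorted(unresolved)


if __name__ == "__main__":
    # Placeholder data: 'old_term' stands in for a type missing from SO.
    sample = (
        "##gff-version 3\n"
        "chrI\tSGD\told_term\t100\t200\t.\t+\t.\tID=f1\n"
        "chrI\tSGD\tgene\t300\t400\t.\t+\t.\tID=f2\n"
    )
    fixed, unresolved = remap_gff3_types(sample, {"gene"}, {"old_term": "gene"})
    print(fixed, end="")
    print("still unresolved:", unresolved)
```

Types that are reported as unresolved still need a decision by hand before the file is loaded.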
> > > > > 
> > > > > On Wed, 2008-02-20 at 11:34 -0500, todd.moughamer at syngenta.com wrote:
> > > > > > Hi Scott,
> > > > > > 
> > > > > > We downloaded Chado from CVS. We are in the process of 
> > > > > > installing the live BioPerl.
> > > > > > 
> > > > > > I am using the example yeast GFF3 files from the web site 
> > > > > > (http://www.gmod.org/wiki/index.php/Load_GFF_Into_Chado). I 
> > > > > > ran them through the validator and sure enough they came 
> > > > > > back invalid (both the original and sorted forms). Here are 
> > > > > > some of the errors:
> > > > > > 
> > > > > > Line Number  Error/Warning
> > > > > > -----------  -------------
> > > > > > 1            [ERROR]   first line must be ##gff-version 3 (line: SGD)
> > > > > > 350          [ERROR]   invalid type (type: gene_cassette)
> > > > > > 352          [ERROR]   invalid type (type: gene_cassette)
> > > > > > 369          [ERROR]   invalid type (type: gene_cassette)
> > > > > > 1065         [ERROR]   invalid type (type: long_terminal_repeat)
> > > > > > 1066         [ERROR]   invalid type (type: long_terminal_repeat)
> > > > > > 1072         [ERROR]   invalid type (type: transposable_element_gene)
> > > > > > ...
> > > > > > 
> > > > > > I'm also wondering if I should be using the sofa.obo file 
> > > > > > rather than the so.obo file I downloaded from 
> > > > > > sequenceontology.org?
> > > > > > 
> > > > > > Thanks,
> > > > > > 
> > > > > > Todd
> > > > > > 
> > > > > > -----Original Message-----
> > > > > > From: Scott Cain [mailto:cain.cshl at gmail.com]
> > > > > > Sent: Tuesday, February 19, 2008 3:49 PM
> > > > > > To: Hilmar Lapp
> > > > > > Cc: Moughamer Todd USRE; help at gmod.org
> > > > > > Subject: Re: [Gmod-help] gmod_bulk_load_gff3.pl
> > > > > > 
> > > > > > OK, after running some tests, I still think an out of date 
> > > > > > BioPerl is probably at fault.  With a current checkout of 
> > > > > > both the schema and bioperl repositories, I can load GFF3 
> > > > > > data with Dbxref tags successfully.
> > > > > > 
> > > > > > Todd, a few questions for you:
> > > > > > 
> > > > > > * are you using a cvs checkout of chado as well?  I should 
> > > > > > have asked that before telling you to update bioperl.
> > > > > > 
> > > > > > * If you are using current checkouts of chado and bioperl 
> > > > > > and still have the problem, could you please run your GFF3 
> > > > > > through the GFF3 validator to see if it turns up any 
> > > > > > problems:
> > > > > > 
> > > > > >   http://dev.wormbase.org/db/validate_gff3/validate_gff3_online
> > > > > > 
> > > > > > * If you still get the same error message, could you please 
> > > > > > send me a sample of the offending GFF3?  There may be a case 
> > > > > > that I didn't think of.
> > > > > > 
> > > > > > Thanks,
> > > > > > Scott
> > > > > > 
> > > > > > On Tue, 2008-02-19 at 14:13 -0500, Scott Cain wrote:
> > > > > > > Hi Hilmar,
> > > > > > > 
> > > > > > > Hah!  I was getting ready to write a response where I 
> > > > > > > basically said I didn't think that was what was going on, 
> > > > > > > and I would have justified it with some hand waving, since 
> > > > > > > I didn't have the actual error message, so I was just 
> > > > > > > guessing and hopefully, updating bioperl will fix the 
> > > > > > > problem (and that still is a possibility).
> > > > > > > 
> > > > > > > However, I do have the actual error message, so I went and 
> > > > > > > looked at the offending line, and it is in a method called 
> > > > > > > 'handle_dbxref', so it looks like your diagnosis is spot 
> > > > > > > on.  Now I need to figure out if this is still happening 
> > > > > > > with bioperl-live and figure out why.  I've got a
> > > > > > > Bio::SeqFeature::Annotated->annotation->get_Annotations('Dbxref'),
> > > > > > > which I think should return a list of DBLink features.  I 
> > > > > > > guess I'll go see.
> > > > > > > 
> > > > > > > Thanks for pointing that out!
> > > > > > > Scott
> > > > > > > 
> > > > > > > On Tue, 2008-02-19 at 13:55 -0500, Hilmar Lapp wrote:
> > > > > > > > Well, B::A::SimpleValue never had a method called 
> > > > > > > > database().  It is B::A::DBLink that has that (and 
> > > > > > > > always had).
> > > > > > > > 
> > > > > > > > So my first diagnosis from afar would be that something 
> > > > > > > > is returning or creating a B::A::SimpleValue when it was 
> > > > > > > > expected to return or create a B::A::DBLink.
> > > > > > > > 
> > > > > > > > 	-hilmar
> > > > > > > > 
> > > > > > > > On Feb 19, 2008, at 1:07 PM, Scott Cain wrote:
> > > > > > > > 
> > > > > > > > > Hi Todd,
> > > > > > > > >
> > > > > > > > > Yes, that is still the most likely solution for you.  
> > > > > > > > > A few months ago, the BioPerl API changed and 
> > > > > > > > > Bio::Annotation::SimpleValue objects don't work the 
> > > > > > > > > same way that they used to, thus the error you are 
> > > > > > > > > seeing.  It's not really looking for a method named 
> > > > > > > > > 'database'; that is an artifact left over from the API 
> > > > > > > > > change.
> > > > > > > > >
> > > > > > > > > Scott
> > > > > > > > >
> > > > > > > > > On Tue, 2008-02-19 at 11:59 -0500, todd.moughamer at syngenta.com wrote:
> > > > > > > > >> I run into the following error when trying to load Chado:
> > > > > > > > >>
> > > > > > > > >> gmod_bulk_load_gff3.pl --organism yeast --gfffile ~/tmp/saccharomyces_cerevisiae.gff.sorted --dbname chadotest
> > > > > > > > >> Preparing data for inserting into the chadotest database (This may take a while ...)
> > > > > > > > >> Can't locate object method "database" via package "Bio::Annotation::SimpleValue" at /usr/lib/perl5/site_perl/5.8.8/Bio/GMOD/DB/Adapter.pm line 3061, <GEN0> line 1.
> > > > > > > > >> Issuing rollback() for database handle being DESTROY'd without explicit disconnect().
> > > > > > > > >>
> > > > > > > > >> I found reference to this problem online 
> > > > > > > > >> (http://www.nabble.com/question-about-gmod_bulk_load_gff3.pl-td15135949.html)
> > > > > > > > >> with the recommendation of downloading the 'live' 
> > > > > > > > >> version of BioPerl. However, upon browsing the latest 
> > > > > > > > >> code in SVN I do not see inclusion of a "database" 
> > > > > > > > >> method. Is this still the recommended solution to the 
> > > > > > > > >> problem?
> > > > > > > > >>
> > > > > > > > >> Thanks,
> > > > > > > > >>
> > > > > > > > >> Todd
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >> Todd Moughamer
> > > > > > > > >>
> > > > > > > > >> Bioinformatics Consultancy & Training Group
> > > > > > > > >>
> > > > > > > > >> Syngenta Biotechnology, Inc.
> > > > > > > > >>
> > > > > > > > >> 3054 Cornwallis Road, 1243.E
> > > > > > > > >>
> > > > > > > > >> Research Triangle Park, NC 27709-2257
> > > > > > > > >>
> > > > > > > > >> Tel: 919-597-3078
> > > > > > > > >>
> > > > > > > > >> Email: todd.moughamer at syngenta.com www.syngenta.com
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > > --
> > > > > > > > > ------------------------------------------------------------------------
> > > > > > > > > Scott Cain, Ph. D.                                           cain at cshl.edu
> > > > > > > > > GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> > > > > > > > > Cold Spring Harbor Laboratory
> > > > > > > > >
> > > > > > > > 
--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory
