[Gmod-help] gmod_bulk_load_gff3.pl
todd.moughamer at syngenta.com
Thu Mar 6 13:01:07 EST 2008
Scott,
It turned out to be a conf file that someone must have copied from an
earlier test installation. I re-copied and modified 07.chado.conf,
and we are up and running. Thanks for all the help.
Best,
Todd
-----Original Message-----
From: Scott Cain [mailto:cain.cshl at gmail.com]
Sent: Wednesday, March 05, 2008 10:35 AM
To: Moughamer Todd USRE
Cc: help at gmod.org
Subject: RE: [Gmod-help] gmod_bulk_load_gff3.pl
Hi Todd,
You are working with yeast data right now, right? If so, the conf file
in contrib/conf_files/07.chado.conf should work. Would you mind trying
the development version of GBrowse? You can get it using the gbrowse
net installer described here:
http://www.gmod.org/wiki/index.php/Gbrowse#Net-based_Installer_Script
and supplying the '-d' option to get development versions of bioperl and
gbrowse.
Scott
On Wed, 2008-03-05 at 10:16 -0500, todd.moughamer at syngenta.com wrote:
> Hi Scott,
>
> We are using version 1.68 of Gbrowse.
>
> No errors are produced in the apache logs when I try to load the
> page configured for chado.
>
> Todd
>
> -----Original Message-----
> From: Scott Cain [mailto:cain.cshl at gmail.com]
> Sent: Wednesday, March 05, 2008 9:25 AM
> To: Moughamer Todd USRE
> Cc: help at gmod.org
> Subject: RE: [Gmod-help] gmod_bulk_load_gff3.pl
>
> Hi Todd,
>
> By default, the Chado adaptor for GBrowse assumes that there is only
> one organism. If there is more than one organism in the database,
> there is a db_args option to specify which organism to use for
> GBrowse.
>
> What version of GBrowse are you using? Are there any messages in your
> apache error_log file?
>
> Scott
>
> On Tue, 2008-03-04 at 18:27 -0500, todd.moughamer at syngenta.com wrote:
> > Scott,
> >
> > Thanks I was able to load successfully!!
> >
> > I have one remaining issue. I am trying to connect Gbrowse and Chado.
> > I set up my configuration and there are no errors that I can see.
> > However, I can't seem to get anything to render in Gbrowse. Nothing
> > comes up on the search. The sample GFF uses the 'chromosome' type
> > and that is specified as the reference in the conf. One thing that I
> > noticed is that there is no place in the conf file to specify which
> > organism you are working with. So I'm wondering if I am missing
> > something.
> >
> > Todd
> >
> >
> > [GENERAL]
> > description = test implementation of chado5
> > db_adaptor = Bio::DB::Das::Chado
> > database = dbi:PgPP:dbname=chadotest;host=localhost;port=5432
> > user = mccall
> > pass = <pwd>
> >
> > -----Original Message-----
> > From: Scott Cain [mailto:cain.cshl at gmail.com]
> > Sent: Tuesday, March 04, 2008 10:28 AM
> > To: Moughamer Todd USRE
> > Cc: help at gmod.org
> > Subject: RE: [Gmod-help] gmod_bulk_load_gff3.pl
> >
> > Hi Todd,
> >
> > The feature property vocabulary is separate. It is a collection of
> > common annotation terms, like Note, non_canonical_start_codon,
> > problem and status. It can be loaded by running 'make ontologies' and
> > selecting option 4. After loading it, you should be good to go--at
> > the worst, you'll need to add '--recreate_cache' to the GFF load
> > command line to flush out incomplete information in a loader helper
> > table.
> >
> > Scott
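
To see which attribute tags a GFF3 file actually uses before loading it, a minimal shell sketch follows. It assumes a POSIX shell with awk and sort. The function name list_gff3_attribute_tags is hypothetical and is not part of the GMOD tools. This thread only names Note as a tag the loader looks up in the feature property vocabulary, so treat the output as a list of candidates to check, not as a list of required terms.

```shell
# Print the distinct attribute tag names found in column 9 of a GFF3 file.
# Comment and directive lines are skipped, and reading stops at ##FASTA.
# Usage: list_gff3_attribute_tags FILE
list_gff3_attribute_tags() {
    awk -F '\t' '
        /^##FASTA/ { exit }
        /^#/ { next }
        NF == 9 {
            # Column 9 is a ";"-separated list of tag=value pairs.
            n = split($9, pairs, ";")
            for (i = 1; i <= n; i++) {
                eq = index(pairs[i], "=")
                if (eq > 1) print substr(pairs[i], 1, eq - 1)
            }
        }
    ' "$1" | sort -u
}
```

For example, `list_gff3_attribute_tags saccharomyces_cerevisiae.gff.sorted` prints one tag name per line.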
> >
> > On Tue, 2008-03-04 at 08:44 -0500, todd.moughamer at syngenta.com
wrote:
> > > Hi Scott,
> > >
> > > Thanks. Is the feature property vocabulary part of the sequence
> > > ontology or is it separate? As I said we installed only the SO and
> > > Relationship Ontology. If not would you recommend clearing out the
> > > database before re-running make ontologies?
> > >
> > > Best,
> > >
> > > Todd
> > >
> > > -----Original Message-----
> > > From: Scott Cain [mailto:cain.cshl at gmail.com]
> > > Sent: Friday, February 29, 2008 11:24 PM
> > > To: Moughamer Todd USRE
> > > Cc: help at gmod.org
> > > Subject: RE: [Gmod-help] gmod_bulk_load_gff3.pl
> > >
> > > Hi Todd,
> > >
> > > I just reproduced the problem you are seeing by not loading the
> > > feature property controlled vocabulary. For most cvterms, the
> > > loader checks to make sure it is present before writing back to
> > > the database.
> > > The GFF annotation 'Note' was being treated as a special case and
> > > I forgot to add the check that it existed. I've updated the
> > > loader to give a useful message and stop when it finds that Note
> > > doesn't exist.
> > >
> > > Sorry for the hassle.
> > > Scott
> > >
> > >
> > > On Fri, 2008-02-29 at 11:50 -0500, todd.moughamer at syngenta.com
> wrote:
> > > > Hi Scott,
> > > >
> > > > I talked to the unix admin who did that part of the install. He
> > > > ran the makes and only had a problem with make ontologies. I
> > > > believe it was a missing library, and eventually he got it to run
> > > > with no reported problems.
> > > > He installed the relationship and sequence ontologies.
> > > >
> > > > Todd
> > > >
> > > > -----Original Message-----
> > > > From: Scott Cain [mailto:cain.cshl at gmail.com]
> > > > Sent: Friday, February 29, 2008 10:10 AM
> > > > To: Moughamer Todd USRE
> > > > Cc: help at gmod.org
> > > > Subject: RE: [Gmod-help] gmod_bulk_load_gff3.pl
> > > >
> > > > Hi Todd,
> > > >
> > > > When you created the database, did you use the 'make' based
> > > > procedure that is outlined in the install document? That is,
> > > > did you do this:
> > > >
> > > > make load_schema
> > > > make prepdb
> > > > make ontologies
> > > >
> > > > and when you loaded ontologies, did you load all of relation,
> > > > sequence, gene and feature property? My best guess for what is
> > > > going wrong is that the feature property controlled vocabulary
> > > > is missing or that something from the prepdb inserts is missing
> > > > (though I doubt that the latter is the problem--I don't think you
> > > > would have made it this far if that were the case). I'll try
> > > > loading the SGD GFF file now to see if I run into any problems.
> > > >
> > > > Scott
> > > >
> > > > On Fri, 2008-02-29 at 10:00 -0500, todd.moughamer at syngenta.com
> > wrote:
> > > > > Scott,
> > > > >
> > > > > No resolution yet. Just in case, I ran the gff file through
> > > > > dos2unix and that didn't help. I saw in a posting about a
> > > > > similar error that it might have something to do with
> > > > > auto-incrementing of IDs. My next step would be to wipe out the
> > > > > database and reload the ontology dump...unless you have other
> > > > > suggestions.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Todd
> > > > >
> > > > > -----Original Message-----
> > > > > From: Scott Cain [mailto:cain.cshl at gmail.com]
> > > > > Sent: Friday, February 29, 2008 9:09 AM
> > > > > To: Moughamer Todd USRE
> > > > > Cc: help at gmod.org
> > > > > Subject: RE: [Gmod-help] gmod_bulk_load_gff3.pl
> > > > >
> > > > > Hi Todd,
> > > > >
> > > > > I'm trying to catch up on my email after being gone for a few
> > > > > days--did you get this resolved?
> > > > >
> > > > > Scott
> > > > >
> > > > > On Fri, 2008-02-22 at 13:01 -0500, todd.moughamer at syngenta.com
> > > wrote:
> > > > > > Scott,
> > > > > >
> > > > > > The BioPerl live update fixed the "Can't locate object method
> > > > > > 'database'" problem (Thanks!). The loading progresses much
> > > > > > further but now errors out with the message below. Here I am
> > > > > > using the first 100 lines of the sample yeast GFF file, which
> > > > > > did not produce errors in the GFF3 validator:
> > > > > >
> > > > > > Preparing data for inserting into the chadotest database
> > > > > > (This may take a while ...)
> > > > > > Loading data into feature table ...
> > > > > > Loading data into featureloc table ...
> > > > > > Skipping feature_relationship table since the load file is empty...
> > > > > > Loading data into featureprop table ...
> > > > > > DBD::Pg::db pg_endcopy failed: ERROR: invalid input syntax for integer: ""
> > > > > > CONTEXT: COPY featureprop, line 1, column type_id: ""
> > > > > >
> > > > > > ------------- EXCEPTION: Bio::Root::Exception -------------
> > > > > > MSG: calling endcopy for featureprop failed:
> > > > > > STACK: Error::throw
> > > > > > STACK: Bio::Root::Root::throw
> > > > > > /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357
> > > > > > STACK: Bio::GMOD::DB::Adapter::copy_from_stdin
> > > > > > /usr/lib/perl5/site_perl/5.8.8/Bio/GMOD/DB/Adapter.pm:2723
> > > > > > STACK: Bio::GMOD::DB::Adapter::load_data
> > > > > > /usr/lib/perl5/site_perl/5.8.8/Bio/GMOD/DB/Adapter.pm:2644
> > > > > > STACK: /usr/bin/gmod_bulk_load_gff3.pl:912
> > > > > > -----------------------------------------------------------
> > > > > > Issuing rollback() for database handle being DESTROY'd without
> > > > > > explicit disconnect().
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Todd
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: Scott Cain [mailto:cain.cshl at gmail.com]
> > > > > > Sent: Wednesday, February 20, 2008 11:45 AM
> > > > > > To: Moughamer Todd USRE
> > > > > > Cc: hlapp at duke.edu; help at gmod.org
> > > > > > Subject: RE: [Gmod-help] gmod_bulk_load_gff3.pl
> > > > > >
> > > > > > Hi Todd,
> > > > > >
> > > > > > The ##gff-version error it reported won't be a problem; the
> > > > > > loader is quite forgiving about that. The invalid type problems
> > > > > > could potentially be a problem though. Here's the thing:
> > > > > > Chado uses SO (so yes, you should be using so.obo) for feature
> > > > > > types, while the current GFF3 spec requires SOFA (I've been
> > > > > > advocating for changing the spec and it probably will change in
> > > > > > the near future).
> > > > > >
> > > > > > So, if the validator is complaining about those terms because
> > > > > > they aren't in SOFA, but they are in SO, that's no problem. But
> > > > > > if they aren't in SO either (if, for instance, they've been
> > > > > > obsoleted), then you'll have to fix the file. When I'm
> > > > > > confronted with something like that, I just do a global search
> > > > > > and replace on the term to swap in the nearest term in SO.
> > > > > >
> > > > > > Scott
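
The search-and-replace step described above can be limited to the type column, so that the same string appearing in an ID or Note is left alone. This is a minimal sketch, assuming a POSIX shell with awk. swap_gff3_type is a hypothetical helper, not part of the GMOD tools, and the replacement term has to be chosen by hand from SO.

```shell
# Replace one feature type with another in column 3 of a GFF3 stream.
# Comment/directive lines and lines without nine tab-separated columns
# (for example FASTA sequence lines) pass through unchanged.
# Usage: swap_gff3_type OLD_TYPE NEW_TYPE < in.gff3 > out.gff3
swap_gff3_type() {
    awk -F '\t' -v OFS='\t' -v old_type="$1" -v new_type="$2" '
        /^#/ { print; next }
        NF == 9 && $3 == old_type { $3 = new_type }
        { print }
    '
}
```

Matching on the whole of column 3 means a term such as gene_cassette is only replaced where it is the feature type, never where it is a substring of another field.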
> > > > > >
> > > > > > On Wed, 2008-02-20 at 11:34 -0500,
> > > > > > todd.moughamer at syngenta.com
> > > > wrote:
> > > > > > > Hi Scott,
> > > > > > >
> > > > > > > We downloaded Chado from CVS. We are in the process of
> > > > > > > installing the live BioPerl.
> > > > > > >
> > > > > > > I am using the example yeast GFF3 files from the web site
> > > > > > > (http://www.gmod.org/wiki/index.php/Load_GFF_Into_Chado).
> > > > > > > I ran them through the validator and sure enough they came
> > > > > > > back invalid (both the original and sorted forms). Here are
> > > > > > > some of the errors:
> > > > > > >
> > > > > > > Line Number  Error/Warning
> > > > > > > -----------  -------------
> > > > > > > 1            [ERROR] first line must be ##gff-version 3 (line: SGD)
> > > > > > > 350          [ERROR] invalid type (type: gene_cassette)
> > > > > > > 352          [ERROR] invalid type (type: gene_cassette)
> > > > > > > 369          [ERROR] invalid type (type: gene_cassette)
> > > > > > > 1065         [ERROR] invalid type (type: long_terminal_repeat)
> > > > > > > 1066         [ERROR] invalid type (type: long_terminal_repeat)
> > > > > > > 1072         [ERROR] invalid type (type: transposable_element_gene)
> > > > > > > ...
> > > > > > >
> > > > > > > I'm also wondering if I should be using the sofa.obo file
> > > > > > > rather than the so.obo file I downloaded from
> > > > > > > sequenceontology.org?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Todd
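
Both kinds of validator error shown above can be checked locally before loading. This is a minimal sketch, assuming a POSIX shell with awk, sort and head. The two function names are hypothetical and not part of the GMOD tools, and the version check is a strict string comparison that mirrors the validator's message, not the full GFF3 spec.

```shell
# Print the distinct feature types (column 3) used in a GFF3 file, so they
# can be compared by hand against the terms in so.obo.
# Usage: list_gff3_types FILE
list_gff3_types() {
    awk -F '\t' '
        /^##FASTA/ { exit }
        /^#/ { next }
        NF == 9 { print $3 }
    ' "$1" | sort -u
}

# Succeed only if the first line of the file is exactly "##gff-version 3".
# Usage: has_gff3_version_line FILE
has_gff3_version_line() {
    [ "$(head -n 1 "$1")" = "##gff-version 3" ]
}
```

Running list_gff3_types on the sorted yeast file gives a short list of types to look up in SO, which is quicker than reading through every validator error.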
> > > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Scott Cain [mailto:cain.cshl at gmail.com]
> > > > > > > Sent: Tuesday, February 19, 2008 3:49 PM
> > > > > > > To: Hilmar Lapp
> > > > > > > Cc: Moughamer Todd USRE; help at gmod.org
> > > > > > > Subject: Re: [Gmod-help] gmod_bulk_load_gff3.pl
> > > > > > >
> > > > > > > OK, after running some tests, I still think an out of date
> > > > > > > BioPerl is probably at fault. With a current checkout of both
> > > > > > > the schema and bioperl repositories, I can load GFF3 data with
> > > > > > > Dbxref tags successfully.
> > > > > > >
> > > > > > > Todd, a few questions for you:
> > > > > > >
> > > > > > > * are you using a cvs checkout of chado as well? I should
> > > > > > > have asked that before telling you to update bioperl.
> > > > > > >
> > > > > > > * If you are using current checkouts of chado and bioperl
> > > > > > > and still have the problem, could you please run your GFF3
> > > > > > > through the GFF3 validator to see if it turns up any problems:
> > > > > > >
> > > > > > > http://dev.wormbase.org/db/validate_gff3/validate_gff3_online
> > > > > > >
> > > > > > > * If you still get the same error message, could you please
> > > > > > > send me a sample of the offending GFF3? There may be a case
> > > > > > > that I didn't think of.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Scott
> > > > > > >
> > > > > > > On Tue, 2008-02-19 at 14:13 -0500, Scott Cain wrote:
> > > > > > > > Hi Hilmar,
> > > > > > > >
> > > > > > > > Hah! I was getting ready to write a response where I
> > > > > > > > basically said I didn't think that was what was going on,
> > > > > > > > and I would have justified it with some hand waving, since I
> > > > > > > > didn't have the actual error message, so I was just guessing
> > > > > > > > and hopefully, updating bioperl will fix the problem (and
> > > > > > > > that still is a possibility).
> > > > > > > >
> > > > > > > > However, I do have the actual error message, so I went and
> > > > > > > > looked at the offending line, and it is in a method called
> > > > > > > > 'handle_dbxref', so it looks like your diagnosis is spot on.
> > > > > > > > Now I need to figure out if this is still happening with
> > > > > > > > bioperl-live and figure out why. I've got a
> > > > > > > > Bio::SeqFeature::Annotated->annotation->get_Annotations('Dbxref'),
> > > > > > > > which I think should return a list of DBLink features. I
> > > > > > > > guess I'll go see.
> > > > > > > >
> > > > > > > > Thanks for pointing that out!
> > > > > > > > Scott
> > > > > > > >
> > > > > > > > On Tue, 2008-02-19 at 13:55 -0500, Hilmar Lapp wrote:
> > > > > > > > > Well, B::A::SimpleValue never had a method called
> > > > > > > > > database(). It is B::A::DBLink that has that (and always
> > > > > > > > > had).
> > > > > > > > >
> > > > > > > > > So my first diagnosis from afar would be that something is
> > > > > > > > > returning or creating a B::A::SimpleValue when it was
> > > > > > > > > expected to return or create a B::A::DBLink.
> > > > > > > > >
> > > > > > > > > -hilmar
> > > > > > > > >
> > > > > > > > > On Feb 19, 2008, at 1:07 PM, Scott Cain wrote:
> > > > > > > > >
> > > > > > > > > > Hi Todd,
> > > > > > > > > >
> > > > > > > > > > Yes, that is still the most likely solution for you.
> > > > > > > > > > A few months ago, the BioPerl API changed and
> > > > > > > > > > Bio::Annotation::SimpleValue objects don't work the same
> > > > > > > > > > way that they used to, thus the error you are seeing.
> > > > > > > > > > It's not really looking for a method named 'database';
> > > > > > > > > > that is an artifact left over from the API change.
> > > > > > > > > >
> > > > > > > > > > Scott
> > > > > > > > > >
> > > > > > > > > > On Tue, 2008-02-19 at 11:59 -0500,
> > > > > > > > > > todd.moughamer at syngenta.com
> > > > > > > wrote:
> > > > > > > > > >> I run into the following error when trying to load Chado:
> > > > > > > > > >>
> > > > > > > > > >> gmod_bulk_load_gff3.pl --organism yeast --gfffile ~/tmp/saccharomyces_cerevisiae.gff.sorted --dbname chadotest
> > > > > > > > > >> Preparing data for inserting into the chadotest database (This may take a while ...)
> > > > > > > > > >> Can't locate object method "database" via package "Bio::Annotation::SimpleValue" at /usr/lib/perl5/site_perl/5.8.8/Bio/GMOD/DB/Adapter.pm line 3061, <GEN0> line 1.
> > > > > > > > > >> Issuing rollback() for database handle being DESTROY'd without explicit disconnect().
> > > > > > > > > >>
> > > > > > > > > >> I found reference to this problem online
> > > > > > > > > >> (http://www.nabble.com/question-about-gmod_bulk_load_gff3.pl-td15135949.html)
> > > > > > > > > >> with the recommendation of downloading the 'live' version
> > > > > > > > > >> of BioPerl. However, upon browsing the latest code in SVN
> > > > > > > > > >> I do not see inclusion of a "database" method. Is this
> > > > > > > > > >> still the recommended solution to the problem?
> > > > > > > > > >>
> > > > > > > > > >> Thanks,
> > > > > > > > > >>
> > > > > > > > > >> Todd
> > > > > > > > > >>
> > > > > > > > > >>
> > > > > > > > > >> Todd Moughamer
> > > > > > > > > >>
> > > > > > > > > >> Bioinformatics Consultancy & Training Group
> > > > > > > > > >>
> > > > > > > > > >> Syngenta Biotechnology, Inc.
> > > > > > > > > >>
> > > > > > > > > >> 3054 Cornwallis Road, 1243.E
> > > > > > > > > >>
> > > > > > > > > >> Research Triangle Park, NC 27709-2257
> > > > > > > > > >>
> > > > > > > > > >> Tel: 919-597-3078
> > > > > > > > > >>
> > > > > > > > > >> Email: todd.moughamer at syngenta.com www.syngenta.com
> > > > > > > > > >>
> > > > > > > > > >>
> > > > > > > > > > --
> > > > > > > > > > ------------------------------------------------------------------------
> > > > > > > > > > Scott Cain, Ph. D. cain at cshl.edu
> > > > > > > > > > GMOD Coordinator (http://www.gmod.org/) 216-392-3087
> > > > > > > > > > Cold Spring Harbor Laboratory
--
------------------------------------------------------------------------
Scott Cain, Ph. D. cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory