[Gmod-help] Re: [Gmod-schema] Error using gmod_bulk_load_gff3.pl with a ##sequence-region directive
Scott Cain
scott at scottcain.net
Fri Jul 23 13:52:39 EDT 2010
This is in fact a current bug; the easiest workaround is to get rid
of sequence-region directives. Actually fixing the bug is a little
trickier since it is due to the fact that Chado and BioPerl have
different ideas of what should happen. While I could (probably)
modify BioPerl to do the right thing (from my perspective), I am
reluctant to do that at the moment since that section of BioPerl is
slated to be refactored.
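
For anyone who wants to script that workaround, here is a minimal sketch in Python. It is not part of GMOD or BioPerl; the script name and function name are made up for illustration. It assumes each directive starts at the beginning of a line with the literal text ##sequence-region, and it passes every other line through untouched.

```python
import sys

# Literal prefix of the directive to drop (assumed to start the line).
DIRECTIVE = "##sequence-region"


def strip_sequence_regions(lines):
    """Yield every input line except ##sequence-region directives."""
    for line in lines:
        if line.startswith(DIRECTIVE):
            continue
        yield line


if __name__ == "__main__":
    # Hypothetical usage: python strip_sequence_region.py in.gff3 > out.gff3
    with open(sys.argv[1]) as handle:
        sys.stdout.writelines(strip_sequence_regions(handle))
```

The filtered file can then be passed to gmod_bulk_load_gff3.pl with --gfffile in place of the original.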
Scott
On Tue, Jul 20, 2010 at 6:55 PM, Dave Clements, GMOD Help Desk
<help at gmod.org> wrote:
> Hi Jonathan,
> I've created a bug report on this:
> http://sourceforge.net/tracker/?func=detail&aid=3032325&group_id=27707&atid=391291
> This is interesting because the code says:
> This script does not use sequence-region directives for anything.
> If it represents a feature that needs to be inserted into the database,
> it should be represented with a full GFF line.
> Dave C.
> On Fri, Jul 16, 2010 at 1:31 PM, Jonathan Leto <jaleto at gmail.com> wrote:
>>
>> Howdy,
>>
>> I have been attempting to load the ITAG GFF3 [0] files, which contain
>> ##sequence-region directives, but I run into errors like this:
>>
>> $ ./gmod_bulk_load_gff3.pl --gfffile
>> ~/git/ITAG1_release/ITAG1_gene_models_sample.gff3 --organism tomato
>> --noexon --recreate_cache --analysis --remove_lock --save_tmpfiles
>> (Re)creating the uniquename cache in the database...
>> Creating table...
>> Populating table...
>> Creating indexes...
>> Adjusting the primary key sequences (if necessary)...Done.
>>
>> --------------------- WARNING ---------------------
>> MSG: '##feature-ontology' directive handling not yet implemented
>> ---------------------------------------------------
>> Preparing data for inserting into the cxgn database
>> (This may take a while ...)
>> Loading data into feature table ...
>> COPY feature
>> (feature_id,organism_id,name,uniquename,type_id,is_analysis,seqlen,dbxref_id)
>> FROM STDIN; at /home/leto/local-lib/lib/perl5/Bio/GMOD/DB/Adapter.pm
>> line 3210.
>> Loading data into featureloc table ...
>> COPY featureloc
>> (featureloc_id,feature_id,srcfeature_id,fmin,fmax,strand,phase,rank,locgroup)
>> FROM STDIN; at /home/leto/local-lib/lib/perl5/Bio/GMOD/DB/Adapter.pm
>> line 3210.
>> DBD::Pg::db pg_endcopy failed: ERROR: invalid input syntax for integer:
>> ""
>> CONTEXT: COPY featureloc, line 1, column strand: "" at
>> /home/leto/local-lib/lib/perl5/Bio/GMOD/DB/Adapter.pm line 3222, <$fh>
>> line 3.
>>
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: calling endcopy for featureloc failed:
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw
>> /home/leto/local-lib/lib/perl5/Bio/Root/Root.pm:368
>> STACK: Bio::GMOD::DB::Adapter::copy_from_stdin
>> /home/leto/local-lib/lib/perl5/Bio/GMOD/DB/Adapter.pm:3222
>> STACK: Bio::GMOD::DB::Adapter::load_data
>> /home/leto/local-lib/lib/perl5/Bio/GMOD/DB/Adapter.pm:3144
>> STACK: ./gmod_bulk_load_gff3.pl:1060
>> -----------------------------------------------------------
>>
>> The salient information is that an empty string ("") is somehow being
>> inserted as the strand, which fails. Note that I have also uncommented
>> a warning statement that shows the SQL query that is being executed.
>>
>> I have traced this issue to the sequence-region
>> directive. When I remove the line, the file loads fine. As another
>> test, I created a file with nothing but a sequence-region directive,
>> and the same error occurs. I have attached that file and the temp
>> data file that gmod_bulk_load_gff3.pl creates as well. The 6th column
>> of that file is the strand, and it has a value of "\N", which is the
>> text representation of NULL.
>>
>> It seems to me that something is stringifying the NULL into "" and
>> then attempting to insert the empty string into strand, which has a
>> type of smallint. This is what causes the failure.
>>
>> I would greatly appreciate any thoughts or comments on how to make the
>> bulk loading script support the sequence-region directive.
>>
>> Thanks
>>
>> [0] ftp://ftp.solgenomics.net/tomato_genome/annotation/ITAG1_release/
>>
>> --
>> Jonathan "Duke" Leto
>> jonathan at leto.net
>> http://leto.net
>>
>>
>> ------------------------------------------------------------------------------
>> This SF.net email is sponsored by Sprint
>> What will you do first with EVO, the first 4G phone?
>> Visit sprint.com/first -- http://p.sf.net/sfu/sprint-com-first
>> _______________________________________________
>> Gmod-schema mailing list
>> Gmod-schema at lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/gmod-schema
>>
>
>
>
> --
> ===> PLEASE KEEP RESPONSES ON THE LIST <===
> http://gmod.org/wiki/GMOD_News
> http://gmod.org/wiki/Calendar
> http://gmod.org/wiki/Help_Desk_Feedback
>
>
--
------------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research