[Gmod-help] Error with gmod_bulk_load_gff3.pl
Christian M. Probst
cmacprobst at gmail.com
Thu Jul 17 17:26:24 EDT 2008
Hi,
I am trying to load my organism data into CHADO and I am stuck on an error.
I downloaded a GenBank-formatted file and converted it to GFF3 with the
suggested command:
bp_genbank2gff3.pl -noCDS -s -o . temp.txt
Afterwards, I used this syntax for gmod_bulk_load:
gmod_bulk_load_gff3.pl --dbname XXX --dbxref GeneID --organism XXX --gff temp.gff
Preparing data for inserting into the CruziGeneDB database
(This may take a while ...)
no parent Tc00.1047053508153.20;
you probably need to rerun the loader with the --recreate_cache option
Well, the Tc00.1047053508153.20 ID is in the GFF file and is before
the entry that references it as Parent.
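The ordering claim above can be checked mechanically. The following is only a sketch: check_parents is a hypothetical helper name, and it assumes a tab-delimited GFF3 file whose ninth column holds semicolon-separated attributes with no surrounding whitespace. It reports every Parent value that has no matching ID on an earlier line.

```shell
# Hypothetical helper: report Parent= values with no earlier ID= line.
check_parents() {
  awk -F'\t' '
    /^#/ || NF < 9 { next }              # skip comments and non-feature lines
    {
      n = split($9, attrs, ";")          # column 9 holds the attributes
      id = ""; parents = ""
      for (i = 1; i <= n; i++) {
        if (attrs[i] ~ /^ID=/)     id = substr(attrs[i], 4)
        if (attrs[i] ~ /^Parent=/) parents = substr(attrs[i], 8)
      }
      m = split(parents, p, ",")         # Parent may list several IDs
      for (j = 1; j <= m; j++)
        if (!(p[j] in seen))
          print "line " NR ": no earlier ID for Parent=" p[j]
      if (id != "") seen[id] = 1         # remember this ID for later lines
    }
  ' "$1"
}

# Example, using the GFF file name from the command above:
# check_parents temp.gff
```

No output means every Parent was declared before it was used, at least under the assumptions stated above.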
I followed the suggestion and ran the same command line as above,
this time including --recreate_cache.
The script runs for a long time and then the following error appears.
DBD::Pg::db pg_endcopy failed: ERROR: invalid input syntax for integer: ""
CONTEXT: COPY feature_relationship, line 1, column type_id: "" at
/opt/coolstack/lib/perl5/site_perl/5.8.8/Bio/GMOD/DB/Adapter.pm line
2723, <$fh> line 64298.
Then I tried to run gmod_bulk_load with --noload --inserts --save_tmpfiles.
When inspecting the chado-feature_relationshipXXX file, I found
that features with a Parent= attribute in the GFF file have an empty
field for type_id in the INSERT statement. As an example:
INSERT INTO feature_relationship
(feature_relationship_id,subject_id,object_id,type_id) VALUES
(15,190738,190737,);
INSERT INTO feature_relationship
(feature_relationship_id,subject_id,object_id,type_id) VALUES
(16,190739,190738,53);
The first statement comes from a feature with a Parent attribute. It has
an empty value for type_id.
The second statement comes from a feature with a Derives_from attribute.
It has the correct cvterm_id for type_id.
So, the OBO relationship for the Parent attribute is not being
correctly identified.
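To see how many relationships are affected, the saved file can be scanned for the empty value. This is a minimal sketch, assuming the empty type_id always shows up as ",);" at the end of the VALUES list, as in the example above. count_empty_type_id is a hypothetical helper name.

```shell
# Hypothetical helper: count lines containing ",);", i.e. an empty last value.
count_empty_type_id() {
  grep -cF ',);' "$1"
}

# Example (the file name is a placeholder for the saved temp file):
# count_empty_type_id chado-feature_relationshipXXX
```

Since type_id is the last column in these statements, each matching line corresponds to one relationship with a missing term.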
I tried to find 'part_of' in the cvterm table, and found only
entries related to the cv 'Gene Ontology' and 'Plant Ontology'.
The 'derives_from' term, in the cvterm table, is mapped to the
'relationship' cv, but I have no 'part_of' mapped to 'relationship'
cv. Is that a possible source for this error? Anyway, if you could
help me in any way, I would be very grateful.
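The cvterm lookup described above can be written as a single query. The sketch below assumes the standard Chado cv and cvterm tables joined on cv_id. relationship_terms_sql is a hypothetical helper name, and XXX stands for the database name as in the loader command.

```shell
# Hypothetical helper: print a query listing the part_of and derives_from
# terms together with the name of the cv each one belongs to.
relationship_terms_sql() {
  cat <<'SQL'
SELECT cvterm.cvterm_id, cvterm.name, cv.name AS cv_name
FROM cvterm
JOIN cv ON cv.cv_id = cvterm.cv_id
WHERE cvterm.name IN ('part_of', 'derives_from')
ORDER BY cv.name, cvterm.name;
SQL
}

# Example:
# relationship_terms_sql | psql -d XXX
```

If no row pairs 'part_of' with the 'relationship' cv, that matches the situation described above.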
Thanks in advance.
Christian M. Probst