[Gmod-help] chado loading problem
Genevieve DeClerck
gad14 at cornell.edu
Tue Apr 8 11:52:36 EDT 2008
Hello,
I have the chado schema installed in a postgres database on an OS X
10.4.11 box. I am following the instructions at the gmod wiki in
"Load GFF Into Chado" and am encountering a problem.
I am trying to load Pseudomonas syringae pv DC3000 data, which is in
RefSeq
(ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Pseudomonas_syringae_tomato_DC3000).
I inserted an entry for DC3000 into the 'organism' table:
insert into organism (abbreviation, genus, species, common_name,
organism_id) values
('NC_004578','Pseudomonas','syringae','DC3000','223283')
and the data is now in the db:
test=# select * from organism where organism_id='223283';
 organism_id | abbreviation |    genus    | species  | common_name | comment
-------------+--------------+-------------+----------+-------------+---------
      223283 | NC_004578    | Pseudomonas | syringae | DC3000      |
(1 row)
Now, I preprocess the GenBank GFF with 'gmod_gff3_preprocessor.pl',
which seems to go fine. I then try to load the GFF with
'gmod_bulk_load_gff3.pl' and get an error:
$ gmod_gff3_preprocessor.pl --gfffile NC_004578.gff
Sorting the contents of NC_004578.gff ...
Writing sorted contents to NC_004578.gff.sorted ...
$ gmod_bulk_load_gff3.pl --organism DC3000 --gfffile NC_004578.gff.sorted
Preparing data for inserting into the test database
(This may take a while ...)
There is a CDS feature with no parent (ID:) I think that is wrong!
This GFF file has CDS and/or UTR features that do not belong to a
'central dogma' gene (ie, gene/transcript/CDS). The features of this
type are being stored in the database as is.
------------- EXCEPTION -------------
MSG: no cvterm for CDS
STACK Bio::GMOD::DB::Adapter::get_type /my_packages/gmod/chado/schema/chado/lib/Bio/GMOD/DB/Adapter.pm:4050
STACK toplevel /sw/bin/gmod_bulk_load_gff3.pl:752
--------------------------------------
Issuing rollback() for database handle being DESTROY'd without
explicit disconnect().
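
Since the exception names a feature type ("CDS"), one way to narrow this down might be to list every feature type the GFF file uses, so that each one can be checked against the database. The following is only a minimal sketch: gff_types is a hypothetical helper name (it is not part of GMOD), and it assumes a tab-delimited GFF3 file with the feature type in column 3.

```shell
# Hypothetical helper, not part of GMOD: print each distinct feature type
# (GFF3 column 3) followed by a tab and the number of lines that use it.
# Comment/directive lines are skipped, and reading stops at a ##FASTA
# directive so that embedded sequence is not counted.
gff_types() {
    awk -F'\t' '
        /^##FASTA/ { exit }            # END still runs after exit
        /^#/       { next }            # skip comments and directives
        NF >= 3    { count[$3]++ }     # tally the type column
        END        { for (t in count) print t "\t" count[t] }
    ' "$1" | LC_ALL=C sort
}
```

It could be run as "gff_types NC_004578.gff.sorted"; any type in the output that has no matching row in the cvterm table would presumably trigger the same "no cvterm for ..." exception.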
I also tried starting with the gbk file from genbank, but still no
success with loading (the gbk -> gff conversion seems to have gone ok):
$ ../../../bin/gmod_bulk_load_gff3.pl --organism DC3000 -gfffile NC_004578.gbk.gff
Preparing data for inserting into the test database
(This may take a while ...)
------------- EXCEPTION -------------
MSG: no cvterm for region
STACK Bio::GMOD::DB::Adapter::get_type /my_packages/gmod/chado/schema/chado/lib/Bio/GMOD/DB/Adapter.pm:4050
STACK toplevel ../../../bin/gmod_bulk_load_gff3.pl:752
--------------------------------------
Issuing rollback() for database handle being DESTROY'd without
explicit disconnect().
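
Both exceptions complain about a missing cvterm, so a direct look at the cvterm table for those names seems like a reasonable check. The sketch below only builds the query text. cvterm_query is a hypothetical helper name, and it assumes the standard Chado cv and cvterm tables (joined on cv_id) and type names that contain no single quotes.

```shell
# Hypothetical helper: print a SQL query that looks up the given feature
# type names in Chado's cvterm table, joined to cv so that the controlled
# vocabulary each term belongs to is shown as well.
# Assumes the names contain no single quotes (no escaping is done).
cvterm_query() {
    list=""
    for t in "$@"; do
        list="${list:+$list, }'$t'"    # build a quoted, comma-separated list
    done
    printf 'SELECT cv.name AS cv, cvterm.name AS term FROM cvterm JOIN cv USING (cv_id) WHERE cvterm.name IN (%s);\n' "$list"
}
```

For the two failures above this could be used as "cvterm_query CDS region | psql test", where "test" is the database name shown in the psql prompt. If no rows come back, the terms are absent from the database, rather than the loader failing to find terms that are present.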
Any ideas about what might be going wrong?
Also, what tables would be populated if the load was successful?
Thanks,
Genevieve