[Gmod-help] Loading WormBase GFF3 into Chado
Florian Wagner
email at florianwagner.eu
Wed Aug 25 19:51:56 EDT 2010
Hi,
I could use some help here. I'm trying to load the GFF3 file of the
latest C. elegans genome release from WormBase (WS217) into Chado, using:
gmod_bulk_load_gff3.pl --gfffile c_elegans.WS217.gff3 --fastafile
c_elegans.WS217.dna.fa --organism CELE --dbname chado
This gives the error message:
Preparing data for inserting into the chado database
(This may take a while...)
Unable to find srcfeature I in the database.
... at /usr/local/share/perl/5.10.1/Bio/GMOD/DB/Adapter.pm line 4555
... called at /usr/local/bin/gmod_bulk_load_gff3.pl line 841
Abnormal termination, trying to clean up...
I've read Scott's answer to a similar problem here:
http://osdir.com/ml/science.biology.gmod.gbrowse/2008-07/msg00014.html
However, adding the chromosomes manually to the top of the GFF file, as
suggested there, does not solve the problem. In fact, the original GFF
file already contains such lines (somewhere in the file), e.g.:
I Reference chromosome 1 15072423 . + . ID=I;Name=I
So this doesn't seem to be a problem with the GFF file, but with the
loader. Do you have any ideas how to fix this?
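For anyone who wants to check where the chromosome lines sit relative to the features that use them, here is a rough diagnostic sketch in Python. It is not part of the loader and is not a fix. It only reports, for each seqid in column 1, the line on which it is first used and the line (if any) of a feature whose ID equals that seqid. It assumes tab-separated GFF3 with nine columns; the function name landmark_report and the sample feature lines are made up for illustration.

```python
def landmark_report(lines):
    """Map each seqid to (first_use_line, landmark_line_or_None).

    A "landmark" here means a feature line whose ID attribute equals
    its own seqid, e.g. the chromosome line "I ... ID=I;Name=I".
    """
    first_use = {}
    landmark = {}
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # sequence data follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        seqid, attrs = cols[0], cols[8]
        first_use.setdefault(seqid, lineno)
        for pair in attrs.split(";"):
            key, _, value = pair.partition("=")
            if key == "ID" and value == seqid:
                landmark.setdefault(seqid, lineno)
    return {s: (first_use[s], landmark.get(s)) for s in first_use}


# Hypothetical sample: the gene on I appears before the chromosome line,
# and II has no chromosome line at all.
sample = [
    "##gff-version 3\n",
    "I\tCoding_transcript\tgene\t100\t200\t.\t+\t.\tID=Gene:x\n",
    "I\tReference\tchromosome\t1\t15072423\t.\t+\t.\tID=I;Name=I\n",
    "II\tcurated\tgene\t5\t50\t.\t-\t.\tID=Gene:y\n",
]

for seqid, (used, defined) in sorted(landmark_report(sample).items()):
    print(seqid, "first used on line", used, "landmark on line", defined)
```

On a real file, pass an open file handle instead of the sample list, e.g. landmark_report(open("c_elegans.WS217.gff3")). A seqid whose landmark line is None, or comes after its first use, would be worth looking at first.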
Best, Florian
P.S.
I'm using Chado 1.11 and BioPerl 1.6.1.
The GFF3 file starts like this:
##gff-version 3
##sequence-region I 1 15072423
##sequence-region II 1 15279345
##sequence-region III 1 13783700
##sequence-region IV 1 17493793
##sequence-region MtDNA 1 13794
##sequence-region V 1 20924149
##sequence-region X 1 17718866
...
The original annotation file is available here:
ftp://ftp.wormbase.org/pub/wormbase/genomes/c_elegans/genome_feature_tables/GFF3/c_elegans.WS217.gff3.gz