[Gmod-help] Error: chr1 doesn't have a primary id

Scott Cain scott at scottcain.net
Wed Sep 22 09:40:23 EDT 2010


Hi Nigel,

The GFF3 spec specifically forbids using IDs across different GFF3
files: they are not intended to convey any information that extends
beyond identifying parent-child relationships in a file.
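
To make that scoping rule concrete, here is a minimal Python sketch (not part of BioPerl; the function name dangling_parents is made up for this example) that lists the Parent values in one GFF3 file that have no matching ID in that same file. It assumes one feature per line with tab-separated columns, so a line wrapped by a mail client would need to be rejoined first.

```python
def dangling_parents(gff3_lines):
    """Return the set of Parent values with no matching ID in the same lines."""
    ids, parents = set(), set()
    for line in gff3_lines:
        line = line.rstrip("\n")
        # Skip blank lines, pragmas (##...) and comments.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # A feature line has nine columns; anything else is ignored here.
        if len(cols) < 9:
            continue
        for attr in cols[8].split(";"):
            key, sep, value = attr.partition("=")
            if not sep:
                continue
            if key == "ID":
                ids.add(value)
            elif key == "Parent":
                # Parent may hold several comma-separated IDs.
                parents.update(value.split(","))
    return parents - ids
```

Run over your ideogram.gff3, I would expect this to report chr1, since that ID is only defined in chromosomes.gff3.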

With that said, are you sure you need to express the cytoband features
that way?  I haven't used the ideogram glyph in a few years, but I
would think that just having the cytoband feature exist on a
chromosome (which is what you are saying when you put "chr1" in the
first column, the reference sequence) would be sufficient, without a
Parent attribute.
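
If you want to try that, something like the following Python sketch would drop the Parent attribute from each feature line before loading. This is only an illustration under my assumptions (tab-separated columns, one feature per line); strip_parent is a made-up helper name, not something that ships with BioPerl, and I have not checked it against the ideogram glyph.

```python
def strip_parent(line):
    """Drop the Parent attribute from one GFF3 feature line.

    Pragmas, comments, blank lines and non-feature lines pass through unchanged.
    """
    if line.startswith("#") or not line.strip():
        return line
    body = line.rstrip("\n")
    newline = line[len(body):]          # preserve the original line ending
    cols = body.split("\t")
    if len(cols) < 9:
        return line
    # Keep every non-empty attribute except Parent=...
    attrs = [a for a in cols[8].split(";") if a and not a.startswith("Parent=")]
    # GFF3 uses "." for an empty column.
    cols[8] = ";".join(attrs) or "."
    return "\t".join(cols) + newline
```

You could apply it line by line to ideogram.gff3 and study_2.gff3, write the result to new files, and load those; the features would then be tied to the chromosome only through the first column.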

Scott


On Tue, Sep 21, 2010 at 4:59 PM, Nigel Wilson <nigel.wilson at sickkids.ca> wrote:
> Hi,
>
>
>
> I am attempting to load 3 gff3 files into a postgres
> Bio::DB::SeqFeature::Store database using the bp_seqfeature_load.pl script
> that comes with bioperl.
>
> My files are: chromosomes.gff3, ideogram.gff3 and study_2.gff3, being loaded
> in that order. The chromosomes.gff3 file loads without error. However, when
> I attempt to load either of the other 2 files, I receive the following
> error:
>
>
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
>
> MSG: chr1 doesn't have a primary id
>
> STACK: Error::throw
>
> STACK: Bio::Root::Root::throw
> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:472
>
> STACK: Bio::DB::SeqFeature::Store::GFF3Loader::build_object_tree_in_tables
> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/SeqFeature/Store/GFF3Loader.pm:720
>
> STACK: Bio::DB::SeqFeature::Store::GFF3Loader::build_object_tree
> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/SeqFeature/Store/GFF3Loader.pm:699
>
> STACK: Bio::DB::SeqFeature::Store::GFF3Loader::finish_load
> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/SeqFeature/Store/GFF3Loader.pm:343
>
> STACK: Bio::DB::SeqFeature::Store::Loader::load_fh
> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/SeqFeature/Store/Loader.pm:354
>
> STACK: Bio::DB::SeqFeature::Store::Loader::load
> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/SeqFeature/Store/Loader.pm:243
>
> STACK: /usr/bin/bp_seqfeature_load.pl:135
>
> -----------------------------------------------------------
>
>
>
> I loaded the chromosomes.gff3 file with the following command:
>
>
>
> bp_seqfeature_load.pl -d dbname=gbtest -a DBI::Pg -u xxx -p xxx -c -v chromosomes.gff3
>
>
>
> And for the ideogram.gff3 and study_2.gff3 files, I use the command:
>
>
>
> bp_seqfeature_load.pl -d dbname=gbtest -a DBI::Pg -u xxx -p xxx -v <filename.gff3>
>
>
>
> I have tried to correct this error by defining the regions using the
> ##sequence-region pragma, however that results in a duplication of the
> chromosome region in my database. Is there a way to allow loading of
> multiple gff3 files, whose parent IDs are already loaded into the database?
>
>
>
> (P.S. program versions, example files are located past the signature)
>
>
>
> Thanks
>
>
>
> Nigel Wilson
> _________________________________________________
> Research Student
> The Centre for Applied Genomics
>
> The Hospital for Sick Children, MaRS Building - East Tower
> Tel: 416.813.7032
> Address: 101 College St., Rm 14-701, Toronto, Ontario  M5G 1L7
>
>
>
>
>
> Info
>
>
>
> BioPerl Version -> latest bioperl-live
>
> Gbrowse version: 2.10
>
>
>
> File Examples:
>
> Chromosomes.gff3
>
>
>
> ##gff-version 3
>
> ##Index-subfeatures 1
>
>
>
> chr1    hg18    chromosome      1       247249719       .       .       .       ID=chr1;Name=Chr1
>
> chr10   hg18    chromosome      1       135374737       .       .       .       ID=chr10;Name=Chr10
>
> chr11   hg18    chromosome      1       134452384       .       .       .       ID=chr11;Name=Chr11
>
> chr12   hg18    chromosome      1       132349534       .       .       .       ID=chr12;Name=Chr12
>
>
>
> ideogram.gff3
>
> ##gff-version 3
>
> ##Index-subfeatures 1
>
>
>
> chr1    UCSC    cytoband    1   2300000 .   .   .   Parent=chr1;Name=Cytoband:1p36.33;Alias=p36.33;stain=gneg;
>
> chr1    UCSC    cytoband    2300001 5300000 .   .   .   Parent=chr1;Name=Cytoband:1p36.32;Alias=p36.32;stain=gpos25;
>
> chr1    UCSC    cytoband    5300001 7100000 .   .   .   Parent=chr1;Name=Cytoband:1p36.31;Alias=p36.31;stain=gneg;
>
> chr1    UCSC    cytoband    7100001 9200000 .   .   .   Parent=chr1;Name=Cytoband:1p36.23;Alias=p36.23;stain=gpos25;
>
> chr1    UCSC    cytoband    9200001 12600000    .   .   .   Parent=chr1;Name=Cytoband:1p36.22;Alias=p36.22;stain=gneg;
>
>
>
> study_2.gff3
>
> ##gff-version 3
>
> ##Index-subfeatures 1
>
> chrX    DGV2    sample_level_variant    136348080   136348238   .   +   .   ID=abc_41;Parent=chrX;Name=abc_41;variant_type=CNV;gender=M;study=test 2009;
>
> chr7    DGV2    sample_level_variant    7074153 7074960 .   +   .   ID=abc_4025;Parent=chr7;Name=abc_4025;variant_type=CNV;gender=M;study=test 2009;
>
> chr2    DGV2    sample_level_variant    41069334    41070418    .   +   .   ID=abc_4023;Parent=chr2;Name=abc_4023;variant_type=CNV;gender=M;study=test 2009;
>
> chr1    DGV2    sample_level_variant    71513720    71514441    .   +   .   ID=abc_3923;Parent=chr1;Name=abc_3923;variant_type=CNV;gender=M;study=test 2009;
>
>
>
>



-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research



