[Gmod-help] Bio::DB::SeqFeature::Store issue

Ian Holmes ihh at berkeley.edu
Thu Mar 19 14:59:52 EDT 2009


Scott Cain wrote:
> Poorly formed input is the DEVIL!  :-)

Yes indeed. I hope my soul gets a good price.

Unfortunately it seems I spoke too soon about the sequence-region fix.
Again, it works only if the end co-ordinate is (mysteriously) large.

For example, the attached GFF & Perl do not work.

##gff-version 3
##sequence-region ctgA 1 8000
ctgA	bed2gff	feature	1659	1984	.	.	.	
ctgA	bed2gff	feature	3014	6130	.	.	.	
ctgA	bed2gff	feature	1660	7000	.	.	.	


#!/usr/bin/perl -w
use Bio::DB::SeqFeature::Store;
my $db = Bio::DB::SeqFeature::Store->new
 (-adaptor => "memory",
  -dsn => "testbed.gff");
my @features = $db->get_features_by_location (-seqid => "ctgA");
print @features+0;


This prints "0"; but if I change the endpoint of the sequence-region to,
say, 10000, it prints "4".


Weird, huh?



> On Thu, Mar 19, 2009 at 2:43 PM, Ian Holmes <ihh at berkeley.edu> wrote:
>> Hi Scott, thanks for helping out...
>>
>> The "##gff-version 3" line didn't make any difference, but adding a
>> "##sequence-region ctgA 1 100000" line did help.
>>
>> I was hoping to use this in an application that would be forgiving about
>> poorly-formed input data (specifically the TWiki plugin for JBrowse). I
>> guess I can quite easily sanitize the input GFF (i.e. fake a
>> ##sequence-region line for every seqid)... but I'd like to check that
>> this is, indeed, necessary rather than duplicating effort..
>>
>> Cheers,
>> Ian
>>
>>
>>
>> Scott Cain wrote:
>>> Hi Ian,
>>>
>>> That is rather bizarre behavior you're describing.  I have a few
>>> questions to get us started:
>>>
>>> Is there a line that describes what ctgA is (either a
>>> ##sequence-region line or a GFF line)?
>>>
>>> Does the file start with ##gff-version 3 ?  I think the seqfeature
>>> loader is somewhat particular about this.
>>>
>>> Scott
>>>
>>>
>>> On Thu, Mar 19, 2009 at 2:29 PM, Ian Holmes <ihh at berkeley.edu> wrote:
>>>> Hi, not sure if GMOD is the right place to direct this query, apologies
>>>> if not.
>>>>
>>>> I'm having some problems with Bio::DB::SeqFeature::Store and I hoped
>>>> someone better-versed in the internals might be able to help.
>>>>
>>>> The following GFF file seems to work fine with the "features" method of
>>>> Bio::DB::SeqFeature::Store (NB there is a tab at the end of each line,
>>>> denoting an empty group field; I have also attached this as a GFF file)
>>>>
>>>> ctgA    bed2gff feature 1659    1984    .       .       .
>>>> ctgA    bed2gff feature 3014    6130    .       .       .
>>>> ctgA    bed2gff feature 1       50000   .       .       .
>>>>
>>>> For example, the following Perl correctly calculates & prints the total
>>>> number of features in the file (3):
>>>>
>>>> use Bio::DB::SeqFeature::Store;
>>>> my $db = Bio::DB::SeqFeature::Store->new
>>>>  (-adaptor => "memory",
>>>>  -dsn => "testbed.gff");
>>>> my @features = $db->features (-seqid => "ctgA");
>>>> print @features+0;
>>>>
>>>>
>>>>
>>>> However, if I delete the last line of the GFF file (i.e. the feature
>>>> from 1 to 50000), it doesn't work (it thinks there are zero features in
>>>> the file). It also doesn't seem to mind if I change the coordinates of
>>>> the last feature, as long as the endpoint is vaguely "large" (e.g. 49000
>>>> works, but 7000 doesn't).
>>>>
>>>>
>>>> Anyone have any ideas? Is this something to do with binning and/or
>>>> default start/endpoints in the feature query method?
>>>>
>>>>
>>>> Ian
>>>>
>>>> ctgA    bed2gff feature 1659    1984    .       .       .
>>>> ctgA    bed2gff feature 3014    6130    .       .       .
>>>> ctgA    bed2gff feature 1660    7000    .       .       .
>>>>
>>>>
>>>
>>>
> 
> 
> 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: testbed.gff
URL: <http://brie4.cshl.edu/pipermail/gmod-help/attachments/20090319/ca74464c/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: testbed.pl
URL: <http://brie4.cshl.edu/pipermail/gmod-help/attachments/20090319/ca74464c/attachment.pl>


More information about the Gmod-help mailing list