[Gmod-help] Bio::DB::SeqFeature::Store issue
Scott Cain
scott at scottcain.net
Thu Mar 19 14:45:25 EDT 2009
Poorly formed input is the DEVIL! :-)
On Thu, Mar 19, 2009 at 2:43 PM, Ian Holmes <ihh at berkeley.edu> wrote:
> Hi Scott, thanks for helping out...
>
> The "##gff-version 3" line didn't make any difference, but adding a
> "##sequence-region ctgA 1 100000" line did help.
>
> I was hoping to use this in an application that would be forgiving about
> poorly-formed input data (specifically the TWiki plugin for JBrowse). I
> guess I can quite easily sanitize the input GFF (i.e. fake a
> ##sequence-region line for every seqid)... but I'd like to check that
> this is, indeed, necessary rather than duplicating effort..
>
> Cheers,
> Ian
>
>
>
> Scott Cain wrote:
>> Hi Ian,
>>
>> That is rather bizarre behavior you're describing. I have a few
>> questions to get us started:
>>
>> Is there a line that describes what ctgA is (either a
>> ##sequence-region line or a GFF line)?
>>
>> Does the file start with ##gff-version 3 ? I think the seqfeature
>> loader is somewhat particular about this.
>>
>> Scott
>>
>>
>> On Thu, Mar 19, 2009 at 2:29 PM, Ian Holmes <ihh at berkeley.edu> wrote:
>>> Hi, not sure if GMOD is the right place to direct this query, apologies
>>> if not.
>>>
>>> I'm having some problems with Bio::DB::SeqFeature::Store and I hoped
>>> someone better-versed in the internals might be able to help.
>>>
>>> The following GFF file seems to work fine with the "features" method of
>>> Bio::DB::SeqFeature::Store (NB there is a tab at the end of each line,
>>> denoting an empty group field; I have also attached this as a GFF file)
>>>
>>> ctgA bed2gff feature 1659 1984 . . .
>>> ctgA bed2gff feature 3014 6130 . . .
>>> ctgA bed2gff feature 1 50000 . . .
>>>
>>> For example, the following Perl correctly calculates & prints the total
>>> number of features in the file (3):
>>>
>>> use Bio::DB::SeqFeature::Store;
>>> my $db = Bio::DB::SeqFeature::Store->new
>>> (-adaptor => "memory",
>>> -dsn => "testbed.gff");
>>> my @features = $db->features (-seqid => "ctgA");
>>> print @features+0;
>>>
>>>
>>>
>>> However, if I delete the last line of the GFF file (i.e. the feature
>>> from 1 to 50000), it doesn't work (it thinks there are zero features in
>>> the file). It also doesn't seem to mind if I change the coordinates of
>>> the last feature, as long as the endpoint is vaguely "large" (e.g. 49000
>>> works, but 7000 doesn't).
>>>
>>>
>>> Anyone have any ideas? Is this something to do with binning and/or
>>> default start/endpoints in the feature query method?
>>>
>>>
>>> Ian
>>>
>>> ctgA bed2gff feature 1659 1984 . . .
>>> ctgA bed2gff feature 3014 6130 . . .
>>> ctgA bed2gff feature 1660 7000 . . .
>>>
>>>
>>
>>
>>
>
--
------------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research
More information about the Gmod-help
mailing list