[Gmod-help] DB loading
Scott Cain
cain.cshl at gmail.com
Wed Feb 20 12:03:31 EST 2008
Hi Ed,
I'm going to put my comments/answers/questions mixed in with your text
below.
Scott
On Tue, 2008-02-19 at 17:09 -0500, Ed Johnson wrote:
> Hello Helpdesk,
>
> I’ve gotten Chado and GBrowse installed and I’m trying to load some
> EST sequences and their Blast returns. The first pass I used a
> mixture of the bp_* and gmod_ load scripts and couldn’t match the
> Blast data to the loaded sequences.
It isn't clear to me what your goal is here. I am guessing that you
want to run GBrowse directly off of Chado, but I'm not sure that
you need to do that. Chado is a data warehouse for organism data of
various types. If all you want to do is have a browser for your data,
you probably don't need Chado. Another database in the back end would
be faster and easier to deal with, like Bio::DB::GFF or
Bio::DB::SeqFeature::Store. See
http://www.gmod.org/wiki/index.php/GBrowse_adaptors
for a little more information.
Anyway, to directly address the question in the paragraph: using the bp_
and gmod_ load scripts together will result in data ending up in
different databases, which I don't think you want to do. For the rest
of this email, I'm going to assume that you want to use Chado. If not,
you can just ignore it.
>
> What I want to do is load a fasta file of EST sequences and a Blastx
> file run against the sequences. What tools should I be using?
For Chado, you need features both for the ESTs themselves and for the
Blastx results. To create a GFF3 file for the ESTs, you can use
gmod_fasta2gff3.pl in the chado/bin directory. It takes a fasta file
and creates a GFF3 file (optionally with the sequence at the end).
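To give a rough idea of what that conversion amounts to, here is a small Python sketch. It is not the code of gmod_fasta2gff3.pl, and the real script's output may differ in its details. The function name, the "example" source value, and the choice of "EST" as the feature type are assumptions made for illustration. It writes one GFF3 feature per fasta record, spanning the full length of the sequence.

```python
import io


def fasta_to_gff3(fasta_handle, source="example", so_type="EST"):
    """Return GFF3 lines with one full-length feature per FASTA record.

    Hypothetical helper for illustration; 'source' and 'so_type' are
    assumed defaults, not values taken from gmod_fasta2gff3.pl.
    """
    records = []  # (name, length) in file order
    name, length = None, 0
    for line in fasta_handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, length))
            # The identifier is the first word after '>'.
            words = line[1:].split()
            name = words[0] if words else "unnamed"
            length = 0
        else:
            length += len(line)
    if name is not None:
        records.append((name, length))

    lines = ["##gff-version 3"]
    for name, length in records:
        attrs = f"ID={name};Name={name}"
        # Columns: seqid, source, type, start, end, score, strand, phase, attributes
        lines.append("\t".join(
            [name, source, so_type, "1", str(length), ".", ".", ".", attrs]
        ))
    return lines


if __name__ == "__main__":
    demo = io.StringIO(">est1 some description\nACGT\nAC\n>est2\nGGG\n")
    print("\n".join(fasta_to_gff3(demo)))
```

Each EST becomes its own reference sequence here, which is why the feature starts at 1 and ends at the sequence length.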
Then you need to create a GFF3 file for the Blastx results. The BioPerl
script bp_search2gff.pl should do the trick here (though it has honestly
been a little while since I used it).
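As an illustration of the kind of GFF3 line a Blastx hit turns into, here is a Python sketch. It assumes the Blast results are in the 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score), which may not be the format you have. It is not what bp_search2gff.pl does internally, and that script's output will look different. The function name, the "blastx" source, the "protein_match" type, and the "match" ID prefix are all assumptions.

```python
def blast_tab_to_gff3(line, source="blastx", so_type="protein_match", hit_no=1):
    """Convert one tabular Blast line into one GFF3 feature line.

    Hypothetical helper for illustration. The feature is placed on the
    query (the EST); the protein hit goes into the Target attribute.
    """
    f = line.rstrip("\n").split("\t")
    qseqid, sseqid = f[0], f[1]
    qstart, qend, sstart, send = (int(x) for x in f[6:10])
    evalue = f[10]

    # For Blastx, a reverse-frame hit is reported with qstart > qend.
    strand = "+" if qstart <= qend else "-"
    # GFF3 requires start <= end regardless of strand.
    start, end = sorted((qstart, qend))

    attrs = f"ID=match{hit_no};Target={sseqid} {sstart} {send}"
    # Columns: seqid, source, type, start, end, score, strand, phase, attributes
    return "\t".join(
        [qseqid, source, so_type, str(start), str(end), evalue, strand, ".", attrs]
    )


if __name__ == "__main__":
    hit = "est1\tP12345\t85.0\t40\t6\t0\t121\t2\t10\t49\t1e-20\t95.1"
    print(blast_tab_to_gff3(hit))
```

The e-value goes in the score column, and the seqid of each line has to match the ID of an EST feature that is already loaded, or the hits won't attach to anything.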
You'll also need features for the 'other half' of the analysis: that is,
what the ESTs were blasted against. Presumably, that is what you want
the ace parser for below.
>
> I’d also like to load a TGICL assembly .ace file and display the
> contigs. Has anyone written a parser? I’ve seen hints on various
> sites that such a thing might exist but I haven’t found anything firm.
I have no idea. You could try asking on the bioperl mailing list:
http://bioperl.org/mailman/listinfo/bioperl-l
>
> We have a database in-house and tools to accomplish the above, but
> we’re looking for more flexibility. Any help would be appreciated.
>
> Thanks,
>
> Ed
>
> Ed Johnson
> Scientific Computing Professional Specialist
> IBL - Laboratory for Genomics and Bioinformatics
> University Of Georgia
> Room 154 - IBL
> 110 Riverbend Road
> Athens, Ga 30602
> Phone (706) 542-1039
>
--
------------------------------------------------------------------------
Scott Cain, Ph. D. cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory