[Gmod-help] gmod tool db's and DBMS's

Thu Oct 21 10:31:08 EDT 2010

Hi Tom,

Wow, that's a lot of questions.  I understand that it can be a little
overwhelming, and not exactly obvious where to start; sorry about
that.  I'm going to do my best to answer your questions inline below.

Scott

On Wed, Oct 20, 2010 at 10:01 PM, Walk, Tom <Tom.Walk at ars.usda.gov> wrote:
> PostgreSQL is default for Chado
>
>
>
> http://gmod.org/wiki/Databases_and_GMOD
>
>
>
>  and MySQL is recommended for GBrowse
>
>
>
> http://gmod.org/wiki/GBrowse_Install_HOWTO
>
>
>
> That seems a bit complicated to me.  Does this mean that we should populate
> and maintain our genome DB with PostgreSQL and use MySQL for the browsing?
>

You can, but you don't need to.  Yes, the majority of GBrowse users
use MySQL, but GBrowse supports several RDBMSs and schemas.  The real
issue is whether you should run GBrowse off of Chado directly or have
a separate database for driving GBrowse.  The Chado adaptor for
GBrowse is not nearly as fast as the other adaptors, since Chado
wasn't designed as a schema for driving an interactive web
application, whereas the others were.  For small to medium sized
databases, this isn't an issue, but as the database grows, people find
that the performance of the Chado adaptor suffers.  To me, the main
reason to run GBrowse off of Chado directly is if the content in the
database is being actively annotated and you want to provide "live"
browsing of that content.  Otherwise, dumping the data out
periodically and loading it into a dedicated GBrowse database is a
fairly common thing to do, and when you're doing that, it doesn't
really matter whether the two databases use the same RDBMS or even
whether they are on the same server.

One other point about GBrowse and databases: as of GBrowse2, you can
use more than one database adaptor with a given GBrowse instance.  So
if you wanted to use GBrowse with a very fast data adaptor with data
that doesn't change, and then only pull the data that is actively
changing from your Chado database, you can do that.
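
As a rough sketch of what that could look like in a GBrowse2
configuration file: the database names, DSNs, user name, and track
names below are all made up for illustration, and the exact db_args
depend on your setup, so treat this as a starting point rather than a
tested configuration.

```ini
# Hypothetical example: all names, DSNs, and the user are placeholders.

[GENERAL]
database = static_features

# Fast, dedicated GBrowse database for data that does not change
[static_features:database]
db_adaptor = Bio::DB::SeqFeature::Store
db_args    = -adaptor DBI::mysql
             -dsn     dbi:mysql:database=my_gbrowse_db
             -user    gbrowse_ro

# Chado database holding the data that is being actively annotated
[live_annotations:database]
db_adaptor = Bio::DB::Das::Chado
db_args    = -dsn  dbi:Pg:dbname=my_chado_db
             -user gbrowse_ro

# Track served from the dedicated database
[StableGenes]
database = static_features
feature  = gene
glyph    = generic
key      = Genes (periodic dump)

# Track served live from Chado
[CuratedGenes]
database = live_annotations
feature  = gene
glyph    = generic
key      = Genes (live from Chado)
```

Each track names the database it reads from, so only the tracks that
need "live" data pay the cost of the Chado adaptor.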
>
>
> On the other hand, the Pg adaptor was supposed to be part of BioPerl 1.3.
>
>
>
> http://gmod.svn.sourceforge.net/viewvc/gmod/Generic-Genome-Browser/trunk/docs/pod/ORACLE_AND_POSTGRESQL.pod
>
>
>
> Does BioPerl now have a Pg adaptor?
>
>
>
> Can  we use Gbrowse with PostgreSQL without too much difficulty?  Has it or
> is it being tested?  Is there any support if we want to use and/or test it?

There are GBrowse adaptors for PostgreSQL for both of the "main"
GBrowse database schemas (Bio::DB::GFF and Bio::DB::SeqFeature::Store)
in BioPerl, and they've been there for a while and seem pretty stable
to me.  I know people are using them in production and I haven't
received a bug report in quite some time. The adaptor for Chado is not
part of BioPerl but can be obtained from CPAN.
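
For illustration, database stanzas for the two PostgreSQL-backed
schemas might look roughly like the following in a GBrowse2
configuration file.  The stanza names, database names, and user are
placeholders I made up, and you would normally need only one of the
two stanzas; check the adaptor documentation for the arguments your
installation needs.

```ini
# Hypothetical example: database names and the user are placeholders.

# Bio::DB::SeqFeature::Store schema on PostgreSQL
[features_pg:database]
db_adaptor = Bio::DB::SeqFeature::Store
db_args    = -adaptor DBI::Pg
             -dsn     dbi:Pg:dbname=my_seqfeature_db
             -user    gbrowse_ro

# Bio::DB::GFF schema on PostgreSQL
[gff_pg:database]
db_adaptor = Bio::DB::GFF
db_args    = -adaptor dbi::pg
             -dsn     dbi:Pg:dbname=my_gff_db
             -user    gbrowse_ro
```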
>
>
>
> More basically, I see a list of adaptors
>
> http://gmod.org/wiki/GBrowse#About_Databases
>
> But at this point, I haven't seen a set of supported schema for GBrowse.
> Has anybody compiled a list of supported schema?  Do I just need to get them
> from the adaptor links above?

There is a list of adaptors here:

  http://gmod.org/wiki/GBrowse_adaptors

While the list is not explicit, the adaptors indicate which schemas
are supported.
Currently, the most popular (and fastest) is the schema for the
Bio::DB::SeqFeature::Store database.  If you have GFF3 data, this is
typically the one to go with.  Second in popularity is the schema for
the Bio::DB::GFF database, which is what I would suggest for GFF2
data.  The Chado schema is supported as well, as I wrote above.
Finally, there is an adaptor for BioSQL, but I'm not sure if it is
currently functional, as I don't know if anybody is using it (the
adaptor for GBrowse, not BioSQL itself).
>
>
>
> What about Jbrowse?  Does that do all of its queries with perl modules, ie
> BioSQL or BioPerl?  Does it require or support any DBMS?

JBrowse will talk to any database that GBrowse does, and it does
require BioPerl, but it does not absolutely require a database to run.
When you set a JBrowse instance up, you point a processing script at
your data source, and JBrowse preprocesses all the data and saves it
on the webserver as specially formatted text files, and that is all it
needs to run.
>
>
>
> Additionally, what about Apollo?  Does it need a DBMS or does it rely only
> on JAVA classes?

Apollo does not require an RDBMS either, though many people do use it
with Chado, as that makes for a nice system for annotation: edit in
Apollo, store in Chado, view in GBrowse.  Apollo will read and write
several other data formats though.
>
>
>
> Finally, since I mentioned JAVA, do any of the tools have HQL
> functionality.  Is there any HQL development in gmod?  Is there any interest
> in HQL?

Is that Hibernate?  I'm not a Java person, so I'm not too up on the
lingo.  There are groups that have developed Hibernate tools to talk
to Chado, but none of them are "sanctioned" by GMOD as yet, though the
few that I can think of would probably be willing to share.  There has
been interest in developing a Java ORM for Chado, but developing a
standard has been difficult.

>
>
>
> Thanks for the range of tools and any help you can provide with my
> questions.
>
>
>
> Tom Walk
>
> tom.walk at ars.usda.gov
>
>

-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research