[Gmod-help] Re: fast loader for Postgresql
Scott Cain
scott at scottcain.net
Wed Jan 26 15:40:59 EST 2011
Hi Margie,
I'm going to cc this to the GBrowse mailing list to get the question a
wider audience.
It's been a while since I looked at the postgres adaptor for
Bio::DB::SeqFeature::Store, but I'm reasonably sure the fast load
option is not implemented for it. If someone would like to contradict
me with either an example or a patch, please feel free. Sorry Margie.
In general, running GBrowse2 with different database backends for
different tracks is not a problem, but I understand that you may have
architectural issues outside of GBrowse that make this impractical.
Given that, I think you have sorted out the options available to
you (with the additional option of writing the fast loading code for
postgres yourselves and submitting it back so we can get it into
BioPerl--hey, a guy can hope!). It's likely that breaking the files up
into smaller chunks will help. The only word of caution I would add
is that if you are using the --summary option, run it only on the last
load; on the intermediate loads it would just waste compute
cycles.
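A minimal sketch of the chunked approach, in Python, in case it helps.
It assumes the GFF file has one feature per line with no parent/child
relationships spanning chunks (typical for SNP files) and no embedded
FASTA section. The function names, the chunk size, and the DSN are
hypothetical. The --adaptor, --dsn, --create and --summary flags are
the ones I recall from bp_seqfeature_load.pl; check them against the
help output of your installed version before relying on this.

```python
from typing import Iterable, Iterator, List


def chunk_gff_lines(lines: Iterable[str], features_per_chunk: int) -> Iterator[List[str]]:
    """Yield lists of GFF lines holding at most features_per_chunk features each.

    Leading '##' directives (e.g. '##gff-version 3') are repeated at the top
    of every chunk so each chunk is a loadable file on its own.
    """
    if features_per_chunk < 1:
        raise ValueError("features_per_chunk must be at least 1")
    header: List[str] = []
    body: List[str] = []
    count = 0
    in_header = True
    for line in lines:
        if in_header and line.startswith("##"):
            header.append(line)
            continue
        in_header = False
        body.append(line)
        # Count only real feature lines, not blanks or comments.
        if line.strip() and not line.startswith("#"):
            count += 1
            if count == features_per_chunk:
                yield header + body
                body, count = [], 0
    if count:
        yield header + body


def build_load_commands(chunk_paths: List[str], dsn: str,
                        adaptor: str = "DBI::Pg") -> List[List[str]]:
    """Build one loader command per chunk.

    Assumption: --create is wanted only on the first load (it initializes the
    database) and --summary only on the last, as suggested above.
    """
    commands = []
    last = len(chunk_paths) - 1
    for i, path in enumerate(chunk_paths):
        cmd = ["bp_seqfeature_load.pl", "--adaptor", adaptor, "--dsn", dsn]
        if i == 0:
            cmd.append("--create")
        if i == last:
            cmd.append("--summary")
        cmd.append(path)
        commands.append(cmd)
    return commands
```

Each chunk can be written to its own file and the resulting command
lists passed to subprocess.run one at a time, so a failed chunk can be
retried without redoing the whole load.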
Scott
On Wed, Jan 26, 2011 at 3:04 PM, Margie Manker
<manker at populargenetics.ca> wrote:
> Hi Scott/GMOD Help Desk
>
>
>
> We are using 'bp_seqfeature_load.pl' to load a Postgresql
> DB::SeqFeature::Store database. Upon load of a SNP.gff file, we found the
> task took about 1.5 weeks to complete. Thus we have researched whether there
> exists a fast loader for Postgresql; we are aware that there are fast and
> bulk loaders for MySQL.
>
>
>
> Our search (GMOD site, mailing lists, general Google search, etc.) yielded
> no information on an existing fast/bulk loader for Postgresql, which led us
> to send this email.
>
>
>
> It seems we have two options available to us to increase the speed of load:
> 1) move to a MySQL platform for our GBrowse database; 2) split large files
> to be loaded into smaller bits and load as such.
>
>
>
> Option 1 is not one we prefer. We have other Postgresql databases that
> interact with the GBrowse database, which we do not want to move to the
> MySQL platform. Also, we would prefer not to have two different database
> platforms (i.e. MySQL and Postgresql) serving the web site that features
> GBrowse because we have a query feature on the web site that interacts with
> GBrowse, both of which we would like to be served by a Postgresql database.
>
>
>
> Our question: has anyone written a fast or bulk loader for Postgresql? If so,
> could you point us in that direction? If not, do you have any other
> suggestions we may not have presented above?
>
>
>
> Please advise. Thank you for your time.
>
>
>
> Margie Manker
>
>
>
> Project Analyst
>
> The Centre for Applied Genomics
>
> The Hospital for Sick Children
>
> manker at populargenetics.ca
>
>
--
------------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research