[Gmod-schema] [Gmod-help] gmod bulk upload
Scott Cain
cain.cshl at gmail.com
Thu May 8 17:30:09 EDT 2008
Hi Josh,
The stuff about Pg 8.3 is good to know. I am concerned, though, that tools
written for Chado and GBrowse will fail with Pg 8.3 because of changes in
the way casting is done. Of course, I haven't done any testing of it
yet :-)
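
For reference, the casting change in question: according to the 8.3
release notes, non-character types are no longer implicitly cast to text,
so an expression like feature_id LIKE '12%' on an integer column, which
earlier releases accepted, needs an explicit cast. Below is a minimal
Python sketch of the kind of fix a tool might need. The helper name
as_text and both example queries are hypothetical, not taken from Chado or
GBrowse code, and the %s placeholder assumes a psycopg-style driver.
Nothing here has been run against a real database.

```python
def as_text(column: str) -> str:
    """Wrap a column reference in an explicit cast to text (hypothetical helper)."""
    return f"CAST({column} AS text)"


# Relies on the implicit integer-to-text cast that 8.3 removed (illustrative query).
old_query = "SELECT feature_id FROM feature WHERE feature_id LIKE %s"

# Same query with the cast spelled out; CAST(... AS text) is standard SQL
# and should be accepted by older releases as well as 8.3.
new_query = f"SELECT feature_id FROM feature WHERE {as_text('feature_id')} LIKE %s"

print(new_query)
```

Whether any Chado or GBrowse tool actually contains queries of this shape
would still need to be checked by testing against 8.3.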
Scott
On Thu, 2008-05-08 at 17:09 -0400, Josh Goodman wrote:
> I just wanted to add that since that discussion about multibyte encodings, PostgreSQL has improved
> its LIKE/ILIKE query performance when operating on multibyte encodings in the 8.3 series
> (http://www.postgresql.org/docs/8.3/interactive/release-8-3.html). I have not yet run any
> benchmarks to compare performance in Chado.
>
> The cumulative slowdown you describe as more features are loaded does not sound like an encoding
> issue to me, based on my experience.
>
> I would instead start looking at hardware bottlenecks or PostgreSQL config options that might be set
> too low. I don't know enough about gmod_bulk_load_gff3.pl to say if there are potential problems there.
>
> Josh
>
>
> Dave Clements, GMOD Help Desk wrote:
> > Dear Stephen,
> >
> > I don't think this (lack of) performance is typical. It suggests to
> > me something in the database is going awry. It could be any number of
> > things (see http://gmod.org/PostgreSQL_Performance_Tips for some of
> > them).
> >
> > It could also be the default encoding for the database. If Postgres
> > is using a multibyte character encoding then that can slow things down
> > by a couple of orders of magnitude. See
> >
> > http://sourceforge.net/mailarchive/forum.php?thread_name=200711082012.lA8KCtV15976%40cricket.bio.indiana.edu&forum_name=gmod-schema
> >
> > for a discussion of that.
> >
> > Has anything improved or have you discovered anything new since you
> > sent the e-mail?
> >
> > I am also cross-posting this to the GMOD Schema list as people there
> > may have suggestions.
> >
> > Thanks,
> >
> > Dave C
> > GMOD Help Desk
> >
> > On Wed, May 7, 2008 at 7:10 PM, Stephen Ficklin
> > <FICKLIN at exchange.clemson.edu> wrote:
> >> Hello,
> >>
> >> We have an installation of Chado that has about 7 million records in the
> >> feature table. We're uploading our data as GFF files using
> >> gmod_bulk_load_gff3.pl and we find that it is taking a very long time. It
> >> has taken about 28 hours to upload 190,220 entries in two GFF files. Is
> >> this normal? It seems the more entries we add to the database the slower
> >> these uploads become. We still have over a million more records to add to
> >> the database. Is there any way we can speed up this upload?
> >>
> >> Thanks,
> >>
> >> Stephen Ficklin
> >> Clemson University Genomics Institute
> >> http://www.genome.clemson.edu/
> >> 864-656-4298
> >
--
------------------------------------------------------------------------
Scott Cain, Ph. D. cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory