Hi Jennifer,<br><br>In my lab we've tested the combination of GBrowse and Bio::DB::GFF on 5X coverage of C. elegans from Illumina sequencing (about 2.5 million reads). Each read is stored as an individual feature, and the read sequences are kept in a FASTA file. Under these conditions, both the display and the responsiveness were fine. However, this is a pretty small data set compared to typical "SNP calling" experiments on vertebrates. For those, I think you will want to store the coverage information in a "wiggle" track, plus a list of the called SNPs and associated data such as allele frequencies. I am working on a specialized display for this type of data, but no release date has been set.<br>
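<br>A minimal sketch of the coverage-track idea, assuming read alignments are available as 1-based, inclusive (start, end) pairs that all fall inside the reference: it computes per-base depth and writes it in the UCSC fixedStep wiggle layout. The function names, the "chrI" label, and the toy reads are illustrative assumptions, and how the resulting file is loaded into GBrowse is not covered here.<br>
<pre>
```python
def coverage_per_base(reads, length):
    """Return read depth for positions 1..length.

    `reads` holds (start, end) pairs, 1-based and inclusive, and every
    read is assumed to lie within the reference.
    """
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (length + 2)
    for start, end in reads:
        diff[start] += 1
        diff[end + 1] -= 1
    # A running sum over the difference array gives the depth at each base.
    depth, running = [], 0
    for pos in range(1, length + 1):
        running += diff[pos]
        depth.append(running)
    return depth


def to_fixed_step_wiggle(chrom, depth, name="coverage"):
    """Format per-base depth as fixedStep wiggle text (one value per base)."""
    lines = [
        f"track type=wiggle_0 name={name}",
        f"fixedStep chrom={chrom} start=1 step=1",
    ]
    lines.extend(str(d) for d in depth)
    return "\n".join(lines) + "\n"


# Toy data (hypothetical): three overlapping reads on a 10-base reference.
reads = [(1, 4), (3, 6), (5, 8)]
depth = coverage_per_base(reads, 10)
wig = to_fixed_step_wiggle("chrI", depth)
```
</pre>
The point of this layout is that the track size depends on the length of the reference rather than on the number of reads, which is why it scales better than one feature per read.<br>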
<br>Lincoln<br><br><div class="gmail_quote">On Thu, May 29, 2008 at 11:41 AM, Scott Cain &lt;<a href="mailto:cain.cshl@gmail.com">cain.cshl@gmail.com</a>&gt; wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Hi Jennifer,<br>
<br>
I'm cc'ing the GMOD schema mailing list, because there have been other<br>
people wondering the same thing.<br>
<br>
First I should say that I don't really know, because no one has tried<br>
it. That said, I can tell you that the FlyBase Chado database has several<br>
million rows in its feature table and it works for them. What you would<br>
no doubt need is a database server with enough horsepower and memory<br>
to do the job, and that server would have to be properly tuned for<br>
performance.<br>
<br>
For use with GBrowse, I don't think I would advocate using Chado<br>
directly, as the Chado adaptor for GBrowse is significantly slower than<br>
the Bio::DB::SeqFeature::Store database, which is designed specifically to<br>
give speedy query results for use with GBrowse. You could set up a<br>
system where you use Chado as your working/annotation database and then<br>
set up a periodic dump of your features to GFF3 which would get loaded<br>
into a SeqFeature::Store database for use with GBrowse.<br>
<br>
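To make the GFF3 dump step concrete, here is a rough sketch of what one aligned read could look like as a GFF3 line. The function name, the "illumina" source, and the "read" feature type are assumptions for illustration only; the Chado export itself and the SeqFeature::Store loader are not shown.<br>
<pre>
```python
from urllib.parse import quote


def read_to_gff3(seqid, start, end, strand, read_id,
                 source="illumina", ftype="read"):
    """Format one aligned read as a 9-column, tab-separated GFF3 line.

    Coordinates are 1-based and inclusive, as GFF3 requires. The source
    and type defaults are hypothetical placeholders.
    """
    # Percent-encode the ID so characters reserved in GFF3 attributes
    # (such as ';', '=', '&', ',') cannot break the attribute column.
    attrs = f"ID={quote(read_id, safe='')}"
    columns = [seqid, source, ftype, str(start), str(end),
               ".",      # score: not set
               strand,
               ".",      # phase: only meaningful for CDS features
               attrs]
    return "\t".join(columns)


# Hypothetical read aligned to the forward strand of chrI.
line = read_to_gff3("chrI", 100, 135, "+", "read_000001")
gff3_text = "##gff-version 3\n" + line + "\n"
```
</pre>
<br>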
Also, in the upcoming release of GBrowse there will be support for<br>
wiggle tracks like in the UCSC browser, which will be well suited for<br>
displaying things like coverage density in a fast-rendering way.<br>
<br>
Scott<br>
<div><div></div><div class="Wj3C7c"><br>
<br>
On Thu, 2008-05-29 at 08:38 -0600, Jennifer Beane wrote:<br>
> Hi,<br>
><br>
> I'm a post-doctoral fellow in bioinformatics and my lab is about to<br>
> receive data generated from a massively parallel sequencing platform<br>
> -- Illumina's genome analyzer. The data will contain several million<br>
> short sequence reads from mRNA and microRNA. There are several<br>
> software packages to align the reads to the human genome, but I will<br>
> need to create a way to store, filter, and efficiently annotate these<br>
> reads. I'm thinking of loading the data into a chado database, and<br>
> using applications such as GBrowse to view the data. I'm wondering if<br>
> you have any experience with using GMOD software/applications for this<br>
> type of data? I'm wondering if the data will be too extensive to be<br>
> queried in a database? If you have any advice/suggestions I would<br>
> really appreciate it.<br>
><br>
> Thank you very much,<br>
> Jennifer Beane, Ph.D<br>
> Post-doctoral Fellow<br>
> Boston University School of Medicine<br>
</div></div><font color="#888888">--<br>
------------------------------------------------------------------------<br>
Scott Cain, Ph. D. <a href="mailto:cain@cshl.edu">cain@cshl.edu</a><br>
GMOD Coordinator (<a href="http://www.gmod.org/" target="_blank">http://www.gmod.org/</a>) 216-392-3087<br>
Cold Spring Harbor Laboratory<br>
<br>
</font></blockquote></div><br><br clear="all"><br>-- <br>Lincoln D. Stein<br><br>Ontario Institute for Cancer Research<br>101 College St., Suite 800<br>Toronto, ON, Canada M5G0A3<br>416 673-8514<br>Assistant: Stacey Fairfield &lt;<a href="mailto:Stacey.Fairfield@oicr.on.ca">Stacey.Fairfield@oicr.on.ca</a>&gt;<br>
<br>Cold Spring Harbor Laboratory<br>1 Bungtown Road<br>Cold Spring Harbor, NY 11724 USA<br>(516) 367-8380 <br>Assistant: Sandra Michelsen &lt;<a href="mailto:michelse@cshl.edu">michelse@cshl.edu</a>&gt;