[Gmod-help] contigs to reference sequence

Scott Cain cain.cshl at gmail.com
Wed Nov 5 22:08:12 EST 2008


Hi Denis,

I'm cc'ing this to the GBrowse mailing list because plenty of people
who use GBrowse deal with this sort of thing.

You have two options; I'll start with the one I think is probably
best for you:

1. Create a supercontig for each of your chromosomes.  That is,
concatenate the contigs that belong to a chromosome, separated by a
fixed number of Ns.  Then create a GFF file that contains the
features with their coordinates mapped to the supercontig.  I believe
there is a script in BioPerl that will help you do that, but I don't
know its name offhand.  Perhaps another reader will.  You can then
use the resulting GFF file with Bio::DB::SeqFeature::Store.

2. If you plan on doing more work with the annotations, like editing
gene models, you might want to consider loading the data into a Chado
database.  While other GBrowse adaptors cannot do recursive coordinate
mapping, the Chado adaptor can (at a rather significant performance
penalty).  That is: I know the coordinates of this gene on a contig,
and I know the coordinates of this contig on the chromosome, so let
the adaptor figure out the coordinates of the gene on the chromosome.
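
To make the two options concrete, here is a rough Python sketch of
the coordinate arithmetic behind each.  It is not the BioPerl script
mentioned above, and it is not how the Chado adaptor is implemented.
The function names, the 5-N gap, and the toy contigs and gene are all
made up for illustration.  It also assumes every contig sits in
forward orientation on the chromosome; a reversed contig would need
its feature coordinates and strands flipped as well.

```python
def build_supercontig(contigs, gap_length):
    """Join (name, sequence) pairs, in chromosome order, separated by Ns.

    Returns the supercontig sequence and a dict mapping each contig name
    to its 0-based offset within the supercontig.
    """
    pieces, offsets, position = [], {}, 0
    for index, (name, sequence) in enumerate(contigs):
        if index > 0:
            pieces.append("N" * gap_length)
            position += gap_length
        offsets[name] = position
        pieces.append(sequence)
        position += len(sequence)
    return "".join(pieces), offsets


def remap_gff_line(line, offsets, supercontig_name):
    """Option 1: shift one GFF feature line onto the supercontig."""
    fields = line.rstrip("\n").split("\t")
    offset = offsets[fields[0]]
    fields[0] = supercontig_name
    fields[3] = str(int(fields[3]) + offset)  # start (1-based, inclusive)
    fields[4] = str(int(fields[4]) + offset)  # end (1-based, inclusive)
    return "\t".join(fields)


def locate(feature, placements):
    """Option 2: follow feature -> parent placements up to the top level.

    placements maps a feature to (parent, 1-based start, 1-based end).
    """
    parent, start, end = placements[feature]
    while parent in placements:
        grandparent, parent_start, _parent_end = placements[parent]
        start += parent_start - 1
        end += parent_start - 1
        parent = grandparent
    return parent, start, end


# Option 1: precompute chromosome coordinates once, then write them out.
contigs = [("contig1", "ACGTACGTAC"), ("contig2", "GGGCCC")]
supercontig, offsets = build_supercontig(contigs, gap_length=5)
gene = "contig2\texample\tgene\t2\t5\t.\t+\t.\tID=gene1"
remapped = remap_gff_line(gene, offsets, "chr1")
print(remapped)

# Option 2: keep the gene on its contig and resolve the position on demand.
placements = {
    "gene1": ("contig2", 2, 5),
    "contig2": ("chr1", 16, 21),
}
print(locate("gene1", placements))
```

Both routes put the toy gene at 17..20 on chr1.  The difference is
when the arithmetic happens: once, when you build the GFF file, or on
every lookup.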

Scott



On Wed, Nov 5, 2008 at 7:40 PM, Denis O'Meally <denis.omeally at anu.edu.au> wrote:
> Hi,
>
> I've recently set up a local GBrowse for the wallaby genome project and have
> a demo system up and running with the tutorials and a few of our own wallaby
> annotations. One thing that I'm uncertain about is how to store a reference
> sequence for a chromosome that's composed of multiple contigs (we have an
> integrated physical and linkage map, so can place the contigs on
> chromosomes). I'm currently just using the memory adaptor (but will use
> mysql) with Bio::DB::SeqFeature::Store. Any advice you can provide, or
> tutorial you can direct me to would be very much appreciated.
> Cheers,
> Denis
>
> --
> Denis O'Meally
> Comparative Genomics Group
> Research School of Biological Sciences
> The Australian National University
> GPO Box 475, Canberra, ACT 2601
>  Tel: +61 2 6125 2371
> Email: denis.omeally at anu.edu.au
>
>



-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


