<html>
<body>
Hi Scott,<br><br>
I am writing to report that I was able to bring up the graph by using the
right search term, in this case "NC_007308", the chromosome
sequence ID. I have only one chromosome in the database, loaded with
your script from NCBI downloads.<br><br>
Ref. URL:
<a href="http://www.animalgenome.org/cgi-bin/gbrowse/cattle" eudora="autourl">
http://www.animalgenome.org/cgi-bin/gbrowse/cattle</a><br><br>
Two quick questions from here (I hate to bother you with trivial
questions that I should read up on myself, but you are always quick and to
the point; I hope this won't take much of your time ;-):<br><br>
1- I expected to see the sequence when I zoom in to the bp level, but I
didn't. Where should I look to track this down? (There are 53K+ broken-down
sequence records in the 'fdna' table.)<br><br>
2- I have in my 'ftype' table:<br>
ftypeid fmethod fsource<br>
1 region EMBL<br>
2 gap EMBL<br>
3 gene EMBL<br>
4 mRNA EMBL<br>
5 exon EMBL<br>
6 CDS EMBL<br>
7 RNA EMBL<br>
8 misc_RNA EMBL<br>
9 V_segment EMBL<br>
10 C_region EMBL<br>
I added some of these, but didn't see them show up.<br><br>
BTW, here is what my tables look like (for one chromosome):<br>
o fattribute ( <font color="#FF0000">14</font> records )<br>
o fattribute_to_feature ( <font color="#FF0000">9922</font> records )<br>
o fdata ( <font color="#FF0000">18906</font> records )<br>
o fdna ( <font color="#FF0000">53192</font> records )<br>
o fgroup ( <font color="#FF0000">2920</font> records )<br>
o fmeta ( <font color="#FF0000">4</font> records )<br>
o ftype ( <font color="#FF0000">10</font> records )<br><br>
Best regards,<br><br>
Zhiliang<br><br>
<br>
At 10:10 AM 9/25/2008 -0400, Scott Cain wrote:<br>
<blockquote type=cite class=cite cite="">Hi Zhiliang,<br><br>
If you do a SELECT * FROM ftype in your database, I think you will
get<br>
this as a result:<br><br>
mysql> select * from ftype;<br>
+---------+----------+---------+<br>
| ftypeid | fmethod  | fsource |<br>
+---------+----------+---------+<br>
| 1       | region   | Genbank |<br>
| 2       | gap      | Genbank |<br>
| 3       | gene     | Genbank |<br>
| 4       | mRNA     | Genbank |<br>
| 5       | exon     | Genbank |<br>
| 6       | CDS      | Genbank |<br>
| 7       | RNA      | Genbank |<br>
| 8       | misc_RNA | Genbank |<br>
+---------+----------+---------+<br><br>
These "method" values (types) are the only things you will be able to
display in GBrowse. The region is the chromosome, so you can display
gaps, genes, mRNAs (with exons if you use the processed_transcript
aggregator/glyph), CDSes (with the cds aggregator/glyph) and RNAs.<br>
<br>
To map your other features to what is in your current database<br>
(assuming it's the same as mine), you need to have "NC_007320"
in the<br>
first column, since that is the ID used by the genbank loading
script.<br><br>
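As a rough sketch only (this is not part of the loading script), one way to force that first column is shown here. It assumes your other features are in a tab-delimited GFF file; the function name set_gff_ref and the sample feature line are made up for illustration:<br>
<pre>
```shell
# Rewrite column 1 (the reference sequence name) of tab-delimited GFF
# read from stdin so that it matches the ID stored in the database.
# "NC_007320" is the ID from this thread; pass a different one if needed.
# set_gff_ref is a hypothetical helper name, not an existing tool.
set_gff_ref() {
    ref="${1:-NC_007320}"
    awk -v ref="$ref" 'BEGIN { FS = OFS = "\t" }
        /^#/ || NF == 0 { print; next }   # keep comments and blank lines as-is
        { $1 = ref; print }'
}

# Example with one made-up feature line (coordinates are illustrative only).
printf 'chr1\tEMBL\tV_segment\t100\t200\t.\t+\t.\tV_segment demo1\n' | set_gff_ref NC_007320
```
</pre>
If your database uses a different ID, pass that as the argument instead, and check the result by eye before loading it.<br><br>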
Scott<br><br>
<br>
On Wed, Sep 24, 2008 at 4:55 PM, Zhiliang Hu &lt;zhu@iastate.edu&gt; wrote:<br>
> Hi Scott,<br>
><br>
> I used ncbi powerscript to download the chromosome GenBank file (it took 2<br>
> hrs 10 min), then used your bp_genbank2gff.pl to load the db. It looks like the<br>
> tables are populated:<br>
><br>
> o fattribute ( 13 records )<br>
> o fattribute_to_feature ( 7782 records )<br>
> o fdata ( 10939 records )<br>
> o fdna ( 44259 records )<br>
> o fgroup ( 1953 records )<br>
> o fmeta ( 4 records )<br>
> o ftype ( 8 records )<br>
><br>
> but I still cannot bring up the graphs:<br>
>
<a href="http://www.animalgenome.org/cgi-bin/gbrowse/cattle/" eudora="autourl">
http://www.animalgenome.org/cgi-bin/gbrowse/cattle/</a><br>
><br>
> Could you help me see whether any key part is missing from the config file:<br>
>
<a href="http://www.animalgenome.org/hu/share/scott/cow.conf" eudora="autourl">
http://www.animalgenome.org/hu/share/scott/cow.conf</a><br>
><br>
> Thank you,<br>
><br>
> Zhiliang<br>
><br>
><br>
> At 10:33 AM 9/24/2008 -0400, Scott Cain wrote:<br>
><br>
> There seems to be a problem with BioPerl related to getting the<br>
> sequence directly from GenBank: if I download NC_007320 and then
run<br>
><br>
> bp_genbank2gff.pl --file NC_007320.gbk --dsn
dbi:mysql:test --create<br>
><br>
> it works fine in a couple of seconds. If however I run<br>
><br>
> bp_genbank2gff.pl --accession NC_007320 --dsn
dbi:mysql:test --create<br>
><br>
> I get these two lines over and over again as it runs for a long
time<br>
> (I'm letting it go now so I can see how long it will take and
what<br>
> will eventually happen):<br>
><br>
> Use of uninitialized value in pattern match (m//) at<br>
> /usr/local/share/perl/5.8.8/Bio/SeqIO/genbank.pm line 663, &lt;GEN1&gt; line 115.<br>
> Use of uninitialized value in pattern match (m//) at<br>
> /usr/local/share/perl/5.8.8/Bio/SeqIO/genbank.pm line 667, &lt;GEN1&gt; line 115.<br>
><br>
> Scott<br>
><br>
><br>
> On Wed, Sep 24, 2008 at 9:56 AM, Zhiliang Hu &lt;zhu@iastate.edu&gt; wrote:<br>
>> I repeated it on the same machine (RHEL/RedHat Linux 2.4.21-20.ELsmp, 8GB<br>
>> RAM)<br>
>> - I counted 7 minutes before it quit with "Out of memory!" this time.<br>
>><br>
>> I then installed Bioperl/GBrowse/etc last night on another machine (Linux<br>
>> CentOS, 8GB RAM) and ran the same command in the background.<br>
>><br>
>> This morning I found that the processes had died without loading the db. I<br>
>> didn't find a core dump or anything else, only a file in the /tmp dir created<br>
>> shortly after I started it:<br>
>>
<a href="http://nagrp2.ansci.iastate.edu/zhu/tmp/RJQApIbFbh.txt" eudora="autourl">
http://nagrp2.ansci.iastate.edu/zhu/tmp/RJQApIbFbh.txt</a> -- this
doesn't<br>
>> seem<br>
>> to be right because on the browser it seems to be HUGE:<br>
>><br>
>>
<a href="http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?val=194719399&view=gbwithparts" eudora="autourl">
http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?val=194719399&view=gbwithparts</a>
<br>
>><br>
>> Zhiliang<br>
>><br>
>><br>
>> At 11:17 PM 9/23/2008 -0400, Lincoln Stein wrote:<br>
>><br>
>> It may take a long time to run - try overnight.<br>
>><br>
>> Lincoln<br>
>><br>
>> On Tue, Sep 23, 2008 at 10:46 PM, Scott Cain &lt;cain.cshl@gmail.com&gt; wrote:<br>
>> When I ran it, it spun its wheels for a long time (30+ minutes) and I<br>
>> killed it. I tried the analogous thing with bp_genbank2gff.pl and<br>
>> had to kill it too. I thought it was a problem with BioPerl until<br>
>> just now, when I tried it with an E. coli chromosome from GenBank and<br>
>> it worked fine (it ran a couple of minutes).<br>
>><br>
>> Scott<br>
>><br>
>><br>
>> On Tue, Sep 23, 2008 at 10:13 PM, Lincoln Stein &lt;lincoln.stein@gmail.com&gt;<br>
>> wrote:<br>
>>> Oh no! I've never seen a memory problem before. How long a
time elapses<br>
>>> between the original loading message and the first Out of
memory!<br>
>>> message?<br>
>>><br>
>>> Lincoln<br>
>>><br>
>>> On Tue, Sep 23, 2008 at 1:25 PM, Zhiliang Hu &lt;zhu@iastate.edu&gt; wrote:<br>
>>>><br>
>>>> Scott,<br>
>>>><br>
>>>> I decided to take an "easier" approach as a start - I will try to load<br>
>>>> the NCBI and UCSC cattle genomes into GBrowse. Once that works, I can move<br>
>>>> on to more customized data sets.<br>
>>>><br>
>>>> I have the following questions in doing that:<br>
>>>><br>
>>>> 1.<br>
>>>> A technical one: When I try to load a cattle chromosome using your<br>
>>>> 'load_genbank.pl', I get a memory problem (there is 8 GB RAM on the<br>
>>>> machine - I bet there must be a workaround?)<br>
>>>><br>
>>>> > load_genbank.pl --create -dsn dbi:mysql:gb_cattle
-user --pass<br>
>>>> > --accession NC_007320<br>
>>>> Loading NC_007320...<br>
>>>> Out of memory!<br>
>>>> Out of memory!<br>
>>>> Out of memory!<br>
>>>> Segmentation fault (core dumped)<br>
>>>><br>
>>>> 2.<br>
>>>> Can I load all chromosomes into one database? Or
should I create<br>
>>>> separate databases for each chromosome? (I assume the
former but not sure).<br>
>>>><br>
>>>> 3.<br>
>>>> If I also bring in UCSC golden tracks, should I set up a different<br>
>>>> database, or can I put them into one db, naming the UCSC chromosomes a<br>
>>>> little differently?<br>
>>>><br>
>>>> Thank you,<br>
>>>><br>
>>>> Zhiliang</blockquote></body>
</html>