<html>
<body>
Hi Scott,<br><br>
I used the NCBI powerscript to download the chromosome GenBank file (it took
2 hrs 10 min), then used your bp_genbank2gff.pl to load the db. It looks
like the tables are populated:<br><br>
o fattribute ( <font color="#FF0000">13</font> records )<br>
o fattribute_to_feature ( <font color="#FF0000">7782</font> records
)<br>
o fdata ( <font color="#FF0000">10939</font> records )<br>
o fdna ( <font color="#FF0000">44259</font> records )<br>
o fgroup ( <font color="#FF0000">1953</font> records )<br>
o fmeta ( <font color="#FF0000">4</font> records )<br>
o ftype ( <font color="#FF0000">8</font> records )<br><br>
but I still cannot bring up the graphs:<br>
<a href="http://www.animalgenome.org/cgi-bin/gbrowse/cattle/" eudora="autourl">
http://www.animalgenome.org/cgi-bin/gbrowse/cattle/</a><br><br>
Could you help me check whether any key part is missing from the config
file:<br>
<a href="http://www.animalgenome.org/hu/share/scott/cow.conf" eudora="autourl">
http://www.animalgenome.org/hu/share/scott/cow.conf</a><br><br>
Thank you,<br><br>
Zhiliang<br><br>
<br>
At 10:33 AM 9/24/2008 -0400, Scott Cain wrote:<br>
<blockquote type=cite class=cite cite="">There seems to be a problem with
BioPerl related to getting the<br>
sequence directly from GenBank: if I download NC_007320 and then
run<br><br>
bp_genbank2gff.pl --file NC_007320.gbk --dsn
dbi:mysql:test --create<br><br>
it works fine in a couple of seconds. If however I run<br><br>
bp_genbank2gff.pl --accession NC_007320 --dsn dbi:mysql:test
--create<br><br>
I get these two lines over and over again as it runs for a long time<br>
(I'm letting it go now so I can see how long it will take and what<br>
will eventually happen):<br><br>
Use of uninitialized value in pattern match (m//) at<br>
/usr/local/share/perl/5.8.8/Bio/SeqIO/genbank.pm line 663, &lt;GEN1&gt;
line<br>
115.<br>
Use of uninitialized value in pattern match (m//) at<br>
/usr/local/share/perl/5.8.8/Bio/SeqIO/genbank.pm line 667, &lt;GEN1&gt;
line<br>
115.<br><br>
Scott<br><br>
<br>
On Wed, Sep 24, 2008 at 9:56 AM, Zhiliang Hu &lt;zhu@iastate.edu&gt;
wrote:<br>
> I repeated it on the same machine (RHEL/RedHat Linux 2.4.21-20.ELsmp,
8GB RAM)<br>
> - I counted 7 minutes before it quit with "Out of memory!"
this time.<br>
><br>
> I then installed Bioperl/GBrowse/etc. last night on another machine
(Linux<br>
> CentOS, 8GB RAM) and tried running the same thing in the background.<br>
><br>
> This morning I found the processes had died without loading the db.
I<br>
> didn't find a core dump or anything else, only a file in the /tmp dir
created<br>
> shortly after I started it:<br>
>
<a href="http://nagrp2.ansci.iastate.edu/zhu/tmp/RJQApIbFbh.txt" eudora="autourl">
http://nagrp2.ansci.iastate.edu/zhu/tmp/RJQApIbFbh.txt</a> -- this
doesn't seem<br>
> to be right, because in the browser the record seems to be HUGE:<br>
>
<a href="http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?val=194719399&amp;view=gbwithparts" eudora="autourl">
http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?val=194719399&amp;view=gbwithparts</a>
<br>
><br>
> Zhiliang<br>
><br>
><br>
> At 11:17 PM 9/23/2008 -0400, Lincoln Stein wrote:<br>
><br>
> It may take a long time to run - try overnight.<br>
><br>
> Lincoln<br>
><br>
> On Tue, Sep 23, 2008 at 10:46 PM, Scott Cain
&lt;cain.cshl@gmail.com&gt; wrote:<br>
> When I ran it, it spun its wheels for a long time (30+ minutes) and
I<br>
> killed it. I tried the analogous thing with
bp_genbank2gff.pl and<br>
> had to kill it too. I thought it was a problem with BioPerl
until<br>
> just now, when I tried it with an E. coli chromosome from GenBank
and<br>
> it worked fine (it ran a couple of minutes).<br>
><br>
> Scott<br>
><br>
><br>
> On Tue, Sep 23, 2008 at 10:13 PM, Lincoln Stein
&lt;lincoln.stein@gmail.com&gt;<br>
> wrote:<br>
>> Oh no! I've never seen a memory problem before. How long a time
elapses<br>
>> between the original loading message and the first Out of
memory! message?<br>
>><br>
>> Lincoln<br>
>><br>
>> On Tue, Sep 23, 2008 at 1:25 PM, Zhiliang Hu
&lt;zhu@iastate.edu&gt; wrote:<br>
>>><br>
>>> Scott,<br>
>>><br>
>>> I decided to take an "easier" approach as a start -
I will try to load<br>
>>> NCBI<br>
>>> and UCSC cattle genomes into GBrowse. Once that works, I
can move on with<br>
>>> more customized data sets.<br>
>>><br>
>>> I have the following questions in doing that:<br>
>>><br>
>>> 1.<br>
>>> A technical one: When I try to load a cattle chromosome
using your<br>
>>> 'load_genbank.pl', I get a memory problem (there is 8 GB RAM
on the<br>
>>> machine<br>
>>> - I bet there must be a workaround?)<br>
>>><br>
>>> > load_genbank.pl --create -dsn dbi:mysql:gb_cattle -user
--pass<br>
>>> > --accession NC_007320<br>
>>> Loading NC_007320...<br>
>>> Out of memory!<br>
>>> Out of memory!<br>
>>> Out of memory!<br>
>>> Segmentation fault (core dumped)<br>
>>><br>
>>> 2.<br>
>>> Can I load all chromosomes into one database, or
should I create<br>
>>> a separate<br>
>>> database for each chromosome? (I assume the former, but I'm not
sure.)<br>
>>><br>
>>> 3.<br>
>>> If I also bring in UCSC golden tracks, should I set up a
different<br>
>>> database, or can I put them into one db, naming UCSC
chromosomes a little<br>
>>> differently?<br>
>>><br>
>>> Thank you,<br>
>>><br>
>>> Zhiliang<br>
>>><br>
>>><br>
>>> --<br>
>>> Zhi-Liang Hu (PhD)<br>
>>> Associate Scientist,<br>
>>> Department of Animal Science,<br>
>>> Center for Integrated Animal Genomics,<br>
>>> National Animal Genome Research Program,<br>
>>> Iowa State University<br>
>>> Tel: 901-759-0643<br>
>>> Mob: 901-212-2820<br>
>>> Web:
<a href="http://www.animalgenome.org/" eudora="autourl">
http://www.animalgenome.org</a><br>
>>><br>
>>> "Not everything that counts can be counted, and<br>
>>> not everything that can be counted
counts."<br>
>><br>
>><br>
>> --<br>
>> Lincoln D. Stein<br>
>><br>
>> Ontario Institute for Cancer Research<br>
>> 101 College St., Suite 800<br>
>> Toronto, ON, Canada M5G0A3<br>
>> 416 673-8514<br>
>> Assistant: Stacey Quinn &lt;Stacey.Quinn@oicr.on.ca&gt;<br>
>><br>
>> Cold Spring Harbor Laboratory<br>
>> 1 Bungtown Road<br>
>> Cold Spring Harbor, NY 11724 USA<br>
>> (516) 367-8380<br>
>> Assistant: Sandra Michelsen &lt;michelse@cshl.edu&gt;<br>
>><br>
><br>
><br>
><br>
> --<br>
>
------------------------------------------------------------------------<br>
> Scott Cain, Ph. D. cain.cshl@gmail.com<br>
> GMOD Coordinator
(<a href="http://gmod.org/" eudora="autourl">http://gmod.org/</a>)
216-392-3087<br>
> Cold Spring Harbor Laboratory<br>
><br>
><br>
><br>
><br>
> --<br>
> Lincoln D. Stein<br>
><br>
> Ontario Institute for Cancer Research<br>
> 101 College St., Suite 800<br>
> Toronto, ON, Canada M5G0A3<br>
> 416 673-8514<br>
> Assistant: Stacey Quinn &lt;Stacey.Quinn@oicr.on.ca&gt;<br>
><br>
> Cold Spring Harbor Laboratory<br>
> 1 Bungtown Road<br>
> Cold Spring Harbor, NY 11724 USA<br>
> (516) 367-8380<br>
> Assistant: Sandra Michelsen &lt;michelse@cshl.edu&gt;<br><br>
<br><br>
-- <br>
------------------------------------------------------------------------<br>
Scott Cain, Ph. D. cain.cshl@gmail.com<br>
GMOD Coordinator
(<a href="http://gmod.org/" eudora="autourl">http://gmod.org/</a>)
216-392-3087<br>
Cold Spring Harbor Laboratory</blockquote></body>
</html>