[Gmod-gbrowse] [Gmod-help] general questions about data querying
Don Gilbert
gilbertd at cricket.bio.indiana.edu
Wed Nov 11 17:24:44 EST 2009
Dear Ryan, Dave etc.,
This is the sort of thing I do lots of, but using GFF files and perl code.
Often the perl code is a simple one-liner. If your GFF file is sorted
by location, finding overlapping genes on opposite strands is easy:
grep mRNA genes.gff | perl -ne \
'($c,$t,$s,$b,$e,$p,$o)=split; if($c eq $lc and $b<=$le and $o ne $lo){print "<$lgene>$_";}
($lc,$lb,$le,$lo,$lgene)=($c,$b,$e,$o,$_);' | more
.. or grep for exon or UTR features here instead of mRNA.
.. if your .gff isn't sorted by location, sort it first and pipe the result
into the same filter:
sort -k1,1 -k4,4n -k5,5nr genes.gff | ..
You can do this in a MySQL database with SQL code, but perl and unix offer
so many useful kinds of pattern matching and grepping that I find them
much easier for genome processing and analyses.
- Don