Announcing Gramene Release 21
Claire Hebbard
cer17 at cornell.edu
Thu May 11 16:50:05 EDT 2006
Dear Cereal Researchers,
The Gramene database (www.gramene.org)
emails its registered users an announcement each time a new release is
made. This announcement includes information on new or updated data and
software at Gramene.
For the current release notes (shown below), visit
www.gramene.org/documentation/release_notes/releasenotes.html.
Data and statistics are located at
ftp://ftp.gramene.org/pub/gramene/release21/data/statistics
Sincerely,
The Gramene Database Team
*****************************************************************
This work is funded by the National Science Foundation (NSF) and the
USDA-Agricultural Research Service, and was previously funded by the USDA
Initiative for Future Agriculture and Food Systems (IFAFS). We are thankful
to numerous collaborators and contributors for help in curation and for
sharing their datasets and tools.
*****************************************************************
******************************************************************
May 2006
Gramene Release 21
New Gramene website features:
NEW! Introduction of two new modules, Pathways and Diversity
New species pages. Some pages are still being completed,
but there is now information for every species
listed. The pages can be accessed by clicking on the species images
at the bottom of the page.
***********************************************************************
Gramene Release Notes
***********************************************************************
Genomes
New Genomes Data
*Genome assembly: The rice genome browser has been updated with
release 4 of the TIGR Rice Pseudomolecules and Genome Annotation
including TIGR gene annotation.
*Chromosomes: Mitochondrion and chloroplast chromosomes and
associated annotations have been added to the browser.
*Browser Tracks: Many datasets were enriched and updated.
They include:
1- 8 EST datasets of various species, updated with NCBI Genbank records
2- 13 BACend_OMAP datasets, updated with NCBI Genbank; 3 OMAP
species are new in this build (RiceGranulata, RiceOfficinalis, RiceRidleyi)
3- 5 ESTcluster_TIGR datasets from TIGR ftp site (TIGR GIs)
4- 5 ESTcluster_PlantGDB from PlantGDB website
5- Other updated datasets include:
Rice_CDS (updated from Genbank)
RiceJaponica_cDNA_KOME (updated from Genbank)
Maize_BACend (updated from Genbank)
Rice_FstTransposon (updated from Genbank)
Rice_FST-TDNA (updated from Genbank)
Rice_FSTtos17 (updated from Genbank)
Rice_BAC (these are Genbank Rice BAC/clones not used by Rice Assembly)
Due to third-party data processing delays,
the following tracks are omitted but will be reinstated in the next
release:
Rice_ArrayOligo_NSF20K
Rice_MPSStag_Meyers
Rice_SAGEtag_MGOS
Rice_TE
* SNP Data: Approx. 4M Oryza sativa SNPs from dbSNP
version 125 (http://www.ncbi.nlm.nih.gov/projects/SNP) have
been mapped to the rice genome assembly. The consequence of
each SNP (synonymous, non-synonymous, UTR, etc.) on affected
gene model transcripts has been annotated.
* Xref mapping: The following external database identifiers for
Oryza sativa have been mapped directly to the rice gene models
via DNA/protein homology:
SwissProt/TrEMBL,
RefSeq,
TIGR Gene Indices.
The following identifiers have also been mapped:
NCBI Genes,
Gramene GenesDB entries,
Gramene Pathway entries.
*New Genomes Features
**Xref mapping: The GeneView pages now display
cross-reference links to external identifiers that have
been mapped. These appear in the 'Similarity Matches'
section. See, for example, LOC_Os01g56810
(http://www.gramene.org/Oryza_sativa/geneview?gene=LOC_Os01g56810).
**SNP display: Features in the SNP track of the
ContigView display are now color coded according to their
consequence w.r.t. overlapping gene models. This color
coding extends to the GeneSNPDisplay so that it is easy
to find, for example, which SNPs affect the peptide
sequence of the gene (i.e. non-synonymous SNPs). See, for example,
LOC_Os01g01510
(http://www.gramene.org/Oryza_sativa/genesnpview?db=core&gene=LOC_Os01g01510).
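The synonymous versus non-synonymous distinction used in the SNP
annotation and color coding above can be illustrated with a minimal
Python sketch. This is not Gramene's annotation pipeline, which is not
described in these notes. The function name classify_coding_snp, the
0-based position convention and the example sequence are assumptions
made for illustration, and only SNPs inside a coding sequence are
handled (UTR and other consequences are left out).

```python
from itertools import product

# Standard genetic code, listed in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(cds, position, alt_base):
    """Classify a SNP at 0-based `position` within coding sequence `cds`.

    Hypothetical helper: assumes `cds` is in frame and upper case.
    """
    codon_start = position - position % 3
    ref_codon = cds[codon_start:codon_start + 3]
    offset = position - codon_start
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    # Same amino acid before and after the change means the peptide is unaffected.
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "synonymous"
    return "non-synonymous"


cds = "ATGGCTGAA"  # Met-Ala-Glu, a made-up example sequence
print(classify_coding_snp(cds, 5, "C"))  # GCT -> GCC, both Ala
print(classify_coding_snp(cds, 4, "T"))  # GCT -> GTT, Ala -> Val
```

The first call reports a synonymous change and the second a
non-synonymous one, which is the kind of SNP that alters the peptide
sequence.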
***********************************************************************
Maps Release Notes
* We have updated our annotated rice sequence to release 4
of the TIGR rice pseudomolecules. As in previous releases,
Gramene has added its own annotations to the map, in the
form of features such as RFLPs, SSRs, and ESTs from rice as well
as other species. This map was previously known as the
"Gramene TIGR Pseudomolecule Assembly of IRGSP Sequence".
To shorten the name, and to emphasize Gramene's
additions to the map, the name of the rice sequence map
has been changed to "Rice-Gramene Annotated Nipponbare
Sequence".
**Updated Sequence Map
Rice-Gramene Annotated Nipponbare Seq 2006 [formerly known as Rice-GR
TIGR Pseudomolecule Assembly of IRGSP Sequence 2005]
**Updated QTL maps
* We have curated additional rice QTL, which have been added to four
existing QTL maps:
Rice-JRGP Koshihikari/Kasalath RIL RFLP QTL 2003
Rice-Integrated IR64/Azucena DH QTL
Rice-CNHZAU Zhenshan97/Minghui63 RI QTL 2002
Rice-Aberdeen Bala/Azucena 2002
***********************************************************************
Markers
The marker database now contains a total of 11,832,129 markers.
Those markers previously classified into "BAC end sequence" and
"Tos17" have been consolidated into "GSS", which contains all
Poaceae genomic sequences.
Marker breakdown by type
AFLP - 950
Centromere - 12
Clone - 2,033,658
EST - 3,675,691
EST Cluster - 1,325,059
Gene - 8,474
Gene Model - 57,503
Gene Primer - 19
GSS - 4,605,001
mRNA - 99,054
Primer - 34
RAPD - 135
RFLP - 7,822
SSR - 16,835
STS - 65
Undefined - 1,817
Marker breakdown by species
Barley (Hordeum spp.) - 637,879
Maize (Zea spp.) - 3,365,983
Oat (Avena spp.) - 8,226
Rice (Oryza spp.) - 4,395,144
Rye (Secale spp.) - 491,841
Sorghum (Sorghum spp.) - 1,401,556
Sugarcane (Saccharum spp.) - 339,661
Wheat (Triticum spp. + Aegilops spp.)- 1,062,830
Other - 129,009
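As a quick consistency check, the two breakdowns above can be summed
and compared with the reported total of 11,832,129 markers. The short
Python sketch below copies the counts from the lists. The variable
names are illustrative only, and the rye count is read as 491,841.

```python
# Marker counts copied from the "by type" breakdown above.
by_type = {
    "AFLP": 950,
    "Centromere": 12,
    "Clone": 2_033_658,
    "EST": 3_675_691,
    "EST Cluster": 1_325_059,
    "Gene": 8_474,
    "Gene Model": 57_503,
    "Gene Primer": 19,
    "GSS": 4_605_001,
    "mRNA": 99_054,
    "Primer": 34,
    "RAPD": 135,
    "RFLP": 7_822,
    "SSR": 16_835,
    "STS": 65,
    "Undefined": 1_817,
}

# Marker counts copied from the "by species" breakdown above.
by_species = {
    "Barley": 637_879,
    "Maize": 3_365_983,
    "Oat": 8_226,
    "Rice": 4_395_144,
    "Rye": 491_841,  # read as 491,841
    "Sorghum": 1_401_556,
    "Sugarcane": 339_661,
    "Wheat": 1_062_830,
    "Other": 129_009,
}

REPORTED_TOTAL = 11_832_129

type_total = sum(by_type.values())
species_total = sum(by_species.values())
print(type_total == REPORTED_TOTAL, species_total == REPORTED_TOTAL)
```

Both sums agree with the reported total.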
***********************************************************************
Proteins
The Gramene protein database provides curated information on
SP-Trembl entries from family Poaceae (Grasses). The
annotations include Pfam, Prosite, TMHMM (for transmembrane
domains), TargetP and Predotar (plastid, mitochondrial
and secretory pathway targeting) and Interpro assignments.
Various ontologies such as Gene Ontology (GO), Plant
Ontologies (PO+GRO) and Environment Ontology (EO) are used
to provide functional characteristics.
Total number of proteins: 76745
Proteins from SWISS-PROT: 2367
Proteins from TrEMBL: 74378
Almost all the rice proteins encoded by plastid and mitochondrial genomes
are annotated.
***********************************************************************
Ontologies
Various ontologies and their associations were updated.
For more details on the different types of ontologies, please
visit the ontology home page.
The ontologies provided are:
*Gene Ontology
*Plant Ontology
*Gramene Plant Growth Stage Ontology
*Trait Ontology
*Gramene Taxonomy Ontology
*Environment Ontology
**The Gramene Taxonomy Ontology now has associations to the
species in the marker database. For example, Oryza
(http://www.gramene.org/db/ontology/search_term?id=GR_tax:013655)
has 19 associated species in the marker database.
**The marker libraries are now associated with ontology terms from the
Plant Structure (PO), Cereal Growth Stage (GRO) and Environment
(EO) ontologies. For example, the plant structure term "leaf"
(http://www.gramene.org/db/ontology/search_term?id=PO:0009025)
has 77 marker libraries, suggesting that individual entries from
these libraries were expressed in the leaf.
***********************************************************************
Genes and Alleles Release Notes
The genes database continues to grow, and Gramene release 21
further improves the previously reorganized gene
search interface.
Content:
* Rice -- 1798 (260 New genes)
* Maize -- 6676
* Total -- 8474
1657 of 1798 (92%) rice genes have literature references.
*The genes present on the rice plastid (157) and
mitochondrial (59) genomes were added and curated.
*Search all rice plastid genes by visiting
http://www.gramene.org/db/genes/search_gene?query=plastid&search_field=chromosome&gene_type_id=&species=1
*Search all rice mitochondrial genes by visiting
http://www.gramene.org/db/genes/search_gene?query=mito*&search_field=chromosome&gene_type_id=&species=1
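The two search links above share one URL pattern, so similar queries
can be put together programmatically. The Python sketch below rebuilds
both links from their parameters. The helper name gene_search_url is
hypothetical, the parameter names and values are copied from the links
above, and whether the server accepts other values is not stated in
these notes.

```python
from urllib.parse import urlencode

# Base URL and parameter names are taken from the two example links above.
BASE_URL = "http://www.gramene.org/db/genes/search_gene"


def gene_search_url(query, search_field="chromosome", gene_type_id="", species="1"):
    """Build a gene search URL in the same shape as the example links.

    Hypothetical helper; the defaults mirror the example links.
    """
    params = [
        ("query", query),
        ("search_field", search_field),
        ("gene_type_id", gene_type_id),
        ("species", species),
    ]
    # safe="*" keeps the wildcard literal, as in the mitochondrial example.
    return BASE_URL + "?" + urlencode(params, safe="*")


print(gene_search_url("plastid"))
print(gene_search_url("mito*"))
```

The two printed URLs match the plastid and mitochondrial links given
above.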
Search improvements:
Genes can be searched by type: Not sequenced, CDS
(Protein coding), rRNA (Ribosomal RNA), tRNA (Transfer RNA),
Pseudogene (non-functional), and "Not classified".
For genes that were identified by classical genetics methods but
have no sequenced loci associated with them, please use
the gene type 'Not sequenced'.
An additional 'has phenotype' filter can be applied to your
gene searches. Selecting 'has phenotype' restricts the search to
genes that have phenotypes.
A more detailed genes database statistics report can be found
at ftp://ftp.gramene.org/pub/gramene/release20/data/statistics/gene_statistics
***********************************************************************
QTL Release Notes
The Gramene QTL database includes a total of 10,495 QTL
identified for numerous agronomic traits in rice, maize,
barley, oat, wheat, pearl millet, foxtail millet and wild rice.
Almost 350 new rice QTL related to low nitrogen
tolerance, tissue culture performance, and other important
traits in rice have been curated from recent publications
and added to the database.
Another improvement in this release is the addition of links
for QTL that were originally integrated from
GrainGenes and MaizeGDB. These links allow users to go
to the original databases for more detailed information
if necessary.
A more detailed QTL database statistics report can be found
at ftp://ftp.gramene.org/pub/gramene/release20/data/statistics/qtl_statistics.
***********************************************************************
Pathways Release Notes
The Pathway tool is a web-based tool for viewing gene
annotations mapped to various biochemical pathways in the plants
rice (Oryza sativa) and Arabidopsis thaliana and in the bacterium
E. coli. This tool also allows you to draw comparisons
among the data sets from these three species.
This is version 1.0 of the rice pathways. The rice
pathways are also called RiceCyc and are curated by
Gramene. The rice genes and their annotations used in this
analysis were based on release 4 of TIGR's assembly
of the Oryza sativa japonica cv. Nipponbare genome sequenced
by IRGSP.
Arabidopsis thaliana and E. coli pathways were imported
from the AraCyc (http://www.arabidopsis.org/tools/aracyc/)
and EcoCyc (http://www.ecocyc.org/) project sites.
RiceCyc version 1.0 has the following contents:
Pathways: 316
Enzymatic Reactions: 1687
Transport Reactions: 5
Polypeptides: 43172
Protein Complexes: 4
Enzymes: 10387
Transporters: 62
Compounds: 1265
Peptides link to the gene model pages in the rice genome browser.
Similarly, if a gene model has been mapped to a biochemical pathway,
the genome browser links to the particular reaction in the pathway
database.
This tool also allows you to upload your own gene expression data
onto the pathways to visualize an overview of the cellular-level
expression profile. For more information, please visit
http://www.gramene.org/pathway/.
***********************************************************************
Diversity Release Notes
The Gramene Genetic Diversity database contains SSR allelic
data for rice, SNP data for wheat, and phenotypic, SSR and
SNP data for maize. Allelic variation at loci across multiple
germplasms of a species, and genome-wide allelic variation
of germplasms, can be viewed by searching for locus/marker
name, germplasm name or accession number.
Wheat and maize data were obtained from their respective
genome projects. For rice, the data were obtained from a paper
published by the McCouch rice group. See the abstract
at http://www.gramene.org/db/literature/pub_search?ref_id=11020.
This release contains:
Rice: 234 germplasms; 169 SSR markers.
Wheat: 48 germplasms; 3802 SNP markers.
Maize: 2793 germplasms; 520 SSR and 897 SNP markers.
For a more detailed summary of the datasets, see
http://www.gramene.org/diversity/summary.html
Please visit the Gramene Diversity Home Page as we continue
to develop our data sources and analysis software tools.
***********************************************************************
For more information on the Gramene modules, review the
most recent Gramene publications:
Gramene: A genomics and genetics resource for rice.
Rice Genetics Newsletter, 2006, Vol. 22, No. 1. 9-16.
(http://www.shigen.nig.ac.jp/rice/rgn/vol22/v01.html)
Gramene: Sowing the seeds of genomics research.
eCALSconnect, March 2006, Vol. 12-3.
(http://cals.cornell.edu/cals/public/comm/pubs/ecalsconnect/vol12-3/features/gramene.cfm)
*****************************************************************************
*****************************************************************************
Claire Hebbard
Gramene Outreach Coordinator
G15 Bradfield Hall
Ithaca, NY 14853
USA
www.gramene.org