[Gramene] Semi-Automated Ontology Generation within OBO-Edit
Thomas Waechter
thomas.waechter at biotec.tu-dresden.de
Wed Sep 15 05:00:07 EDT 2010
Dear OBO ontology developers,
We recently contributed an "Ontology Generation Tool" as part of
OBO-Edit (from version 2.1). This summer I presented the software,
including our evaluation on GO and MeSH, at ISMB in Boston (see the
abstract and link below). Meanwhile, some people have started using the
service. If you have already tried the tool, please send us feedback on
your experience. Otherwise, I would like to invite you to give it a try.
The software helps you extract *terms from text* so that you can extend
your ontology faster, and *find definitions* for your terms, including a
reference to the original URL. The tool can be found in the Tools
section of OBO-Edit.
*To gear the development towards your needs, we are interested in
answers to the following questions:*
- In which domain would you like to use the Ontology Generation Tool?
- Have you been able to generate relevant vocabulary (try filtering)?
- Have you been satisfied with the proposed definitions?
- Have you had technical issues (e.g. some people reported that they
cannot connect from their location)?
My contact email: thomas.waechter at biotec.tu-dresden.de
/Bioinformatics. 2010 June 15; 26(12): i88–i96./
*Semi-Automated Ontology Generation within OBO-Edit*
/Thomas Wächter and Michael Schroeder
Biotechnology Center (BIOTEC), Technische Universität Dresden, 01062
Dresden, Germany/
*ABSTRACT*
*Motivation:*
Ontologies and taxonomies have proven highly beneficial for biocuration.
The Open Biomedical Ontology (OBO) Foundry alone lists over 90
ontologies mainly built with OBO-Edit. Creating and maintaining such
ontologies is a labour-intensive, difficult, manual process. Automating
parts of it is of great importance for the further development of
ontologies and for biocuration.
*Results:*
We have developed the /Dresden Ontology Generator for Directed Acyclic
Graphs (DOG4DAG)/, a system which supports the creation and extension of
OBO ontologies by semi-automatically generating terms, definitions and
parent–child relations from text in PubMed, the web and PDF
repositories. DOG4DAG is seamlessly integrated into OBO-Edit. It
generates terms by identifying statistically significant noun phrases in
text. For definitions and parent–child relations it employs
pattern-based web searches. We systematically evaluate each generation
step using manually validated benchmarks. The term generation leads to
high-quality terms also found in manually created ontologies. Up to 78%
of definitions are valid and up to 54% of child–ancestor relations can
be retrieved. There is no other validated system that achieves
comparable results.
By combining the prediction of high-quality terms, definitions and
parent–child relations with the ontology editor OBO-Edit we contribute a
thoroughly validated tool for all OBO ontology engineers.
*Availability:*
DOG4DAG is available within OBO-Edit 2.1 at http://www.oboedit.org
*The full article is available open access on PubMed Central:*
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881373/
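For readers who want a feel for the two ideas named in the abstract (statistically significant phrases as term candidates, and pattern-based definition search), here is a minimal sketch in Python. It is not DOG4DAG's implementation. It makes several assumptions: stopword-free word n-grams stand in for real noun-phrase detection, a log-likelihood ratio against a background text serves as the significance score, and a single "X is a ..." regular expression serves as the definition pattern. The function names, the stopword list and the parameters are all hypothetical. Both input texts are assumed to be non-empty.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption of this sketch).
STOPWORDS = {"a", "an", "the", "of", "in", "and", "is", "are",
             "to", "by", "for", "with", "that", "from"}


def candidate_phrases(text, max_len=3):
    """Yield word n-grams without stopwords (a crude noun-phrase stand-in)."""
    words = re.findall(r"[a-z][a-z-]*", text.lower())
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if not any(w in STOPWORDS for w in gram):
                yield " ".join(gram)


def log_likelihood(a, b, n1, n2):
    """Log-likelihood ratio for a phrase seen a times in n1 and b times in n2."""
    e1 = n1 * (a + b) / (n1 + n2)  # expected count in the domain text
    e2 = n2 * (a + b) / (n1 + n2)  # expected count in the background text
    g2 = 0.0
    if a:
        g2 += a * math.log(a / e1)
    if b:
        g2 += b * math.log(b / e2)
    return 2 * g2


def rank_terms(domain_text, background_text, top=5):
    """Rank phrases that are over-represented in the domain text."""
    dom = Counter(candidate_phrases(domain_text))
    bg = Counter(candidate_phrases(background_text))
    n1, n2 = sum(dom.values()), sum(bg.values())
    scored = []
    for term, a in dom.items():
        b = bg.get(term, 0)
        # Keep only phrases relatively more frequent in the domain text
        # (cross-multiplied to avoid dividing by the corpus sizes).
        if a * n2 > b * n1:
            scored.append((log_likelihood(a, b, n1, n2), term))
    return [term for _, term in sorted(scored, reverse=True)[:top]]


def find_definitions(term, text):
    """Return sentences of the form '<term> is a/an/the ...' as candidates."""
    pattern = re.compile(
        r"\b" + re.escape(term) + r"\s+(?:is|are)\s+(?:a|an|the)\s+[^.]+\.",
        re.IGNORECASE,
    )
    return [m.group(0) for m in pattern.finditer(text)]
```

In this sketch, rank_terms would be fed text from the domain of interest plus some general background text, and find_definitions would then be run over retrieved text for each accepted term. Candidates from either step still need manual review, which is the "semi-automated" part.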
--
Thomas Waechter, Dipl.-Inf.
Bioinformatics Group (BIOTEC)
TU Dresden
Tatzberg 47-51
01307 Dresden, Germany
Phone: +49 (351) 463 40068
Email: waechter(at)biotec.tu-dresden.de