[Gramene] Semi-Automated Ontology Generation within OBO-Edit

Thomas Waechter thomas.waechter at biotec.tu-dresden.de
Wed Sep 15 05:00:07 EDT 2010


Dear OBO ontology developers,

We recently contributed an "Ontology Generation Tool" to OBO-Edit 
(available from version 2.1). This summer I presented the software, 
together with our evaluation on GO and MeSH, at ISMB in Boston (see the 
abstract and link below). Since then, some people have started using the 
service.

If you have already tried the tool, please send us feedback on your 
experience. If not, I would like to invite you to give it a try. The 
software helps you extract *terms from text* to extend your ontology 
faster, and also to *find definitions* for your terms, including a 
reference to the source URL. The tool can be found in the Tools section 
of OBO-Edit.

*To gear the development towards your needs, we are interested in 
answers to the following questions:*
- In which domain would you like to use the Ontology Generation Tool?
- Have you been able to generate relevant vocabulary (try filtering)?
- Have you been satisfied with the proposed definitions?
- Have you had technical issues (e.g. some people reported that they 
cannot connect from their location)?

_My contact email:_ thomas.waechter at biotec.tu-dresden.de



/Bioinformatics. 2010 June 15; 26(12): i88–i96. /

*Semi-Automated Ontology Generation within OBO-Edit*
/Thomas Wächter and Michael Schroeder
Biotechnology Center (BIOTEC), Technische Universität Dresden, 01062 
Dresden, Germany/
*ABSTRACT*

*Motivation:*
Ontologies and taxonomies have proven highly beneficial for biocuration. 
The Open Biomedical Ontology (OBO) Foundry alone lists over 90 
ontologies mainly built with OBO-Edit. Creating and maintaining such 
ontologies is a labour-intensive, difficult, manual process. Automating 
parts of it is of great importance for the further development of 
ontologies and for biocuration.

*Results:*
We have developed the /Dresden Ontology Generator for Directed Acyclic 
Graphs (DOG4DAG)/, a system which supports the creation and extension of 
OBO ontologies by semi-automatically generating terms, definitions and 
parent–child relations from text in PubMed, the web and PDF 
repositories. DOG4DAG is seamlessly integrated into OBO-Edit. It 
generates terms by identifying statistically significant noun phrases in 
text. For definitions and parent–child relations it employs 
pattern-based web searches. We systematically evaluate each generation 
step using manually validated benchmarks. The term generation leads to 
high-quality terms also found in manually created ontologies. Up to 78% 
of definitions are valid and up to 54% of child–ancestor relations can 
be retrieved. There is no other validated system that achieves 
comparable results.
By combining the prediction of high-quality terms, definitions and 
parent–child relations with the ontology editor OBO-Edit we contribute a 
thoroughly validated tool for all OBO ontology engineers.
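
To give a rough idea of what a pattern-based search for definitions can 
look like, here is a minimal sketch in Python. It is not the DOG4DAG 
implementation: the "<term> is a ..." pattern, the function name 
find_definitions and the sample sentences are assumptions chosen purely 
for illustration, and the sketch works on a local list of sentences 
rather than on web search results.

```python
import re


def find_definitions(term, sentences):
    """Return definition candidates for `term`.

    Hypothetical helper: it matches the single lexical pattern
    "<term> is/are a/an/the <text>" and returns <text> up to the
    first full stop or semicolon. The pattern is an assumption,
    not one taken from DOG4DAG.
    """
    pattern = re.compile(
        r"\b" + re.escape(term) + r"\s+(?:is|are)\s+(?:a|an|the)\s+(.+?)[.;]",
        re.IGNORECASE,
    )
    hits = []
    for sentence in sentences:
        match = pattern.search(sentence)
        if match:
            hits.append(match.group(1).strip())
    return hits


# Made-up example sentences standing in for retrieved text.
sentences = [
    "Apoptosis is a form of programmed cell death.",
    "We measured apoptosis in treated cells.",
]
candidates = find_definitions("apoptosis", sentences)
print(candidates)
```

Only the first sentence matches the pattern, so a curator would see one 
candidate to accept or reject. A real system would need many more 
patterns and a validation step, as described in the abstract.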

*Availability:*
DOG4DAG is available within OBO-Edit 2.1 at http://www.oboedit.org

*The full article is available open access on PubMed Central:*
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881373/


-- 
Thomas Waechter, Dipl.-Inf.
Bioinformatics Group (BIOTEC)
TU Dresden
Tatzberg 47-51
01307 Dresden, Germany

Phone: +49 (351) 463 40068
Email: waechter(at)biotec.tu-dresden.de
