How bio-ontologies enable open science

Nigam Shah, Mark Musen
Stanford Center for Biomedical Informatics Research, Stanford University

In recent years, computing has drastically changed our ability to produce and analyze data and to communicate scientific information. In the course of their work, biomedical investigators must integrate a growing amount of diverse information, and it is not possible for scientists to bring this information together without the aid of computers. Researchers have turned to ontologies, which allow representation of experimental results in a structured form, both to facilitate interoperability among databases by indexing them with standard terms and to create knowledge bases that store large amounts of knowledge in a structured manner. If the ontologies are well-designed, the resulting knowledge bases can be used to retrieve relevant facts, to organize and interpret disparate knowledge, and to evaluate hypotheses posited by scientists. Ontologies provide researchers with both the structure into which experimental results, facts and findings are placed and the words (or terms) used to populate that structure with instances. However, the task of creating structured content quickly runs into a curation bottleneck, and it has been shown that “controlled manual curation” will not scale [1].

We believe that for “open science” to really take off, both collaborative curation platforms and (semi-)automation of curation will be necessary. In the proposed presentation we will describe the current status and emerging trends in the use of bio-ontologies to facilitate open exchange and enable open science.

We will review existing collaborative curation projects such as BOWiki [2], ProteinPedia [3], GONuts [4], OpenWetWare [5] and WikiPathways [6] that attempt to crowd-source the creation and curation of scientific knowledge. We will review the role and promise of semantic web technologies in enabling efforts such as SWAN [7], Neurocommons and Science Commons [8, 9]. We will also review efforts from the industry side, such as the Concept Web by Knewco [10], that provide enhanced platforms for collaborative knowledge management, in the hope that revenue-generating services can be built around the core knowledge generated by the community. We will discuss the current state of the art and the key ingredients needed for such open, collaborative curation platforms to succeed: 1) proper use of bio-ontologies and 2) appropriate use of Natural Language Processing (NLP) in the curation workflow. To examine these issues in detail, we will describe the tools [11] developed by the National Center for Biomedical Ontology to enable the use of ontologies in collaborative platforms, and we will review the latest developments in combining ontologies with NLP techniques and in using the two together in open collaborative workflows [12, 13]. We will review highlights from the commercial sector, such as Semantic Hacker [14] and Semantic Portal [15], that use ontologies and NLP to enable such collaborative workflows in non-biological domains. We will conclude with a discussion of the merits of crowd-sourcing in sparse user groups and of whether social barriers will prove insurmountable for collaborative workflows in “specialty areas” such as biomedical research.

1 WA Baumgartner Jr. et al., Manual curation is not sufficient for annotation of genomic databases. Bioinformatics 2007 23(13):i41‐i48. Presented at ISMB 2007
2 http://www.bowiki.net
3 http://www.humanproteinpedia.org/
4 http://gowiki.tamu.edu/wiki/index.php/Main_Page
5 http://openwetware.org/wiki/Main_Page
6 http://www.wikipathways.org/index.php/WikiPathways
7 http://swan.mindinformatics.org/index.html
8 http://sciencecommons.org/
9 http://sciencecommons.org/projects/data/
10 http://www.knewco.com
11 http://bioontology.org/tools.html
12 This section of the talk will be based on a keynote at the ‘Bridging Ontologies and Text Mining (BOTM)’ event at EBI
13 http://collabrx.com/index.php
14 http://www.semantichacker.com
15 http://try.ontos.com/

