ROBOT-managed vocabulary: wine and pasta

FoodOn is adopting a new approach for some vocabulary subdomains, such as wines and pastas, in which terms are added and managed through spreadsheets. This way curators don’t need to know how to use the more advanced Protege ontology editor, and can jointly develop vocabulary so long as it fits a basic pattern. To do this, we use ROBOT spreadsheet templates, which allow terms to be added row by row, along with definitions, synonyms, author credits and even an image link.

Below is a snapshot of a Google Docs FoodOn Wine spreadsheet showing a few wines and the grape plants they derive from. The same template can be copied for other food types to support similar data entry and import file creation. The first row lists field labels, which ROBOT ignores. The second row lists special codes that ROBOT parses to learn what to do with each column’s contents. If a column has no code in the second row, ROBOT ignores that column; curation notes and some other columns are skipped this way. The spreadsheet’s first column, labeled “Ontology ID”, is reserved for new or existing term ids. The “parent class” column places the given term under a parent class. Definitions are quite often derived from Wikipedia, so a credit link is provided in the “definition source” column. The “derives from” column has a search-and-replace template that adds an axiom to the current row’s class definition, in this case connecting a wine with the grape plant it derives from. Further along are the synonym, image and authorship columns. Image links should point to reasonably sized images (200px to 500px in width and height).
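As a rough illustration of the two header rows, here is a minimal Python sketch. The column names, the example row and the helper `ignored_columns` are assumptions made for illustration and are not copied from the FoodOn wine spreadsheet. The second-row strings (`ID`, `LABEL`, `SC %`, `A IAO:0000115`) follow the general ROBOT template syntax, but the codes in the real spreadsheet may differ. The helper simply reports which columns have an empty second-row cell and would therefore be skipped.

```python
import csv
import io

# Hypothetical miniature template: row 1 = field labels, row 2 = ROBOT codes.
# Names, ids and codes are illustrative only.
TEMPLATE_TSV = (
    "Ontology ID\tlabel\tparent class\tdefinition\tcuration notes\tderives from\n"
    "ID\tLABEL\tSC %\tA IAO:0000115\t\tSC 'derives from' some %\n"
    "EX:0000001\texample wine\texample parent\tAn example.\tcheck source\texample grape plant\n"
)


def ignored_columns(tsv_text):
    """Return the field labels whose template-code cell (row 2) is empty."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = list(reader)
    labels, codes = rows[0], rows[1]
    return [label for label, code in zip(labels, codes) if not code.strip()]


print(ignored_columns(TEMPLATE_TSV))  # ['curation notes']
```

Running a check like this before the import can confirm that a notes column is not accidentally picked up, or that a data column has not lost its code.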

A FoodOn curator can save (or cut and paste) this content to a file and then run the robot program to turn it into an OWL ontology include file ready for incorporation into FoodOn. The robot command needs some extra parameters to look up or transform the ids of terms referenced in the columns being converted into the OWL file. The --input parameter brings in .owl entities that are referenced in axioms. The --prefix parameter is used to expand abbreviated namespace URLs. Assuming this is run in FoodOn’s src/ontology/imports/robot folder, the output file gets written to the parent imports/ directory. A FoodOn curator then manually sets up the import in Protege (in the Active Ontology tab, Ontology Imports section).

robot template --template wine.tsv \
  --input "../ro_import.owl" \
  --prefix "FOODON:" \
  --prefix "RO:" \
  --prefix "oboInOwl:" \
  --prefix "schema:" \
  --ontology-iri "" \
  --output wine.owl

One complexity is getting IDs for new terms: these must be assigned by an experienced ontology curator, who adds them manually in the spreadsheet, so that they don’t accidentally duplicate ids created through Protege editor maintenance of the ontology. Above, the FOODON:00002585 and FOODON:00002640 ids are bolded just to highlight the lowest and highest new ids a curator has assigned in the spreadsheet.
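A check for accidental id reuse could be sketched as follows. This is only an illustration under assumptions: the function `find_id_conflicts` is hypothetical, the `EX:` ids are made up, and it assumes the curator has already collected the new ids from the spreadsheet and the ids already present in the ontology by some other means.

```python
def find_id_conflicts(new_ids, existing_ids):
    """Return (ids repeated within the spreadsheet, ids already used in the ontology)."""
    seen = set()
    repeated = []
    for term_id in new_ids:
        # Record each id the first time it shows up a second time.
        if term_id in seen and term_id not in repeated:
            repeated.append(term_id)
        seen.add(term_id)
    already_used = sorted(seen & set(existing_ids))
    return repeated, already_used


# Made-up ids for illustration only.
new_ids = ["EX:0000001", "EX:0000002", "EX:0000002", "EX:0000003"]
existing_ids = {"EX:0000003", "EX:0000010"}

repeated, already_used = find_id_conflicts(new_ids, existing_ids)
print(repeated)      # ['EX:0000002']
print(already_used)  # ['EX:0000003']
```

Both lists should be empty before the template is converted; anything reported needs a fresh id from the curator who manages the id range.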

The result is an OWL file one can examine on its own, as shown below, where the ids of parent terms and relations external to the output OWL file are shown without labels.

The file can be imported and viewed in the context of FoodOn, where terms are placed under their parent classes and the related labels appear. Also note that in context, terms inherit axioms all the way up to root terms; here “Cabernet Sauvignon wine” inherits “fermented grape beverage” and other axioms.

This approach can still be finicky. For example, the term labels referenced in the “derives from” search-and-replace column have to be capitalized exactly as they are in the labels column. Adding synonyms for a language other than the ones listed requires adding a new column. Adding new axiom patterns requires an advanced curator’s help.
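The capitalization pitfall can be caught before running the conversion. The following is a minimal sketch, assuming the labels and the “derives from” references have already been read from the spreadsheet into two lists; the function `case_mismatches` and the example labels are hypothetical.

```python
def case_mismatches(references, labels):
    """Return (reference, intended label) pairs that match only when case is ignored."""
    exact = set(labels)
    by_folded = {label.casefold(): label for label in labels}
    mismatches = []
    for ref in references:
        if ref in exact:
            continue  # exact match, nothing to fix
        candidate = by_folded.get(ref.casefold())
        if candidate is not None:
            mismatches.append((ref, candidate))
    return mismatches


# Made-up labels and references for illustration only.
labels = ["Cabernet Sauvignon grape plant", "Merlot grape plant"]
references = ["cabernet sauvignon grape plant", "Merlot grape plant"]

print(case_mismatches(references, labels))
# [('cabernet sauvignon grape plant', 'Cabernet Sauvignon grape plant')]
```

References that match no label at all, even ignoring case, are not reported here; they may point at terms defined outside the spreadsheet and would need a separate check.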