Curation Rules

Curation rules for how a given food source, food product, or related information artifact should be described are still being formulated. Here are the rules we have so far:

Term grammar

  • Terms are generally labeled in the singular to fit the Aristotelian definition form “An X is a (member of class) Y that has feature/property/differentia Z”. This facilitates an axiomatic interpretation of the English definition. Plurals may appear in FoodOn as synonyms, mainly inherited from LanguaL. Recognition of plural terms in text mining is expected to be handled mainly by external resources like LexMapr and Wiktionary.
    • The ISO 25964 standard for thesauri notes the practice in some languages of using plurals to indicate count nouns, so that process terms can be described alongside them under different labels, as in the difference between “paintings” and “painting”. Ontologies don’t need this labeling approach if terms are positioned under material entities and processes, because a term’s upper-level class distinguishes the sense of the word. (Potentially a “[term label] sensu [variation semantic]” form could be used to individuate the labels.) (Thanks to Thomas Baker for raising this issue.)
  • We do not depend on a term label being text-matchable. It can be as long as necessary to differentiate it from its siblings.

Term deprecation

All FoodOn terms that are deprecated remain in FoodOn, with an owl:deprecated=true axiom and a pointer, via the ‘term replaced by’ relation, to the ID of the new term. FoodOn plant and animal part terms which were inherited from LanguaL are slowly being phased out in favour of UBERON animal and PO plant anatomy terms.
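
As a rough illustration, the two annotations on a deprecated term could be written out as N-Triples lines like this. This is only a sketch: the helper name `deprecation_triples` and both example IDs are hypothetical placeholders, and it assumes the ‘term replaced by’ relation is the IAO property IAO_0100001 commonly used across OBO ontologies.

```python
OBO = "http://purl.obolibrary.org/obo/"
OWL_DEPRECATED = "http://www.w3.org/2002/07/owl#deprecated"
XSD_BOOLEAN = "http://www.w3.org/2001/XMLSchema#boolean"
# Assumption: 'term replaced by' is IAO_0100001, as in other OBO ontologies.
TERM_REPLACED_BY = OBO + "IAO_0100001"


def deprecation_triples(old_id, new_id):
    """Return N-Triples lines that deprecate old_id and point to new_id."""
    old_iri = f"<{OBO}{old_id}>"
    new_iri = f"<{OBO}{new_id}>"
    return [
        f'{old_iri} <{OWL_DEPRECATED}> "true"^^<{XSD_BOOLEAN}> .',
        f"{old_iri} <{TERM_REPLACED_BY}> {new_iri} .",
    ]


# Placeholder IDs, not real FoodOn or UBERON terms.
for line in deprecation_triples("FOODON_EXAMPLE_OLD", "UBERON_EXAMPLE_NEW"):
    print(line)
```

The old term keeps its own IRI in both lines, so existing references to it still resolve while the second line leads to the replacement.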

The “food source” (FOODON_03411564) facet

This provides general references to organisms from which food products are derived. This branch currently covers food for humans and domesticated animals. It excludes references to parts of organisms, to organism products like milk, and to generic terms like “egg” which stretch across species; see “part of plant or animal” below for these. It doesn’t commit to “wholeness”, so an organism missing a part is allowed. (However, what minimally constitutes an organism is not defined either.)

This vocabulary is one step removed from any taxonomic nomenclature. A given food organism may have a ‘has taxonomic identifier’ that points to an NCBITaxon genus or species, but we allow for other taxonomic references too, or none at all. Some scientific names and groups occur in food source, but usually common names are used. This approach avoids three problems. First, the taxonomic classification of organisms keeps shifting. Second, ontology-driven taxonomies like NCBITaxon don’t have terms for all organisms (NCBITaxon is dedicated to covering organisms that have had some kind of sequencing done on them). Third, there are cases where use of a term shifts over time so that other organisms are referenced (e.g. due to availability issues); referencing both past and present taxa can be done by adding to the ‘has taxonomic identifier’ axiom.

  • References to animal as an organism, including classes that differentiate animal by breed, sexual anatomy (including castration), and age, as for example “hen: a female adult chicken”. In this way animals can be referenced without necessarily implying a food context, as when involved in veterinary processes.
  • References to plant organisms, usually with the word “plant” or “tree” at the end so that in display, e.g. in the context of search engine results, it is clear that the reference is to the whole organism (roots, bark, leaves, etc.) or any part of it. If we only listed “apple” under food source, that would imply the fruit alone, preventing other parts of the organism from being involved as food products, e.g. apple blossom flowers in tea.
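
The loose coupling between a food source and its taxonomic references described above can be pictured as a record holding zero or more taxon identifiers. The following is a minimal sketch only: the `FoodSource` class, its fields, and the identifiers are hypothetical placeholders, not FoodOn's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class FoodSource:
    """A food source term identified by a common-name label."""
    label: str
    # Zero or more taxon references; an empty list is allowed.
    taxon_ids: list = field(default_factory=list)

    def add_taxon(self, taxon_id):
        """Add a taxon reference without dropping earlier ones."""
        if taxon_id not in self.taxon_ids:
            self.taxon_ids.append(taxon_id)


# Placeholder identifiers, not real NCBITaxon IDs.
source = FoodSource("example fish")
source.add_taxon("NCBITaxon:PAST_EXAMPLE")
source.add_taxon("NCBITaxon:PRESENT_EXAMPLE")
print(source.taxon_ids)
```

Because references are only ever added, a term whose usage has shifted to another organism keeps both its past and present taxa.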

The anatomical “part of plant or animal” (FOODON_03420116) facet

Parts (e.g. limb) or material outputs (e.g. milk) of organisms, and generic terms like “egg” that stretch across species, are considered primarily in the anatomical “part of plant or animal” facet, which “foodon food product” facet items can reference in conjunction with food sources to describe food products.

The “foodon food product” (FOODON_00001002) facet

Classes in this hierarchy make reference to the fact that they are ‘derived from’, ‘develop from part of’, or are ‘produced by’ food source organisms, or that they ‘has ingredient’ or ‘has part’ some other food product. Invariably a food product involves some amount of food processing, even if just basic harvesting, e.g. from the sea or plucking from a fruit tree.

Note that we also call “food products” foods that a person may harvest from the wild for consumption without economic consideration.

If a food source can be traded or given essentially as a whole, like a chicken, then it will also eventually be listed under the food product category (this has not yet been done in FoodOn). If specifications about how intact or whole a food source organism is are required, then these descriptors will need to be captured in an axiom that includes reference to the food source, as is done with terms like “chicken (whole, raw)”.

FoodOn’s position about term labels: OBOFoundry has a policy to spell out terms in such a way that commas and bracketed expressions are not used, so, e.g. “whole raw chicken” rather than “chicken (whole, raw)”. We find this problematic with long lists of similar subclasses of a term (for example chicken meat food products) and especially when trying to identify terms of interest in search results. For this reason our current approach is to have the rdfs:label contain the main organism name and anatomical part/qualifier, followed by other state or process descriptors (e.g. cooked, frozen, raw). Textually, this leads to semantically similar terms being ordered together, with differentiae clearly visible in brackets. (The alternative is that words like “frozen” and “raw” lead the labels and control sectioning of search results.) However, to accommodate the OBOFoundry position on rdfs:label, we are considering switching our current rdfs:label content over to a second “functional label”.

  • A food product label can include both an organism and anatomical part in its name, e.g. “chicken back”. Currently we are mulling whether to allow the word “whole” in the label or to insist on it being the lead item in the bracketed expression: the difference between “whole chicken (…)” and “chicken (whole, …)”.
  • A food product having processes applied to it beyond those that were required to isolate an anatomical part likely should have those processes listed in brackets following the label, e.g. “chicken (frozen)”, “chicken (fried)”.
  • The order of listing processes should echo the order they were applied to the food, so “poultry (deboned, canned)” rather than “poultry (canned, deboned)”.
  • One should use a verb’s past participle to describe the output of the process, e.g. a “freezing process” outputs “frozen” food. We are reducing variations on these terms, for example “boned” and “boneless” should usually be normalized to “deboned”. However, sometimes “boneless” is used to state the condition of a food rather than implying a separate deboning process, so a judgement call may be required.
  • The food product’s axiomatization should reflect the conjunction of the parent class with the food process term(s) in its label, e.g. “‘chicken (frozen)’ equivalentTo ‘chicken meat food product’ and ‘food (frozen)’”.
  • The food product suffix:
    • If it might be ambiguous whether a term is a food category having various closely related subclasses, or an “atomic” food item, we include the suffix “food product” to mark the broader sense, e.g. “barley flour food product” vs “barley flour”.
    • An atomic food, in other words one that has no FoodOn children and likely won’t in the future, doesn’t need a “food product” suffix in its label unless this helps to disambiguate it from search results coming from other ontologies in a lookup service.
    • Items with the “food product” suffix can have a synonym that is the suffix-less phrase.
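
The labeling rules above (bracketed descriptors in the order applied, normalized past participles, a matching parent-class conjunction, and the optional suffix) can be sketched in a few lines. This is an illustration under stated assumptions, not FoodOn tooling: the function names and the `NORMALIZED` table are hypothetical, and the axiom is produced only as a display string.

```python
# Default normalization of process descriptors. "boneless" sometimes states
# a condition rather than a deboning process, so a curator should review it.
NORMALIZED = {"boned": "deboned", "boneless": "deboned"}


def normalize(process):
    """Map a descriptor variant to its preferred past participle."""
    return NORMALIZED.get(process, process)


def build_label(base, processes=()):
    """Return 'base (p1, p2)' with processes in the order they were applied."""
    descriptors = [normalize(p) for p in processes]
    if not descriptors:
        return base
    return f"{base} ({', '.join(descriptors)})"


def equivalent_to(parent, processes):
    """Return the parent-class conjunction as a Manchester-style string."""
    classes = [parent] + [build_label("food", [p]) for p in processes]
    return " and ".join(f"'{c}'" for c in classes)


def with_suffix(label, is_category):
    """Append 'food product' to labels that name a category of foods."""
    suffix = " food product"
    if is_category and not label.endswith(suffix):
        return label + suffix
    return label


print(build_label("poultry", ["boned", "canned"]))
print(equivalent_to("chicken meat food product", ["frozen"]))
print(with_suffix("barley flour", is_category=True))
```

The caller supplies processes in the order they were applied and the helper never re-sorts them, which is what keeps “poultry (deboned, canned)” distinct from “poultry (canned, deboned)”.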