Today, in science, and especially in information technology, ontology is a hot topic. In short, an Ontology is the specification of a concept. The idea has grown almost to the point of becoming a buzzword for academics and professionals in the computer science field, and yet a big part of the industry ignores the subject for lack of friendly documentation that describes it in basic terms, explains why it is important and shows how it can change computing for the better.

The word appeared for the first time in the Oxford English Dictionary in 1989. Because it’s a relatively new word for English-speaking folks, the word itself gets in the way of the story it tells. In reality, the idea behind it has been around in society for quite some time.

The philosophical study of existence, “what is real and what is not”, has been around for centuries. We can find evidence of the questioning of nature and reality all the way back to the Pre-Socratic era, with the philosopher Parmenides of Elea. Parmenides is best known for a poem he wrote called “On Nature” (read the poem here). The poem describes two different perspectives of the same reality, but it zeroes in on one powerful idea: no matter how different the appearances of that ‘that it is’ may be (he calls this ‘the way of opinion’), the truth about ‘it’ does not change (‘the way of truth’). In a nutshell, this is one of the earliest recorded attempts to formalize the realization that existential things don’t change regardless of the lexicon or language used to describe them. Many more thinkers developed their own theses on how to define reality. Plato also made notable contributions to the field of Ontology, and later his disciple Aristotle made a dent in the universe with his works Categories and Metaphysics.

Why is this important today? Because all the natural science fields that describe elements of the real world already have their own ontologies, but this is not the case for Computer Science and Information Technology. Physics, Chemistry and Biology all have a very clear lexicon or dictionary that describes their scientific domains. But we have yet to define an Ontology that describes the world we represent through software. When building information systems, different authors, developers and companies declare the same entity ‘that is’ not as the entity itself, but as one of its appearances. What we end up with is a lot of unnecessary repetition, inconsistent data structures for the same entities, and unnecessary computation spent mapping between appearances that represent the same entity. The call for a Global Ontology has been a topic among academics for a long time, and it is in many ways considered the holy grail of the information sciences.

Mathematics, as the universal language, describes abstractions and uses logical reasoning to determine the truthfulness of an assumption. It does so through specialized notation, like numbers and shapes, that has no tangible form. No author, developer, company or human being on the planet will argue about what the number ‘3’ represents. Mathematics provides the foundation for the Ontologies of any other domain definable by humanity. I couldn’t put it any better than Galileo Galilei:

“The universe cannot be read until we have learned the language and become familiar with the characters in which it is written. It is written in mathematical language, and the letters are triangles, circles and other geometrical figures, without which means it is humanly impossible to comprehend a single word. Without these, one is wandering about in a dark labyrinth.”

Going back to Ontology in the Information Sciences, some questions remain unanswered:

  • What are the fundamental objects or structures we ought to define to represent the tangible and abstract concepts from a specific domain?
  • How can we successfully share and relate objects from different domain ontologies?
  • How can we define ontology structures in a way that makes them effective and usable in operational digital communications?

The biggest challenge in information science with respect to the use of ontologies is establishing a baseline agreement in the industry to use a common lexicon and vocabulary, consistent with the theory specified by a particular domain ontology. A Global Ontology would be defined as the aggregation of all domain ontologies, where a domain ontology represents the abstractions and tangible objects of a part of the world or a specific knowledge domain.

Competition begs to be mentioned in these lines. The mammoths of the software industry have shown more interest in sticking to their guns and using their own data structures as a differentiator, even when they share the same ontological domain with their competitors. For example, Google Maps, Bing Maps and MapQuest all offer services in the GIS domain, yet they’ve decided not to share the same vocabulary and lexicon to name their GIS objects. Think about this for a minute: if these companies decided to share a global GIS schema, their only differentiator would be the quality of their service… but that would make it too easy for developers to switch sides, so each one puts its own twist on the vocabulary. The result is arbitrary mappings for “State”, “Province”, “StateProvince” and “Municipality”, each with multiple data types, sizes and formats, ultimately adding layers of unnecessary complexity to such a simple concept as that ‘that it is’.
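To make the cost of those mappings a bit more concrete, here is a minimal Python sketch of the kind of glue code developers end up writing. Everything in it is an assumption for illustration: the provider names, the record shapes and the shared key "region" are hypothetical, and none of them reflect the actual response formats of any real mapping service.

```python
# Hypothetical field aliases: each provider names the same entity differently.
# These names are illustrative assumptions, not any vendor's real schema.
PROVIDER_FIELD_ALIASES = {
    "provider_a": {"State": "region"},
    "provider_b": {"StateProvince": "region"},
    "provider_c": {"Province": "region"},
}


def to_shared_vocabulary(provider: str, record: dict) -> dict:
    """Rename provider-specific keys to the shared key names.

    Keys without an alias are passed through unchanged.
    """
    aliases = PROVIDER_FIELD_ALIASES[provider]
    return {aliases.get(key, key): value for key, value in record.items()}


# Three "appearances" of the same entity, one per hypothetical provider.
records = [
    ("provider_a", {"State": "Florida", "City": "Miami"}),
    ("provider_b", {"StateProvince": "Florida", "City": "Miami"}),
    ("provider_c", {"Province": "Florida", "City": "Miami"}),
]

normalized = [to_shared_vocabulary(provider, record) for provider, record in records]

# After mapping, every record describes the entity with the same key.
all_equal = all(item == normalized[0] for item in normalized)
print(all_equal)
```

With a shared vocabulary, the alias table and the mapping function would not be needed at all; every consumer would read the same key directly.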

This is already too long a post, so I’ll cut it short. Maybe in future posts I’ll cover ontology as it relates more closely to engineering, and what you, as an architect, computer scientist, programmer, etc., can do to make your work much more pleasant and rewarding. My very good friend Leonardo Lezcano has published many works on healthcare domain ontologies, with research and papers covering the Semantic Web and Semantic Interoperability. You can find some of his works HERE and HERE.

This is a somewhat challenging topic to explain, and for the reader to say “I get it” the first time around. I’ll feel good if I get an “I kinda got it” from someone after reading this :)