Darkness is relative, I guess

January 17th, 2007

I was rather surprised to read a blog post by Jim Hendler entitled “The Dark Side of the Semantic Web”. I was not surprised by the content (which already resonated with me), but rather by the combination of author and content. Prof. Hendler has been one of the driving forces first of DAML and later of the “AI side” of the Semantic Web. He is also one of the co-authors (with TimBL and Ora Lassila) of the infamous Scientific American article that introduced the Semantic Web to the world.

In this post, he uses “dark side” as in “dark side of the moon”, not “dark side of the force”, to indicate “the other side of the semantic web that they can’t see from the place they live”. This place is “Planet AI”, the place where glaciations happen on a decade basis and tectonic plates move like icebergs. Surviving there for long is a real challenge and indicates serious foraging skills. It’s a planet where the natives are so inward looking that their own intelligence feels artificial to outsiders.

Which is why it is, to me, rather surprising to read:

It is the realization that the REST approach to the world is a wonderful way to use RDF and it is empowered by the emerging standards of SPARQL, GRDDL, RDF/A and the like. In short, it is the Semantic Web vision of Tim, before Ora and I polluted it with all this ontology stuff, coming real! And the good news for folks like me is that some little pieces of OWL turn out to be important to making this work (OWL Ultra Lite?) …

There are surprising statements in there:

  1. “all this ontology stuff” has polluted the original vision (TimBL’s) of the Semantic Web. Finally somebody said it out loud! Yes folks, hear hear: ‘ontology stuff pollutes the semweb’. I’m going to make t-shirts and print it. I might have found the business model for the semweb ;-)
  2. “some little pieces of OWL”. Yes, correct, only a few little pieces of OWL are required: mostly owl:sameAs and owl:equivalentProperty, plus rdfs:subClassOf and rdfs:subPropertyOf from RDF Schema. That’s it.
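
As a rough sketch of how little machinery those few pieces need, here is a naive fixed-point loop over triples. It assumes triples are plain Python tuples, every `ex:`, `other:` and `bible:` name is invented for illustration, and the quadratic inner loop is there for readability, not for scale:

```python
# Vocabulary terms named in the list above; everything else is hypothetical.
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"
EQUIV_PROP = "owl:equivalentProperty"
SAME_AS = "owl:sameAs"
TYPE = "rdf:type"


def infer(triples):
    """Apply a handful of RDFS/OWL rules until no new triple appears."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p == EQUIV_PROP:
                # Equivalent properties are sub-properties of each other.
                new.add((s, SUBPROP, o))
                new.add((o, SUBPROP, s))
            elif p == SAME_AS:
                # owl:sameAs is symmetric.
                new.add((o, SAME_AS, s))
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                if p2 == SUBPROP and p == s2:
                    # (s p o), (p subPropertyOf q) => (s q o)
                    new.add((s, o2, o))
                if p == TYPE and p2 == SUBCLASS and o == s2:
                    # (s type C), (C subClassOf D) => (s type D)
                    new.add((s, TYPE, o2))
                if p2 == SAME_AS and s == s2:
                    # (s p o), (s sameAs t) => (t p o)
                    new.add((o2, p, o))
                if p2 == SAME_AS and o == s2:
                    # (s p o), (o sameAs t) => (s p t)
                    new.add((s, p, o2))
        if new <= facts:
            return facts
        facts |= new


# Hypothetical data: two vocabularies describing the same person.
data = [
    ("ex:cain", "ex:sonOf", "ex:adam"),
    ("ex:sonOf", SUBPROP, "ex:childOf"),
    ("ex:childOf", EQUIV_PROP, "other:hasParent"),
    ("ex:cain", TYPE, "ex:Farmer"),
    ("ex:Farmer", SUBCLASS, "ex:Person"),
    ("ex:cain", SAME_AS, "bible:Cain"),
]
closure = infer(data)
print(sorted(t for t in closure if t not in set(data)))
```

Nothing here is description logic: it is four copy-and-propagate rules, which is roughly the point.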

It would be a wonderful gift to humanity if the “logicians” stopped trying to teach machines to think on my behalf with syllogistic symbols and just helped us create operators for such symbols that are actually useful for real work.

I’ve stated it before, but not in detail, so let me restate: the idea that ‘reasoning’ must equal ‘description logics reasoning’ is not only completely bizarre to me, it gets funny too. The “Bible Wiki” is powered by Semantic MediaWiki, a wonderful MediaWiki plugin that lets people add some simple markup to their wikitext to indicate additional semantics, which the system then translates into RDF. The nail in the DL coffin, for me, is the page about Cain, son of Adam. Scroll to the very bottom to read the RDF-ized genealogy of Cain: son of Adam and Eve and parent of Enoch.

The question I would love to ask a DL reasoner is not some tricky Gödelian set-of-sets-like question but a rather simple one: who is Enoch’s mother?
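
To make the question concrete, the genealogy can be written down as three statements and queried with a plain lookup. This is only a sketch: the predicate names `son_of` and `parent_of` and the separate `female` set are my assumptions, not what Semantic MediaWiki actually emits.

```python
# Hypothetical rendering of the page's genealogy as (subject, predicate, object).
genealogy = {
    ("Cain", "son_of", "Adam"),
    ("Cain", "son_of", "Eve"),
    ("Cain", "parent_of", "Enoch"),
}
female = {"Eve"}  # assumption: gender is recorded somewhere else


def parents_of(person, triples):
    """Collect parents from both directions of the parent/child statements."""
    found = set()
    for s, p, o in triples:
        if p == "son_of" and s == person:
            found.add(o)
        elif p == "parent_of" and o == person:
            found.add(s)
    return found


def mother_of(person, triples, women):
    """Return the set of known mothers, or None when the data does not say."""
    candidates = parents_of(person, triples) & women
    return candidates or None


print(mother_of("Cain", genealogy, female))
print(mother_of("Enoch", genealogy, female))
```

The lookup finds Eve for Cain and comes back empty for Enoch. The data has a hole in it, and the program just reports the hole.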

I remember asking that question in 8th grade during Catholic Sunday school. The teacher’s answer was to kick me out of class to learn to respect the scriptures (which didn’t work, btw). The Planet AI people’s answer would be to say that it’s impossible to model the knowledge in the most printed book in the history of humanity because it’s inconsistent.

Don’t you feel there’s something wrong with this picture?

Sure sure, Captain Open World might be invoked to save the day, but this example shows a few things that are worth noting because I think they have been forgotten on Planet AI: contradictions and inconsistencies are an abundant and principal part of human knowledge, not a mistake that needs fixing.

Here it’s worth mentioning an inspiring quote from Prof. Carole Goble during her keynote at ISWC05 [words are mine as I didn’t take notes but just remembered the inspiration]: “99% of what humanity ever thought to be true turned out not to be”. You can sweep all this under the ‘open world’ assumption and find refuge in a Socratic “I know I don’t know” state, but to me that’s just avoiding the issue: if we really want to model human knowledge, inconsistencies, contradictions and disagreements have to be first-class citizens, not some damn temporary entropy we need to design our systems to deal with (or, worse, ignore).

So this is why I applaud Jim Hendler’s adaptation and his pointing out to fellow AI-ers that there might be something a lot more interesting happening if they just looked around.

The semantic web is really just data integration at a global scale. Some of this data might end up being consistent, detailed and small enough to perform symbolic reasoning on, but even if this is the case, that would be such a small, expensive and fragile island of knowledge that it would have the same impact on the world as calculus had on deciding to invade Iraq.

The biggest problem we face right now is finding a way to ‘link’ information from different sources that can scale to hundreds of millions of statements (and hundreds of thousands of equivalences). Equivalences and subclasses are the only things that we have ever needed from OWL and RDFS: we want to ‘connect’ dots that otherwise would be unconnected. We want to suggest that people use whatever ontology pleases them and then think about mapping it against existing ones later. This is easier to bootstrap than forcing them to agree on a conceptualization before they even know how to start!
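
One way this kind of linking might look in code is a union-find structure over identifiers, which lets equivalences arrive in any order, long after the data itself was published. This is a sketch of a standard technique under my own assumptions, not a description of any existing system. The `a:` and `b:` identifiers and the `link` helper are made up for the example.

```python
class Equivalences:
    """Union-find over identifiers, so 'same as' links can be added in any order."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        """Return the canonical representative for x (x itself if never linked)."""
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the walked path at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        """Record that a and b denote the same thing; a's root becomes canonical."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def link(statements, mappings):
    """Rewrite every term of every statement to its canonical identifier."""
    eq = Equivalences()
    for a, b in mappings:
        eq.union(a, b)
    return {(eq.find(s), eq.find(p), eq.find(o)) for s, p, o in statements}


# Two hypothetical sources, published independently with their own vocabularies.
source_a = [("a:Cain", "a:parentOf", "a:Enoch")]
source_b = [("b:cain", "b:sonOf", "b:eve")]

# The mapping is written later, by whoever needs the two sources connected.
mappings = [("a:Cain", "b:cain")]

linked = link(source_a + source_b, mappings)
print(sorted(linked))
```

Neither source had to change or agree on anything up front. The mapping is just more data, added when somebody needs it.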

Personally, I’m betting hard on this “data first, mapping later” approach over the “ontology first” one, which means that we must have software systems capable of coping with the computational complexities that this approach entails. It’s in this spirit that I welcome Prof. Hendler’s blog (and paper).