Focusing on what works

January 2nd, 2004

Jon Udell offers a proposition for the new year: focus on what works. As much as I agree with this, I can’t stay quiet about what he writes just before that:

It’s clear that that the future of the Unix-style pipeline lies with Web services

I was one of the first to see the value of pipelines for XML processing and wrote Cocoon to make it happen, so I think I know a little about XML processing pipelines. But there is something that the people advocating web services miss entirely:

  • protocol interoperability is hard but can be achieved
  • data interoperability is harder, but some standardization (real or de facto) creates power-law clusters where things can work, because some information can be taken for granted as implicit in that communication context

but there is a third piece of the puzzle that almost everybody forgets:

  • metadata interoperability

The webservice people hide all details under the schema carpet, but that’s where the big problem is: understanding what the heck your schema is about!

Why? Because things evolve, change, adapt, move, break. If you want to design a system that works, you must take failure into consideration. I’m not talking about technical failure; I’m talking about the impact of the equivalent of “broken links” on web services. While the web has only links to break, imagine how many tiny contracts a web of web services creates. And all those tiny contracts are shaky, noisy, and subject to social entropy.

The more web services you pipe together, the bigger the problem gets: you are not simply adding up the number of contracts, you are multiplying them! And when you pipeline web services that, potentially, provide their own service by pipelining other services, you go exponential!
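To put rough numbers on that growth, here is a minimal sketch. It rests on assumptions that are not in the argument above: every service pipelines the same number of other services (`fan_out`), nesting goes `depth` levels deep, and each contract independently stays valid with the same probability. Real dependency graphs are messier, so treat this as an illustration only.

```python
def transitive_contracts(fan_out: int, depth: int) -> int:
    """Count the contracts a caller implicitly depends on when every
    service pipelines `fan_out` other services, `depth` levels deep."""
    return sum(fan_out ** level for level in range(1, depth + 1))


def pipeline_reliability(contract_stability: float, contracts: int) -> float:
    """Probability that every contract still holds, assuming (simplistically)
    that contracts break independently of each other."""
    return contract_stability ** contracts


# Assumed numbers: 3 downstream services each, contracts 99% stable.
for depth in (1, 2, 3):
    n = transitive_contracts(fan_out=3, depth=depth)
    print(depth, n, round(pipeline_reliability(0.99, n), 3))
```

With these made-up numbers the caller goes from 3 contracts at one level to 39 at three levels, and the chance that all of them still hold drops accordingly.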

Web Services are tons of syntax/protocol sugar on top of something that was possible decades ago (with the introduction of computer networks), but never happened. Most people believe it didn’t happen because it was too hard to understand, to write, to make work, or to sell as a concept.

Marketing, protocol and syntax sugar aside, web services are RPC. They might be an understandable (for humans!) way to do RPC, an easy way to write RPC clients, an easily sellable RPC concept with the billions thrown in by software corps, but then what? OK, you call a function and you get something back. Now what? What if that function name changes over time? What if the return value is still a number but it means a totally different thing? What if that web service is now considered obsolete? How do you tell?
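The “still a number, but it means something else” case is easy to sketch. Everything below is hypothetical: `get_quote_v1` and `get_quote_v2` stand in for two releases of the same remote function, and the switch from dollars to cents is an invented example of a silent change in meaning.

```python
def get_quote_v1(symbol: str) -> float:
    """Hypothetical provider, release 1: price in dollars."""
    return 12.50


def get_quote_v2(symbol: str) -> float:
    """Same signature, later release: price now in cents (assumed change)."""
    return 1250.0


def satisfies_schema(value: object) -> bool:
    """All the schema promises: the result is a number."""
    return isinstance(value, float)


def position_value(quote: float, shares: int) -> float:
    """Client code written against release 1; it silently assumes dollars."""
    return quote * shares


for get_quote in (get_quote_v1, get_quote_v2):
    quote = get_quote("ACME")
    print(satisfies_schema(quote), position_value(quote, shares=10))
```

Both releases pass the type check, yet the client’s result is off by a factor of 100 with the second one, and nothing in the signature tells it so.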

Any decent software architect knows that modularity and polymorphism are great things, but they come at a price. Separation of concerns works only when you are either in control of both sides of the contract between the concerns, or you can trust the stability of that contract. If the contract changes unexpectedly, your entire modularity falls apart.

If you connect to an external data source and you can trust the solidity of the contract upon which you establish this communication, things are perfectly fine. Banks have operated like this since the ’60s; the only problem they had was EDI, so compared to that nightmare even dead-simple XML-RPC is a big win for them. Add SOAP and web services marketing crap so that it’s easy for you to sell, and if that works for you, great. Some of the biggest Cocoon installations were marketed like that.

But those who understand technology and still talk about a “network operating system” really need to get back to reality. And the reality is that pipelines of web services will not work because they will be too fragile, or the system needed to make them solid will be so complex that it will be nothing else but the semantic web.

If you think that screen-scraping HTML was bad, well, why do you think that data-scraping a web service response would be any better? Because you have a schema that tells you that the function “getQuote()” returns a String? Do you think that’s enough for you to write something on top of that? Oh yeah, complex schema types, add your own to the soup, be my guest.

You will start understanding what metadata is for.

And guess what, you’ll need a description framework for your web services to understand what’s going on. And then, since even descriptions change, you’ll need an ontology language to indicate things like equality between descriptions or operational equivalence.
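As a rough sketch of what such a description layer has to do: the services, field names and `ex:` concept identifiers below are all invented for illustration, and the equivalence list is a plain-Python stand-in for what an ontology language would state declaratively. Each service describes what its fields mean, and a separate set of equivalence statements lets a client find the field it needs even after the description has changed.

```python
# Hypothetical descriptions: field name -> concept the field denotes.
DESCRIPTIONS = {
    "quotes-v1": {"price": "ex:lastTradePriceUSD"},
    "quotes-v2": {"last": "ex:lastPriceInDollars"},
}

# Hypothetical equivalence statements between concepts.
EQUIVALENT = [("ex:lastTradePriceUSD", "ex:lastPriceInDollars")]


def build_canonical(equivalences):
    """Return a function mapping each concept to one representative of its
    equivalence class (a small union-find over the stated pairs)."""
    parent = {}

    def find(concept):
        parent.setdefault(concept, concept)
        while parent[concept] != concept:
            parent[concept] = parent[parent[concept]]  # path halving
            concept = parent[concept]
        return concept

    for a, b in equivalences:
        parent[find(a)] = find(b)
    return find


def field_for(service, wanted_concept, find):
    """Name of the field in `service` that carries `wanted_concept`, if any."""
    for field, concept in DESCRIPTIONS[service].items():
        if find(concept) == find(wanted_concept):
            return field
    return None


find = build_canonical(EQUIVALENT)
print(field_for("quotes-v1", "ex:lastTradePriceUSD", find))
print(field_for("quotes-v2", "ex:lastTradePriceUSD", find))
```

The code is trivial; the hard part is that somebody has to write, publish and agree on the descriptions and the equivalences.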

So you find yourself with exactly the same problem as the semantic web: semantic interoperability of metadata.

In those communication cases (SOAP-style, HTTP-style, TCP/IP socket style, snail mail, pigeon, whatever) where you can make that metadata implicit or consider it part of the shared communication context, great, that’ll work, but such rigidity will severely limit scalability in terms of how many players can take part in that communication.

But if you can’t do that, or want to increase scalability by reducing rigidity, you’ll need to create an entire metadata framework that will look so similar to the entire RDF/OWL stack that you’ll cry tears of blood thinking back on how much you used to consider the semantic web a bloated return of AI BS.

So, in the end, I think that point-to-point web services will work, as they do today and have been doing for a long while. Which communication protocol and programming language they use to make it work doesn’t matter at all; it’s all just marketing.

The rest will have a complexity similar to that of the RDF/RDF Schema/OWL stack (and if you look at BPEL4WS you start to understand what I’m talking about).

Whether that will work or not remains to be seen, but drawing equalities is good: when you solve one problem, you solve the other, and when you can’t solve one, you know you can’t solve the other.