Drivers vs. Enablers
June 5th, 2010
I’ve often heard people say that the web exists because of “view source”.
“View source”, if you don’t know what I mean, is the ability of web browsers to show you the source HTML of the page you are currently browsing. If you ask around, pretty much everybody who worked on the web early on will tell you that they learned HTML by example, by viewing the source of other people’s pages. Tricks and techniques were found by somebody, applied, and spread quickly.
There is wide and general consensus that ‘view source’ was a very instrumental tool to easily propagate knowledge and simplify adopting the web as a platform, yet its role is often confused.
“view source” was an enabler, a catalyst; something that makes it easier for a reaction or a process to take place and thus increases rate, effectiveness, adoption, or whatever metric you want to use.
But it is misleading to confuse “view source” for a driver: something that makes it beneficial and sustainable for the process to take place. The principal driver for the web was the ability for people to publish something to the entire world with dramatically reduced startup costs and virtually zero marginal costs. “view source” made it easier and reduced such startup costs, but had nothing to do with lowering marginal costs and certainly had very little to do with the intrinsic world-wide publishing features of the web.
You might think that the current HTML5 vs. Flash diatribe is what’s sparking these considerations, but it’s not: it’s something that Prof. David Karger wrote about my previous post (we deeply enjoy these blog-based conversations). He suggests that while my approach of looking for sustainable models for open data contributions is good and worthwhile, a more effective strategy may be to convince the tool builders to add a “view source” for data; once that is in place, we wouldn’t have to care, as the data would be revealed simply by people using the tools.
It’s easy to see the appeal of such a strategy: the coordination costs are greatly reduced, as you have to talk to and convince a much smaller population, all composed of people who already care about surfacing data and see potential benefits for further adoption of their toolsets.
On the other hand, it feels to me that it’s confusing enablers for drivers.
When engineering adoption strategies, I normally ask “why” first and “how” second: taking for granted that because you have a driver, everybody else must share it or have a similar one, can easily lead you astray. The question of motive, of “what’s in it for me?”, might feel materialistic, un-intellectual and limiting, but an understandable and predictable reward is the basis for behavioral sustainability.
David is basing his thoughts around Exhibit and I assume he considers the driver to be the tool itself and its usefulness: it can take your data and present it neatly and interactively without you having to do much work or bother your IT administrators to set up and maintain server-side software. That’s appealing, that’s valuable and that’s easy to explain.
The enabler for the network effect is that “cut/paste data” icon that people can click to obtain the underlying data representation of the model... and do whatever they want with it.
But here is where things start to get interesting when you consider drivers and enablers separately: ‘view source’ was a great enabler for the web because it was useful for other people’s adoption but didn’t impact your own adoption drivers. The fact that others had access to the HTML code of your pages didn’t hurt you in any way, mostly because the complexity of the system was locked on your end, in your servers, and your domain name was something you controlled and they couldn’t replicate. What a visitor had access to was a thin surface of a much more complicated system running on somebody else’s servers. It was convenient to you and your developers to have that view-source and the fact that others benefited from it posed no threat to you.
This is dramatically different in the Exhibit situation (or in many other open data scenarios): not only can you take the data with you, you can take the entire exhibit. Some people are not bothered by this fact, but you can assume that most people get a weird feeling when they think that others can just take their entire work and run with it.
This need of ‘preventing people from benefitting from your work without you benefitting from theirs’ is precisely the leverage used by reciprocal copyright licenses (the GPL first, the CC-share-alike later) to promote themselves, but there is nothing in the Exhibit adoption model that addresses this issue explicitly.
If your business is to tell or synthesize stories that emerge from piles of data (journalists, historians, researchers, politicians, teachers, curators, analysts, etc.), we need to think about a contribution ecosystem where sharing your data benefits you in a way that is obvious for you to understand (and to explain to your boss!). Or, as David suggests, a ‘view source’-style model where the individualistic driver is clear and obvious and the collaborative enabler is transparent, meaning that it doesn’t require people to do extra work and is not perceived as a threat to their individualistic driver.
The thing is: with Exhibit, or with any other system that makes the entire data available (this includes Freebase), the immediate perception that people have is that making their entire dataset available to others is clearly benefiting others and doesn’t seem to offer clear benefits for them (which was the central issue of my previous post).
Sure, you can try to guilt-trip them into releasing their data (cultural pressure) or use reciprocal licensing models (legal pressure), but really, the driver that works best is when people want to collaborate with one another (or are not bothered by others doing it on their own work) because they immediately perceive value in doing so.
Both Exhibit and Gridworks were designed with the explicit goal to be at first drivers for individual adoption (so that you have a social platform to work with) and potential enablers for collaborative action later (so that you can experiment with trying to build these network effects); but a critical condition for the collaborative enabler is that it must not reduce the benefit of individual adoption or otherwise it will reduce its ability to drive network effects.
Think for a second about a web where a ‘view source’ command in a browser pulled the entire codebase out of the website you’re visiting: do you really think it would have survived this long? Remember how heated the debate was when the GPLv3 wanted to include reciprocal constraints even for software that was just executed and not redistributed (which would have impacted all web sites and web services which are now exempt)?
It is incredibly valuable to be inspired by systems and strategies that worked in the past and by the dynamics that made them sustainable… but we must do so by appreciating both the similarities and the differences if we want to be successful in replicating their impact.
Counterintuitively, what might be required to bootstrap a more sustainable open data ecosystem is not being more open but less: building tools that focus first on protecting individual investments, and then on fostering selective disclosure and collaboration around the disclosed parts.
We sure can (and did) engineer systems that act as Trojan horses for openness (Exhibit is one obvious example), but they have failed so far to create sustainable network effects because, I think, we have not yet identified the dynamics that entice stable and sustainable collaborative models around data sharing.