On Version Control Architectures and the Fear of Displacing Innovation
August 19th, 2007
This morning, I stumbled across Linus Torvalds’ talk on git, the distributed version control system that he created for the Linux kernel. I watched it mostly out of curiosity: I had never heard Linus speak, and I wanted to hear him explain his notoriously heretical views on version control.
His bashing of existing, centralized version control systems is so in your face it’s actually funny and not at all insulting, but I found myself curiously defensive while hearing him bash the CVS/SVN model of a centralized repository.
I started using CVS in 1997 when I first committed a change to the Apache JServ project. I sent over the Java class that I modified attached to an email message. The JServ developers replied “send a patch instead” and I replied “what’s that?”. I was 22 and I was (back then) a Windows user. But in order to see my patch back in JServ (and avoid having to maintain my own personal branch over time), I had to find, install and learn things like diff, patch and, ultimately, cvs.
Let me tell you: I hated it. With a passion. I hated any command line thing, really, so I found WinCVS and used that… but the whole process felt awkward and counterintuitive.
Fast forward 10 years to the present, thousands of commits later on dozens of different projects (open source and not) and a transition from CVS to its repainted and remodeled cousin Subversion, and I find myself defensive hearing somebody express the exact same feelings that I had 10 years ago.
To be honest, it’s hard to separate my dislike for CVS from the feeling of it being complicated and unnecessary for somebody that always worked alone on his software or with a couple of close friends.
But as I watched Linus’ talk, my defensiveness slowly morphed into fear: the fear of defending a particular technology not on its merits, but just because I’m used to it and comfortable with its quirks and limitations. The brain equivalent of muscle memory, so to speak.
I’ve always thought of myself as better and more open minded than that, but what if I’m just as bad? Or what if I’m growing lazier and more rigid as I get older?
That fear, if anything, made me try git. I downloaded the source code, built it and ran it. I didn’t even read the manual, I just launched it and I was already impressed: one of the supported commands is “bisect” and the help says “Find the change that introduced a bug by binary search”. I have no idea (yet) of how it works, but I had wanted that feature, somewhere, somehow, for years and I even thought of implementing it in Gump3 (but never got around to doing it). That’s enough to get me to try it for real in the near future.
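For the curious, the idea in that help text can be sketched in a few lines of Python. This is only a toy under my own assumptions, not how git implements it: the history is a straight line ordered oldest to newest, the newest commit is known to be bad, and once the bug appears it stays in every later commit. The names `first_bad` and `is_bad` are made up for illustration.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes a linear history (oldest first), that the last commit is bad,
    and that every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # the culprit is at mid or earlier
        else:
            lo = mid + 1    # the culprit is strictly after mid
    return commits[lo]


# Hypothetical history of 100 commits where the bug arrived in "c73".
history = [f"c{i}" for i in range(1, 101)]
tested = []


def is_bad(commit: str) -> bool:
    tested.append(commit)            # record how many builds we had to test
    return int(commit[1:]) >= 73


culprit = first_bad(history, is_bad)
print(culprit, "found after testing", len(tested), "commits")
```

The point is the cost: each test halves the range that can still hide the culprit, so a hundred commits need about seven tests rather than a hundred.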
Linus’ main idea in favor of distributed version control is that, as humans, we are naturally social beings and we deal with trust and respect in a social way: one way is to create a gate and select who comes in and out (the centralized, walled garden, grant-commit-access way of doing things), another is to let everybody have their own personal branch, where nobody is fixed in the center and anybody can pull from anybody else in the network (the center becomes really a centroid, the center of gravity of the network and can move more easily over time).
I had dismissed such ideas as heretical in the past because I thought that centralization was the only means of maintaining brand control and providing a consistent legal framework upon which to operate.
But I now think I was just scared of seeing the pillars of the social structure of the projects I was involved in crumble under the pressure of a displacing innovation.
I listened to Linus today explain how branching is vital, centralization is evil, innovation should be neither walled nor restricted, and trust is a social graph percolation problem… these are all things I truly believed in myself and that I actively evangelized in other parts of my life and career… but I never realized that what I was using, and had chosen to use for my own personal projects and my day job, is in truth designed around an old, centralized, vertical and monolithic vision.
Linus claims that he won’t directly trust more than 15 developers. I’ve been there myself, as technical leader of a big open source project, and I completely agree with him: the development core cannot possibly be larger than that because the coordination costs alone will make it impractical to do any work at all. If left free to organize themselves, such flat development networks tend, naturally, to organize hierarchically, if only to limit the broadcasting range and optimize the execution of tasks. So, if Linus trusts 15 people and each one of them trusts 15 people, you don’t need a lot of layers to cover a huge number of developers working together on a particular project.
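To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes an idealized tree in which everybody trusts exactly 15 people and nobody is trusted by two different people, which real projects certainly violate; `reach` and `layers_needed` are names I made up for the illustration.

```python
def reach(fanout: int, layers: int) -> int:
    """Developers covered below the root after `layers` levels of trust.

    Idealized: every person trusts exactly `fanout` others, with no overlap.
    """
    return sum(fanout ** k for k in range(1, layers + 1))


def layers_needed(fanout: int, developers: int) -> int:
    """Smallest number of trust layers that covers `developers` people."""
    layers = 0
    covered = 0
    while covered < developers:
        layers += 1
        covered += fanout ** layers
    return layers


for depth in range(1, 5):
    print(depth, "layers cover", reach(15, depth), "developers")
```

Under those assumptions two layers already cover 240 developers, three cover 3,615, and four cover more than fifty thousand, so a project with ten thousand contributors would need only four layers below Linus.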
Linus also claims that without walled gardens and commit access, there is much less need for politics around granting commit access. Well, truth is, he’s the undisputed dictator, his branch is, in fact, the central branch, and he’s the one deciding what gets in and what doesn’t. That, I believe, is what reduces politics, not any version control architecture.
At the same time, it is true that even oligarchic projects tend to get less and less prone to give away “commit access like candy” (as some apache people tend to criticize my suggested practices of having a very low barrier of entry for committership) the older and bigger the project (and the social network) becomes.
Fact is, Linux is a project that has never experienced a leadership transition and according to my own rule of thumb for long-term community health, Linux scores pretty low because of that.
At the same time, if Linus were to get hit by a bus tomorrow, somebody else (obviously one of those 15 that he trusts directly) would step in and become the new social centroid (note: centroid != center).
Apache is proud of not having an overall ‘leader’, but, let’s face it: there might not be anybody that is “Mr. Apache” but every project always has a community centroid… it might not be the same person at all times, as it is for the Linux kernel, but it actually helps to have one. I have been one in many different projects and I felt that the community liked the concept of a benevolent and well respected ruler who would step in and help in those rare cases where the community couldn’t self-organize consensus around an issue. I’ve always felt the need to find my successor before stepping down and moving on to something else, to avoid letting the community destabilize itself during the search for that person (who, I think, in most projects is the PMC chair).
I don’t see Apache changing version control system anytime soon (if only, ehm, because several of its board members are also Subversion developers), but that’s not really my immediate concern: what I want to understand is whether Linus’ implicit claim that centralized and branch-phobic version control systems are suboptimal for gathering user innovation is true or not.
Because, truth is, I’ve never actually expressed that myself, but watching his talk made me crystallize that thought in my head: how many potential contributors did we miss because we didn’t give them commit access soon enough or because they didn’t feel comfortable revealing their patches back to us? How much less work could I have done if we had used a different version control system? How much more innovative could our projects be if we didn’t use a centralized design and a walled garden?
In the infamous “Linux is obsolete” debate, what Tanenbaum didn’t understand was that Linus’ secret sauce was not going to be technological (a side he was right to criticize, as Linux wasn’t exactly shining architecturally) but social (which didn’t even enter his radar).
I truly wonder if, in sticking with the consolidated model for version control, we’re not making the same mistake.