Timeplot and Software that Gives you Questions, not Answers
December 16th, 2007
One of the things that surprised Jon (who has been reading this blog for years) was how little I write directly about the results of my research job here.
What’s funny is that I thought that was really the only thing I wrote about, but he’s right: what I talk about are the questions, not the answers.
In a sense, my research works by wandering around in a huge solution space, stumbling upon questions, then finding answers, testing them in software, releasing that software to the public and observing what happens. Creating the software and observing the public reaction uncovers other questions and the cycle repeats.
Jon was astonished (probably because that’s one of the things he does best) to realize that in that cycle, once we have a particular answer to a question implemented in software, we don’t explain it explicitly: we kinda let it out in the wild, announce it, and assume that people will figure out on their own why we did what we did. And it’s true: sometimes the thoughts that brought us to make a particular decision are obvious to us but not at all to others.
I’ve always been a bottom-up, grassroots type of software builder (and a hands-off, “build it and they will come” kinda guy) but I do agree that the world is full of examples where that’s not enough, and I could make better use of my resources by dedicating a little time to explaining some of the software that we came up with in the last few years.
And what better way to start than by eating my own dog food?
You have probably come across Timeline already (it’s used in many places) but Timeplot is less well known, mostly because it doesn’t (yet) work on IE (which means that if you’re reading this blog with IE, you should at least try to load it with Firefox to get the full experience of the archives).
While Timeline is more focused on visualizing individual events and periods at different time scales (and does an outstanding and very innovative job at it!), Timeplot is focused specifically on uncovering causal dependencies between events and trends. Also, since Timeplot’s code is built on top of Timeline’s, it uses the exact same XML data file for events.
Here is what the timeplot of my blog looks like in Firefox today:
The vertical gray lines are the blog posts, while the plots are trends, color coded by type and plotted on different scales (and without value references) to make you focus on trends rather than absolute values (which I find distracting when you want causality to emerge).
The data can be ‘mashed up’ because both events and trends include time-based information and can therefore be correlated at that level. In the interactive version, you can mouse over a particular blog post and uncover which one it is. That is the first important aspect of Timeplot that is truly original: it lets you easily spot and correlate events and trends.
In the blog archive timeplot case, it’s reasonable to assume that it’s the blog posts that influence the trends and, with that in mind, it’s intuitive to uncover ‘key posts’ that have a larger impact.
The easiest to spot are the two recent blog posts that caused relatively big traffic spikes: one about color theory and one about Google’s Android Dalvik VM. The first was picked up by the del.icio.us crowd and made it to the ‘popular’ page; the second was picked up by the blogosphere and made it onto techmeme’s first page.
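If you wanted to go beyond eyeballing the plot, here is a rough sketch of how one might put a number on a post’s ‘impact’. To be clear, this is not something Timeplot does (it only draws the data), and the function name, the window size and the sample numbers are all made up for illustration: it simply compares the average of a trend in the few days after a post with the average in the few days before it.

```python
def rank_posts_by_impact(post_days, daily_values, window=3):
    """Rank posts by the change in a trend around the day they were published.

    post_days:    dict mapping a post title to its day index in daily_values
    daily_values: list of daily measurements (e.g. page views per day)
    window:       number of days to average before and after (an assumption)
    """
    impacts = []
    for title, day in post_days.items():
        before = daily_values[max(0, day - window):day]
        after = daily_values[day:day + window]
        if not before or not after:
            continue  # not enough data around this post to compare
        change = sum(after) / len(after) - sum(before) / len(before)
        impacts.append((title, change))
    # Largest positive change first
    return sorted(impacts, key=lambda item: item[1], reverse=True)


# Hypothetical data: a spike right after "post A", nothing after "post B"
daily_views = [10, 10, 10, 50, 40, 30, 10, 10, 12, 11]
posts = {"post A": 3, "post B": 7}
ranked = rank_posts_by_impact(posts, daily_views)
```

A ranking like this only tells you where to look; it says nothing about why the trend moved, which is exactly the kind of question the plot leaves open.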
Another interesting fact that the timeplot uncovered is that during the 6+ months in 2006 when I didn’t blog, my readership shrank somewhat, both in volume of feed requests and in number of unique subscribers (which here means ‘how many different IP addresses asked for the blog feed each day’). That is only natural, as people eventually prune their feed subscriptions, or news readers alter their update frequency based on the blog’s own update frequency. See if you can spot at least two more examples of this happening in the plot.
But the nice part about such timeplots is that they give you questions, not answers: they are great inspiration for further inspection and analysis.
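For the curious, the two readership numbers mentioned above are cheap to compute. The sketch below is only an illustration of the definition (requests per day, and distinct IP addresses per day), not the script that actually feeds my timeplot; the function name and the sample requests are invented, and it assumes the feed requests have already been parsed into (day, IP) pairs.

```python
from collections import defaultdict
from datetime import date


def feed_stats_per_day(requests):
    """Return {day: (request_count, unique_ip_count)} for feed requests.

    requests: iterable of (day, ip) pairs, one per request for the feed.
    """
    counts = defaultdict(int)
    ips_by_day = defaultdict(set)
    for day, ip in requests:
        counts[day] += 1          # volume of feed requests
        ips_by_day[day].add(ip)   # a set keeps each IP only once per day
    return {day: (counts[day], len(ips_by_day[day])) for day in sorted(counts)}


# Hypothetical sample: one reader polls twice on the first day
sample_requests = [
    (date(2007, 12, 1), "10.0.0.1"),
    (date(2007, 12, 1), "10.0.0.1"),
    (date(2007, 12, 1), "10.0.0.2"),
    (date(2007, 12, 2), "10.0.0.1"),
]
stats = feed_stats_per_day(sample_requests)
```

Keep in mind that counting IP addresses is only a proxy for counting people: several readers behind one proxy look like one subscriber, and one reader on a changing address looks like many.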
For example, the plot shows a local minimum for readership around July/August 2007, which seems to imply that the post around that time helped change the trend. The post in question is about procrastination. The interesting thing about that post is that, while it received pretty good feedback privately, there is no indication that it got any traction on the web (for example, del.icio.us has only 7 bookmarks on it), so where did these people come from? I still don’t have an answer for that.
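A local minimum like that one is easy to see in a plot, but it can also be found mechanically. Here is a minimal sketch, assuming the readership trend is available as a plain list of daily values; the smoothing window and the sample numbers are arbitrary choices of mine, and none of this is part of Timeplot itself.

```python
def moving_average(values, window):
    """Smooth a series with a centered moving average (shorter at the edges)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def local_minima(values):
    """Return the indexes where the series stops falling and starts rising."""
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i] < values[i - 1] and values[i] <= values[i + 1]
    ]


# Hypothetical readership numbers: falling, then recovering
readers = [5, 4, 3, 2, 3, 4, 5]
turning_points = local_minima(moving_average(readers, window=3))
```

Smoothing first matters on real data, where day-to-day noise would otherwise produce a ‘minimum’ every other day. Once you have the index of the turning point, you can look up which posts were published around it.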
Let me close by quoting Pablo Picasso:
Computers are useless. They can only give you answers.
That mentality resonates with me, but I apply it to software: there is a class of programs designed to give you answers and another, unfortunately still much smaller, designed to give you both answers and further questions.
I want to focus mainly on enlarging this last one, and Timeplot is an example of it that I’m proud of.