The MIT Lectures Search Engine
September 15th, 2006
One of the drawbacks of working for MIT is that your “wow factor” gets spoiled: it’s really hard to make me go “wow” when I see something new in computers and information technology, since I’ve been surrounded by so much cool stuff and so many cool people.
A few months ago, David pointed me to an MIT internal prototype of a web-based search engine for video lectures. He had been collaborating on the UI part, and it made my “wow meter” go off the scale. He had to pressure me to hold off blogging about it back then because they were still working on the system and didn’t want to deal with the side effects of early exposure, but today I received the ‘ok, go’ and I’m hitting the web press.
So, here it is: the MIT video lectures search engine.
NOTE: To enjoy the full experience, you need to have Real Player installed. I know, it’s a bummer, but believe me, it’s worth it.
I remember seeing something like this (a video search engine for the German parliament) at the Fraunhofer Institute in Germany a few years ago, but they were, ehm, ‘cheating’ by correcting the speech recognizers with human-generated transcripts, and they needed a special client to access the database. This one is completely automated. No human intervention. And, best of all, it works with your browser, has a simple Google-like textbox, shows you the contexts around what you searched for in a clickable timeline view, and does karaoke-like word highlighting while the part of the video you selected is playing.
So, in other words, you feed hundreds of hours of video to a computer and, hours of crunching later, you get this.
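To make the timeline and the karaoke highlighting a bit more concrete, here is a toy sketch in Python. It is not the MIT system's code and says nothing about how the recognizer itself works; it only assumes that the recognizer's output is a list of words, each with a start and end time in seconds. The names `Word`, `search`, and `word_at` are made up for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the beginning of the video
    end: float


def search(transcript, query, context=2):
    """Return (start_time, snippet) for every occurrence of a one-word query."""
    q = query.lower()
    hits = []
    for i, w in enumerate(transcript):
        if w.text.lower() == q:
            lo = max(0, i - context)
            hi = min(len(transcript), i + context + 1)
            snippet = " ".join(x.text for x in transcript[lo:hi])
            hits.append((w.start, snippet))
    return hits


def word_at(transcript, t):
    """Index of the word being spoken at time t, or None during silence."""
    starts = [w.start for w in transcript]
    i = bisect_right(starts, t) - 1  # last word that started at or before t
    if i >= 0 and t < transcript[i].end:
        return i
    return None


# Hypothetical recognizer output for a few seconds of a lecture.
transcript = [
    Word("speech", 0.0, 0.4),
    Word("recognition", 0.4, 1.1),
    Word("is", 1.1, 1.3),
    Word("hard", 1.3, 1.8),
    Word("but", 2.0, 2.2),
    Word("speech", 2.2, 2.6),
    Word("is", 2.6, 2.8),
    Word("fun", 2.8, 3.2),
]

hits = search(transcript, "speech", context=1)
```

Each hit's start time is what a clickable timeline would seek to, and calling `word_at` with the player's current position tells the page which word to highlight.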
But what’s really important to understand is not just how much more useful and more exposed the hundreds of MIT video lectures now are with a service like this (and how much professors will want to appear there too if they aren’t there already!), but also how useful it could be as a spoken-English training platform: as a non-native English speaker, I can only begin to imagine what this means for millions of students around the world! MIT is widely known for projects such as OpenCourseWare and One Laptop per Child, but this lecture search engine goes right up there for usefulness to humanity, and not only as a kick-ass demo of decades of speech recognition research.
This paper explains part of the system (they are still in the process of submitting more papers about it).
I don’t have to be reminded to feel proud about working for MIT, but there is something that deeply touches my engineering soul in seeing decades of research finally condensing into something that truly delivers on the promise in an easy to use, easy to understand (and delightfully addictive, I might add) way.