
The “Trail of Candy” Design Pattern

October 9th, 2009

In the delightfully titled book “Don’t Shoot the Dog”, behavioral biologist Karen Pryor tells about her experience training killer whales at Sea World. One thing I realized while reading that book (suggested to me by Ben) was, to my surprise, how much I inherently thought of training and teaching as a negative reinforcement process.

Let me explain: a negative reinforcement process is why, for example, cows stay in a field surrounded by a moderately electrified fence. They don’t know they should stay contained in one place (it’s not natural for them to do so), but when they try to get out they get an unpleasant experience from touching the fence. They try repeatedly, until they learn to associate the fence (or anything that looks like it) with the pain and avoid it. As a result, they stay contained.

The same thing happens when you yell at a kid or punish him, when we put people in jail for committing crimes, when we honk at somebody that cut us off in the car, etc.

One issue with a human training a killer whale is that negative reinforcement is not practical. There is little the human can do to harm or punish the killer whale without risking serious injury to herself or the animal. It’s also not efficient: one of the most important principles in training (or brain plasticity in general) is that the more often cause and effect are repeated and the more closely they are correlated in time, the faster and more durable the learning. When the only signal you have is pain or displeasure, the subject naturally tends to withdraw from the training, or to maximize its well-being by doing whatever avoids the displeasure.

If kids are not paying attention in school, punishing them will not make them pay more attention. It will just teach them to fake attention better, because that’s the easiest thing they can do to stop the punishment.

Negative vs. Positive Reinforcement

Unfortunately, it’s easier to hurt than to please. This is basically the reason why negative reinforcement is used a lot more than positive reinforcement even when it’s flat out inefficient and ineffective. This fact is so deeply ingrained into our everyday lives that we naturally tend to think of it first.

For example, say people leave their trash around the park instead of throwing it in the trash can. You can fine them or you can do this:

Which one do you think will be more effective?

Jumping thru hoops… literally

Now suppose you want to convince a killer whale to jump out of the water and thru a hoop. You can’t just yell your orders and tell it to drop and give you 20 until it does it. What do you do?

First, you have to understand that jumping out of the water is fun and natural for many marine animals, and that they also love fish. So a first thing to do might be to attach some fish to a pole and make the animal jump for it, higher and higher. At some point, you can introduce the hoop and keep the fish. Then maybe you can try to keep the hoop and the pole but no fish, and give the reward after the animal has landed in the water. You can also make a sound or light signal at the same time the animal performs its action, so that the signal gets naturally merged into the training and associated with that behavior.

Ultimately, it doesn’t matter how. What matters is the thinking process that the trainer goes through to solve this fundamental behavioral puzzle: how do I break down this long path into a series of steps that can be rewarded individually and climbed incrementally?

The point is that there are many different styles of positive reinforcement and not all of them are equally effective. Suppose you want to convince somebody to hike up a hill. You can either tell them all about how beautiful the view is from the top, how easy the climb is, downplay the amount of time and effort it will take, underline how gratified they’ll feel after reaching the top, etc. Or you can divide the hike into small sections, each with a small but easily recognizable positive reward. If you’ve ever hiked with a child, you know which one of the two works.

Counterintuitively enough, it’s not the size of the reward that matters most, but the perceived effort of an action weighed against its perceived potential benefit.

This means that placing a huge and unseen reward at a distance is less effective in shaping behavior than fragmenting that reward into smaller pieces, each one easily seen and each with an effort/reward ratio that is easily understood.

Which is to say it’s more effective to lay candy along a trail every few yards than to place a cake every mile.

Trails of Candy vs. Trails of Cakes

Once you learn about the greater effectiveness of positive reinforcement, and once you learn to look at those reinforcements and rewards through the eyes of those being reinforced (and against their natural tendencies) rather than through your own, it is a fun and worthy exercise to reconsider certain user interaction designs and rethink them so that they feel more like trails of candy than trails of cakes.

One example of this is Excel vs. Access or, as I wrote previously, ‘data first’ vs. ‘structure first’.

Suppose you don’t know much about computers and you are given one with Microsoft Office installed on it. Then you’re given the task of saving some data that you need for your work. You can use any of the software tools on your computer. Which one would you use?

Office has a very coherently recognizable set of icons and you’ve heard of it from other colleagues, so you try the tools one by one.

Word would work, but it’s hard to lay things out horizontally (you have to learn how to get a table, resize it, etc.) and you quickly realize that. PowerPoint is even worse: it doesn’t seem like the tool for the job. Excel already has a table and it’s as big as you want it. Access needs you to tell it up front what your data is going to look like, but you don’t know that until you’re done, so that’s not going to work.

Word and Excel both feel like trails of candy, but Excel’s candy is closer to you and you can already see many more pieces, while on the Word path you see only one small candy close to you and then nothing else down the road but a steep hill. PowerPoint is a trail with no candy. Access is a trail with a huge cake, but to someone who doesn’t know or care, or doesn’t have time to explore trails at random looking for cakes, it just looks like another trail with no candy.

What is a Candy?

A candy is anything that rewards the user and that the user can already recognize as rewarding. It’s very important to realize that our knowing that something will reward the user is not enough: the user has to know that getting there will be rewarding.

A piece of candy shaped like a rock won’t do, even if we know it’s made with the same stuff.

A rock shaped like candy won’t do either. It might get the user to the first step, but it will break her trust, force her to look at all the other visible candy as potential fakes, and break the engagement.

Another interesting thing to realize is that even when users realize they are being ‘behaviorally shaped’ by this trail of candy, only a few will consider this unfair or insulting enough to break the engagement. At the same time, it’s beneficial to state your intent at the beginning so as to avoid surprises.

Unfortunately, knowing what a user wants, needs and can recognize is not an easy task: understanding that you have to break up the user experience into incrementally and constantly rewarding pieces might be a great starting point, but it’s only the beginning of your journey.

The important thing to realize is that, just like with killer whales, you are in no position to use negative reinforcement. You also can’t simply tell users what to do, because mostly they won’t understand you: they don’t share the same terminology, the same level of attention, or the same understanding of the potential reward.

The best user experiences are those that start out as immediately rewarding to the individual user, continue to be rewarding as the user keeps participating, and generate a positive network effect with the participation of others (the web itself is a giant application of this, but so are successful web sites like Amazon, eBay, Flickr, del.icio.us, Facebook, etc.).

The worst user experiences are those that start by telling people how wonderful and amazing their reward will be if they agree to painstakingly walk this trail for miles, and that keep telling them the reward is just around the corner. The semantic web has always reminded me of that.

Permalink | Posted in Article

Freebase and IMDB data: a Correction

October 5th, 2009

While unscripted conversations are a wonderful way to achieve an engaging and fresh dialog, especially with a capable and intuitive host/instigator like Jon, they also present the danger of mis-characterizing certain aspects of the conversation itself.

It didn’t occur to me during my last interview, but several people inside Metaweb told me that I made it sound like Freebase loaded IMDB data directly, while what I should have specified is that we loaded the IMDB ‘identifiers’ along with our movie data. This allows, for example, somebody who looks up a movie in Freebase to find the same movie in IMDB, but the data in Freebase and the data in IMDB come from different sources.

I apologize for not having been specific enough about which part of the IMDB data (just their item identifiers) we loaded into Freebase.

My Second Appearance on IT Conversations

September 26th, 2009

Jon Udell interviews me a second time for his Interviews with Innovators series over at IT Conversations.

