Sunday, October 22, 2006

The Netflix Prize: Research Project as Product

Several people have asked what I think of the Netflix Prize, a $1 million contest to improve Netflix’s movie recommendations by 10%. For those expecting an “analyze the analytics” posting like Pandora vs. Last.fm, I’m going to throw you a curveball. I think the more interesting story here is about product marketing—and the Netflix Prize itself is the product.

Productizing a Research Project

From Netflix’s perspective, better recommendations mean higher profits. For those interested in the economics, Chris Anderson (author of The Long Tail) explains them.

But how do you make better recommendations? The usual approach would be to put some researchers on an internal project. Netflix had been doing that for years, but their researchers apparently hit the point of diminishing returns.

Then somebody had the idea of throwing open the problem to the rest of the world, saying something like, “There must be thousands of people with the skills, motivation, and computing hardware to tackle this problem. We just need them to work for us.”

There are indeed many experts in fields like statistical computing, machine learning, and artificial intelligence. There are even more dabblers who know just enough to be dangerous and could come up with answers the pros would never consider. The more people involved, the better the chance of success.

So from Netflix’s perspective, the problem evolved from creating a better algorithm to creating something, the Netflix Prize, that would in turn produce a better algorithm for Netflix. In essence, they built the Netflix Prize as a product: The “customers” were the prospective researchers; the challenge was to design and market something that would get these customers to participate.

Getting Attention: Eyes on the Prize

The $1 million prize is the most obvious feature. Having noticed the success (and now proliferation) of science-based prizes like the Ansari X Prize, Netflix no doubt liked the combination of free publicity such a prize generates along with the competitive dynamic that real money brings. The press and blogosphere were duly abuzz.

Making It Real: Heavy-Duty Data

Netflix offered up a huge, real-world data set of people’s movie ratings. This alone would have been enough to get lots of smart people playing with the data. Most aspiring data miners—who don’t happen to work at Netflix, Amazon.com, or other data-rich players—rarely if ever get a crack at data like this.
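For a sense of what “data like this” means in practice: the released training set is on the order of 100 million ratings, shipped as one text file per movie, where each file opens with the movie ID and a colon, followed by CustomerID,Rating,Date rows. Here’s a minimal parsing sketch in Python; the file layout follows the published data set, but the function name and in-memory structure are my own:

```python
import os
from collections import defaultdict

def load_ratings(training_dir):
    """Parse Netflix Prize training files into movie_id -> [(customer_id, rating)].

    Each file begins with a header line like "123:" (the movie ID),
    followed by lines of the form "CustomerID,Rating,Date".
    """
    ratings = defaultdict(list)
    for name in sorted(os.listdir(training_dir)):
        with open(os.path.join(training_dir, name)) as f:
            movie_id = int(f.readline().strip().rstrip(":"))
            for line in f:
                customer_id, rating, _date = line.strip().split(",")
                ratings[movie_id].append((int(customer_id), int(rating)))
    return ratings
```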

That said, Netflix slightly tainted this feature by “perturbing” an unspecified amount of the data “to prevent certain inferences being drawn about the Netflix customer base.” It’s not a big issue, because there is a built-in limit to how much Netflix can mess with the data: if the perturbed data differs from the original in important ways, Netflix faces a nightmare scenario in which the winning algorithm exploits those differences and doesn’t work on the real data, leaving Netflix to pay $1 million for an algorithm it can’t use. As a result, we can safely assume the perturbed data is faithful to the original.

Talking Right: The Web Site

The Netflix Prize has its own Web site with a voice that is well tuned to its “customers,” the researcher types. The Rules and FAQ pages are not written in legalese, academic jargon, or the various marketing dialects that no one speaks but that nevertheless appear in written form everywhere. The text is smart but informal, technical where necessary but not gratuitously so. To whoever wrote it: I salute you.

The Web site also includes a simple but effective leaderboard and community forum.

Giving Back: Winner Tells the World

Anticipating that most prospective researchers would immediately look for a catch—like what happens to the intellectual property you submit—Netflix summarizes the relevant terms in plain English: “To win...you must share your method with (and non-exclusively license it to) Netflix, and you must describe to the world how you did it and why it works.” I expected something far more dire. Besides adding a touch of idealism to the proceedings, the bit about telling the world speaks directly to the likeliest contestants: academics and corporate researchers, who have strong professional incentives to publish their work.

Selling the Goal: It’s Only 10%

“10% improvement” is a clever packaging of the goal, because it’s a lot harder than it sounds. According to the FAQ, Netflix’s own algorithm—the one you’re trying to beat by 10%—is only 10% better than “if you just predicted the average rating for each movie.” In other words, a naive approach works pretty well. And while significant distance remains between Netflix’s algorithm and perfection, anything close to perfection is impossible because people do not rate consistently, either with one another or with themselves over time. Thus, a major unknown is how much headroom exists before one hits the wall of rating noise. Yet it is known that achieving the first 10% over the naive approach was far from trivial.
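For the record, the contest’s yardstick is root mean squared error (RMSE), and the naive approach above amounts to predicting each movie’s mean rating. A minimal sketch, reusing the hypothetical load_ratings structure from earlier (the names here are mine, not Netflix’s):

```python
import math

def movie_mean_baseline(ratings):
    """The naive approach: predict each movie's average training rating."""
    return {movie: sum(r for _, r in pairs) / len(pairs)
            for movie, pairs in ratings.items()}

def rmse(predicted, actual):
    """Root mean squared error, the contest's scoring metric."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual))
                     / len(actual))
```

Leaderboard standings are then reported as the percentage reduction in RMSE relative to Cinematch, Netflix’s deployed algorithm.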

The Results So Far

Three weeks into the competition, more than 10,000 contestants have registered. Twelve contestants have cleared the 1% improvement mark, seven have cleared 2%, three have cleared 3%, and two have cleared 4%. The current leader is at 4.67% improvement, almost halfway to the $1 million prize.
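To put “almost halfway” in concrete terms: if I’m reading the contest rules right, Cinematch scores an RMSE of 0.9514 on the quiz set, and improvement is the percentage reduction from that number. A quick back-of-the-envelope:

```python
cinematch_rmse = 0.9514                       # Cinematch's quiz-set RMSE, per the rules

def improvement(rmse):
    """Percentage reduction in RMSE relative to Cinematch."""
    return 100 * (1 - rmse / cinematch_rmse)

grand_prize_rmse = cinematch_rmse * 0.90      # 10% target: ~0.8563
leader_rmse = cinematch_rmse * (1 - 0.0467)   # current leader: ~0.9070
```

The gap between roughly 0.9070 and 0.8563 is where the remaining work lives.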

Given that Netflix was ready to let the contest run for ten years, and included yearly “Progress Prizes” for contestants who could exceed the best score by 1%, I’d say the Netflix Prize has exceeded expectations so far. And that does not factor in the positive public relations and consumer awareness that came with the various press hits.

If the progress continues at the current rate, the contest will be over at the three-month minimum that Netflix has set. However, extrapolating from the current pace is risky. Every additional point of improvement will be harder, and we don’t know where the practical limit is.

Why It’s Different

There have been various other data-mining competitions. I’ll hazard a guess that Netflix’s is the first to be covered as a feature story in The New York Times and will easily be the largest ever in terms of participation. (The New York Times story is already behind the paywall, but a syndicated version is available at News.com.)

The comparison with previous competitions is not entirely fair, because those competitions were academic affairs, providing a little collegial competition at conferences. Yet Netflix’s success underlines how much more can be done when a data-mining competition becomes a means of doing business.

By treating the Netflix Prize as a product, complete with features designed to maximize “customer” buy-in, Netflix got far more for its $1 million than it would have by spending the same amount on its own researchers’ salaries over time. In that sense, the Netflix Prize is more interesting as a business method—spearheaded by spot-on product marketing—than as a “Which algorithm will win?” story.

So I say to Netflix: Great idea, great execution. And to the contestants: May the best algorithm win.

1 comment:

  1. An update: One year later, the best result is 8.43% improvement, from a group at AT&T Labs. That group received a $50,000 progress prize, but the $1 million is still waiting to be claimed by whoever hits the 10% improvement mark.
    From the New York Times: "Yehuda Koren, the leader of the AT&T Labs team, based in Florham Park, New Jersey, said he and his team spent 2,000 hours performing data analysis and computation."
