Over the past week, there has been some blog talk (Fred Wilson, TechCrunch, David Porter) comparing music-recommendation services Pandora and Last.fm. I’ve been using both for the past couple of months, making notes along the way. The idea was that I’d eventually have something to say. That might as well be now.
Both services allow you to specify a favorite artist, based on which you immediately receive an Internet audio stream of similar music. When I tell people this is possible—that you can have a personalized streaming radio station—most are astonished. So let’s start by saying that what these and similar services do is cool. How Pandora and Last.fm do it is an interesting compare-and-contrast.
Nature versus Nurture
Algorithmically, Pandora versus Last.fm is something like the nature versus nurture debate. Taking the nature side, Pandora’s recommendations are based on the inherent qualities of the music. Give Pandora an artist or song, and it will find similar music in terms of melody, harmony, lyrics, orchestration, vocal character and so on. Pandora likes to call these musical attributes “genes” and its database of songs, classified against hundreds of such attributes, the “Music Genome Project.”
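To make the "genes" idea concrete, here is a minimal sketch of attribute-based matching. It assumes each song is scored on a handful of attributes and that similarity is the cosine between those score vectors. The attribute names, the scores, and the choice of cosine similarity are all my assumptions for illustration; Pandora has not published how it actually matches songs.

```python
from math import sqrt

# Hypothetical attribute scores on a 0-5 scale. The attribute names and
# values are invented for illustration, not taken from Pandora.
SONGS = {
    "reggae_a": {"syncopation": 5, "acoustic_guitar": 2, "vocal_grit": 2, "synth": 0},
    "reggae_b": {"syncopation": 4, "acoustic_guitar": 3, "vocal_grit": 1, "synth": 0},
    "synthpop": {"syncopation": 1, "acoustic_guitar": 0, "vocal_grit": 1, "synth": 5},
}


def cosine(a, b):
    """Cosine similarity between two attribute dicts (missing keys count as 0)."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in keys)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_songs(seed, songs=SONGS):
    """Rank every other song by attribute similarity to the seed song."""
    scores = [
        (name, cosine(songs[seed], attrs))
        for name, attrs in songs.items()
        if name != seed
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Calling `similar_songs("reggae_a")` ranks `reggae_b` ahead of `synthpop`, and no listener data is involved at any point. That is the essence of the nature side of the argument.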
On the nurture side (as in, it’s all about the people around you), Last.fm is a social recommender. It knows little about songs’ inherent qualities. It just assumes that if you and a group of other people enjoy many of the same artists, you will probably enjoy other artists popular with that group.
Like Last.fm, most music-discovery systems have been social recommenders, also known as collaborative filters. Although much of the academic work in the area has focused on improving the matching algorithms, Last.fm’s innovation has been in improving the data the algorithms work on. Last.fm does so by offering users an optional plug-in that automatically monitors their media-player software, so whatever they listen to (whether it came from Last.fm or not) can be incorporated into their Last.fm profiles and thus be used as the basis for recommendations. Compared to relying on users to manually provide preferences, this automatic and comprehensive data capture leads to far better grist for the data mill.
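For readers who have not seen a collaborative filter, the following is a toy sketch of the general idea. It assumes the only input is a set of artists per user (the kind of data a listening plug-in could collect) and scores two artists by the overlap of their listeners using Jaccard similarity. The user names, the profiles, and the scoring rule are assumptions of mine; this is not Last.fm's actual algorithm.

```python
# Hypothetical listening profiles: user -> set of artists played.
PROFILES = {
    "ann": {"Kraftwerk", "Gary Numan", "Killing Joke"},
    "bob": {"Kraftwerk", "Gary Numan", "Skinny Puppy"},
    "cara": {"Gary Numan", "Killing Joke"},
    "dev": {"Bob Marley", "Kraftwerk"},
}


def listeners_by_artist(profiles):
    """Invert user -> artists into artist -> set of users."""
    index = {}
    for user, artists in profiles.items():
        for artist in artists:
            index.setdefault(artist, set()).add(user)
    return index


def similar_artists(seed, profiles=PROFILES):
    """Rank artists by Jaccard overlap between their listeners and the seed's."""
    index = listeners_by_artist(profiles)
    seed_fans = index.get(seed, set())
    scores = []
    for artist, fans in index.items():
        if artist == seed:
            continue
        shared = len(seed_fans & fans)
        if shared:  # artists with no shared listeners are never suggested
            scores.append((artist, shared / len(seed_fans | fans)))
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))
```

Note that the code never looks at what the music sounds like. With these made-up profiles, `similar_artists("Gary Numan")` returns Killing Joke, Kraftwerk, and Skinny Puppy in that order, purely because of who listens to whom.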
A side note: In my years of analytics and data mining, a recurring theme is that better algorithms are nice but better data is nicer. That’s because a large number of smart people have evolved the best data-mining algorithms for various scenarios; thus, further improvements tend to be incremental. By contrast, whatever data you happen to be using in a project has probably had no priming for analytical use. Thus, improving how you acquire, clean, and transform that data can have disproportionately large benefits. The catchphrase for the negative version of this is “garbage in, garbage out,” although one could just as easily say, “the more signal in, the more signal out.”
Surfacing New Artists
Pandora and Last.fm are both about helping people discover new music, so let’s consider their approaches in terms of discovering truly “new” music—that is, artists who are just appearing on the music scene. If we assume that both services put new artists into their database at the same rate, Last.fm will be slower in surfacing them as recommendations. This is due to the “cold start” problem that afflicts social recommenders: Before something new can become recommendable, it needs time to accumulate enough popularity to rise above the system’s noise level. In contrast, because Pandora is only comparing songs’ inherent qualities—not who they’re popular with—it should be able to recommend a new artist the first day that artist is in the system. That said, I wouldn’t be surprised if Pandora did a little biasing of recommendations by popularity, which it measures as people use the service.
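The cold-start effect can be sketched in a few lines. The example below assumes a social recommender ignores any artist with fewer than a fixed number of listeners; the threshold value and both function names are hypothetical, chosen only to show why a new artist has to wait.

```python
MIN_LISTENERS = 50  # assumed noise floor, not a known Last.fm setting


def recommendable_artists(listener_counts, min_listeners=MIN_LISTENERS):
    """Return the artists with enough listeners to rise above the noise floor."""
    return sorted(
        artist
        for artist, count in listener_counts.items()
        if count >= min_listeners
    )


def days_until_recommendable(new_listeners_per_day, min_listeners=MIN_LISTENERS):
    """Days a brand-new artist waits to cross the floor at a steady growth rate.

    Returns None if the artist is gaining no listeners at all.
    """
    if new_listeners_per_day <= 0:
        return None
    # Ceiling division using integers only.
    return -(-min_listeners // new_listeners_per_day)
```

Under these assumptions, an artist gaining 7 listeners a day waits 8 days before becoming recommendable. A purely attribute-based system has no such wait once the song has been classified.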
Partisans of Last.fm might retort that, in practice, Pandora will be slower at getting new artists and music into its database because of Pandora’s classification bottleneck—that is, the time necessary for a Pandora employee to classify each song on hundreds of musical attributes. With that bottleneck, Pandora can’t just classify everything as it comes in the door. By contrast, Last.fm does not need to do manual classification. With its software plug-in continually updating people’s preferences, Last.fm has a virtual army of talent scouts constantly finding new things, which Last.fm can integrate into its database automatically.
(Leaky) Locked Loops
Pandora people might counter that Last.fm’s army of talent scouts is compromised by its relative uniformity. That is, a social recommender tends to reward people who are like those who already use the system. If there are already many people in Last.fm with tastes similar to yours, you’ll get good recommendations; if not, then maybe not. And if you don’t get good recommendations, are you going to keep feeding the system data? Probably not, and thus we have a self-perpetuating in-group/out-group situation. The result is a “locked loop,” whereby a social recommender gets stuck in certain genres and styles.
But with a social music recommender, a truly locked loop is unlikely. The reason is “leakage”: A population that shares the same core musical tastes will have enough variance in secondary tastes to allow for a continually expanding spectrum, albeit with much slower expansion in certain genres than others. Here’s an example of that slower expansion. When I checked Last.fm’s similar artists to the reggae legend Bob Marley, first on the list was James Brown, followed by The Chemical Brothers, then Aerosmith. (If you’re reading this well after January 30, 2006, beware that Last.fm’s system is continually evolving, so the lists these links point to will probably have changed.) Other reggae acts appear further down, but the unlikely top choices suggest that Marley has been brought into the system more as a distant secondary choice than as a primary choice with other acts in his genre. A quick check of Aerosmith’s similar artists confirms this: Marley is 41st on the list, way behind various likelier suspects.
While better non-reggae recommendations are easy to imagine for Marley, they probably won’t appear until Marley’s primary fans are better represented on Last.fm. Then the quality non-reggae choices can emerge from his core fans’ secondary choices.
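The lopsided Marley and Aerosmith lists are what you would expect if a thinly represented artist is ranked by what his few listeners also play. The sketch below assumes similarity is the share of the seed artist's listeners who also play the other artist. All counts are invented, and the scoring rule is my assumption rather than anything Last.fm has documented.

```python
# Hypothetical listener counts, invented for illustration.
LISTENERS = {"Bob Marley": 4, "Aerosmith": 100, "Peter Tosh": 2, "Van Halen": 80}

# Hypothetical number of users who play both artists in each pair.
CO_LISTENERS = {
    frozenset({"Bob Marley", "Aerosmith"}): 3,
    frozenset({"Bob Marley", "Peter Tosh"}): 2,
    frozenset({"Aerosmith", "Van Halen"}): 60,
}


def ranked_neighbors(seed):
    """Rank artists by the share of the seed's listeners who also play them."""
    scores = []
    for pair, shared in CO_LISTENERS.items():
        if seed in pair:
            (other,) = pair - {seed}  # the one artist in the pair that is not the seed
            scores.append((other, shared / LISTENERS[seed]))
    return sorted(scores, key=lambda item: (-item[1], item[0]))
```

With these numbers, Aerosmith tops Marley's list because three of his four listeners are rock fans, while Marley sits at the bottom of Aerosmith's list because those same three people are a tiny fraction of Aerosmith's audience. Adding more core reggae listeners would push Peter Tosh up Marley's list without changing Aerosmith's much.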
For the sake of comparison, when I put Marley into Pandora, I got something like a reggae radio station at first, which then drifted into other stuff over time.
Why versus What
Pandora is less subject to the echo chamber of overly like minds, but it has its own fundamental challenge in its reliance on matching songs’ “genes.” This rules out connections between songs or artists that don’t fit Pandora’s modeling and matching of musical qualities—which, in turn, puts enormous pressure on Pandora’s specific approach to be correct. In other words, Pandora’s success hinges on a theory, and a specific implementation of that theory, about why music recommendations work. By contrast, Last.fm simply describes what goes together according to its audience and then makes relatively simple inferences from that. So if there are hidden factors that Pandora isn’t explicitly capturing, Last.fm is at least capturing them indirectly.
It’s not hard to find cases where Pandora’s approach runs aground, although the system’s lack of transparency makes it difficult to know where the problem lies. For example, it’s hard to explain Pandora’s initial choices for Gary Numan (he of “Cars” fame). With Numan as the seed, Pandora gave me syrupy pop tunes by Orchestral Manoeuvres in the Dark and the Human League. Yes, each artist’s most famous material was from the same time and was primarily electronic, but the latter two really miss the Numan aesthetic, which is more like supercooled liquid metal than warm syrup. Pandora went on to do somewhat better, but not great, with subsequent tunes.
In comparison, Last.fm immediately delivered Numan-appropriate songs from Assemblage 23, Killing Joke, Kraftwerk, and Skinny Puppy, eventually drifting into less relevant territory. Still, Pandora partially redeemed itself with an inspired connection: “Out of Control” by Ric Ocasek (former leader of The Cars), an obscure cut from an artist who is far from an obvious connection for Gary Numan.
Last.fm’s Delivery versus Pandora’s Promise
I raise the Numan example because it exemplifies my experiences with Last.fm and Pandora. Having used a wide range of artists as seeds, I found Last.fm better than Pandora at delivering songs that I liked or at least didn’t feel compelled to skip, which is the most important thing when I’m listening while doing something else. The exception was when the seed artist had not hit critical mass in the Last.fm system, per the Marley example. Meanwhile, Pandora had more misses but was more likely to surface something truly out of left field, as with the Ric Ocasek example.
As a result, both Pandora and Last.fm have maintained a place in my music-listening world. However, ultimately I think Pandora has greater promise because it is far easier for Pandora to incorporate Last.fm’s functionality than the other way around. This point is important because, just as with the nature versus nurture argument, the best answer is likely to involve elements of both camps. That said, Pandora’s advantage comes at a significant cost to its business, with all the manual work it entails. At this point, Pandora is not delivering proportionally more benefit for that cost—which is why I used the word “promise” above.
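One simple way to picture a service that draws on both camps is a weighted blend of two scores per artist. The sketch below is only that, a sketch: the function name, the weighting scheme, and every score in the example are invented, and neither service is known to work this way.

```python
def blend(content_scores, social_scores, social_weight=0.5):
    """Combine attribute-based and listener-based scores into one ranking.

    Each argument maps artist -> score in [0, 1]; a missing score counts as 0.
    social_weight=0 gives a purely attribute-based ranking, 1 a purely social one.
    """
    if not 0.0 <= social_weight <= 1.0:
        raise ValueError("social_weight must be between 0 and 1")
    artists = set(content_scores) | set(social_scores)
    blended = {
        artist: (1 - social_weight) * content_scores.get(artist, 0.0)
        + social_weight * social_scores.get(artist, 0.0)
        for artist in artists
    }
    # Highest blended score first; ties broken alphabetically.
    return sorted(blended.items(), key=lambda item: (-item[1], item[0]))


# Invented scores for a Gary Numan seed, for illustration only.
content = {"OMD": 0.9, "Kraftwerk": 0.7, "Ric Ocasek": 0.6}
social = {"Kraftwerk": 0.8, "Killing Joke": 0.7, "OMD": 0.2}
ranking = [artist for artist, _ in blend(content, social)]
```

With an even weighting, an artist that both approaches agree on (Kraftwerk here) rises to the top, while picks that only one approach favors remain in the list further down. Turning `social_weight` into a user-facing dial would be one way to trade reliability for surprise.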
Pandora Possibilities
The key to Pandora’s changing the game is to take better advantage of its exclusive, hard-to-replicate metadata about music. Users may never be able to objectively judge the quality of recommendations among different services, but they can definitely tell the difference between services with unique ways of getting to recommendations. For example, I’d like to see Pandora expose some of its internal attributes as dials for the user to control. If I put in the singer Paul Westerberg (former leader of The Replacements), I’d like to tell the system to match more strongly along his lyrical style rather than by the fact he has a “gravelly male voice” (which is one of the things Pandora said it was matching on). It’s easy to picture many other creative uses of Pandora’s metadata, both in terms of a recommender and other applications.
Finally, I wonder why Pandora continues to employ hundreds of attributes. In the world of modeling preferences, hundreds of variables typically can be consolidated down to a much smaller number with nearly the same predictive power. Typically, you start with a large number of variables as a kind of fishing expedition and then, over time, reduce the set down to those that are doing most of the work. The reduced set can be part of the original set and/or new variables derived specifically for predictive power. For a labor-intensive business like Pandora’s, being able to cut the number of variables in half (or a lot more) would help contain the costs. And if there’s good reason not to consolidate attributes, I would still be wondering how to innovate in streamlining the production process just as much as how to innovate in the customer-facing part of the business.
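As a rough illustration of consolidating variables, the sketch below drops any attribute that is almost perfectly correlated with one already kept. This is a simple redundancy filter, not the fuller factor analysis or predictive modeling a real project would use, and it says nothing about which attributes actually predict listener preferences. The attribute names, the scores, and the 0.95 threshold are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant column; treat as none
    return cov / sqrt(var_x * var_y)


def prune_redundant(columns, threshold=0.95):
    """Keep an attribute only if it is not near-duplicated by one already kept."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical scores for five songs on four attributes.
ATTRIBUTES = {
    "tempo": [1, 2, 3, 4, 5],
    "energy": [2, 4, 6, 8, 10],  # moves in lockstep with tempo, so redundant
    "vocal_grit": [5, 1, 4, 2, 3],
    "synth_amount": [3, 5, 1, 4, 2],
}
```

Here `prune_redundant(ATTRIBUTES)` keeps tempo, vocal_grit, and synth_amount and drops energy. For a business that pays people to score every attribute by hand, each dropped column is work that no longer has to be done per song.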
Bowling or Batting?
A final thought: What Last.fm and Pandora do is hard. The people who built these services deserve a lot of credit. Given the ambitious scope, it’s easy to find examples where each of the services comes up short. However, it’s worth considering what the yardstick should be. Should we expect spot-on recommendations like a pro bowler expects a strike every time? Or is this more like the baseball batter, who is happy to get a hit one in three times? Whatever the metaphor, the fact that these services do enough right to retain a substantial number of users is good news, because the features and quality will only get better. So when you try Last.fm and/or Pandora, be sure to give them enough time—and enough different starting points—to show their best stuff.