
5-Star Failure?

Reputation Wednesday is an ongoing series of essays about reputation-related matters. This week's entry confirms that poorly chosen reputation inputs will indeed yield poor results.

Pity the poor, beleaguered 5-Star rating. Not so very long ago, it was the belle of the online ratings ball: its widespread adoption by high-profile sites like Amazon, Yahoo!, and Netflix influenced a host of imitators, and—at one point—star-ratings were practically an a priori choice for site designers when considering how best to capture their users' opinions. Their no-brainer inclusion had almost reached cargo cult design status.

This has subsided in recent years, as stars have received stiff competition from hot, upstart mechanisms like "Digg-style" voting (which we, when contributing to the Yahoo! Pattern Library, rechristened Vote to Promote) and Facebook's "Like" action (which, I guess, was, ahem, "inspired by" FriendFeed, though let us not forget that Facebook, for a time, also flirted with Thumbs Up & Down rating of feed items). Certainly, within the past 2 or 3 years, stars' 'obvious' appeal as the ratings mechanism of choice has become far less obvious.

Even more recently, the 5-Star rating's fall from grace has become almost complete. YouTube fired the first volley, declaring that, by and large, people overwhelmingly give 5 stars to videos on the site. (Readers of this site will recall that we blogged about similar J-Curve distributions that are prevalent on Yahoo! as well.)

And then the venerable Wall Street Journal declared that "On the Internet, Everyone's a Critic But They're Not Very Critical":

One of the Web's little secrets is that when consumers write online reviews, they tend to leave positive ratings: The average grade for things online is about 4.3 stars out of five.

And, just like that, as quickly as 'stars are it' rose to prominence, 'stars are dead' is rapidly becoming the accepted wisdom. (Don't believe me? Read the comments when TechCrunch covered the YouTube discovery, and you'll see folks all but rushing to prop up a variety of their 'preferred rating mechanism' in stars' place.)
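For the numerically inclined, here is a minimal Python sketch of why a J-Curve pushes the average so high. The ratings in `sample` are invented purely for illustration (they are not YouTube's, Yahoo!'s, or the Journal's data), and the `is_j_curve` check is our own rough heuristic, not a standard definition.

```python
from collections import Counter


def summarize(ratings):
    """Return (mean, share of 5-star ratings, histogram) for 1-5 star ratings."""
    counts = Counter(ratings)
    total = len(ratings)
    mean = sum(ratings) / total
    # Fill in every star level so missing levels show up as zero.
    histogram = {stars: counts.get(stars, 0) for stars in range(1, 6)}
    return mean, histogram[5] / total, histogram


def is_j_curve(histogram):
    """Rough heuristic (an assumption): 5 stars is the most common rating,
    and 1-star ratings outnumber 2-star ratings."""
    return histogram[5] == max(histogram.values()) and histogram[1] > histogram[2]


# Hypothetical ratings, made up for this example.
sample = [1] * 8 + [2] * 3 + [3] * 5 + [4] * 14 + [5] * 70

mean, top_share, hist = summarize(sample)
print(f"mean={mean:.2f}, 5-star share={top_share:.0%}, J-curve={is_j_curve(hist)}")
```

When most raters pick the top value, the mean mostly tells you that people who bothered to rate liked the thing, which is close to what a simple 'I Like This' count would say.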

Are stars dead?

This is, of course, the wrong way to frame the question. Stars, thumbs, favorites, or sliders: any of these rating input mechanisms is dead on arrival if it's not carefully considered within the context of use. 5-Star ratings require a little more cognitive investment than a simple 'I Like This' statement, so, before designing 5-star ratings into your system, consider the following.

Will it be clear to users what you're asking them to assess? It's not entirely surprising that YouTube's ratings overwhelmingly tend toward the positive. That's a long-observed and well-understood phenomenon in the social sciences called Acquiescence Bias. It is "the tendency of a respondent to agree with a statement when in doubt." And 5-star ratings, in the case of YouTube, are nothing but doubt. What, exactly, is a fair and accurate quantitative assessment for a video on YouTube? The input mechanism does provide some clues, in the form of text hints for the various rating levels (ranging from 'Poor' to 'Awesome!'), but these are highly subjective and, themselves, way too open to interpretation.

Is a scale necessary? If the primary decision you're asking users to make is 'good vs. bad' or 'I liked it' vs. 'I didn't', then are multiple steps on a scale really adding anything to their evaluation?

Are comparisons being made? Should I, as a user, rate videos in comparison to other similar videos on YouTube? What, exactly, distinguishes a 5-star football-to-the-groin video from a 2-star one? Am I rating against like videos? Or all videos on YouTube? (Or every video I've ever seen!?)

Have they watched the video? One way to encourage more-thoughtful ratings is to place the input mechanism at the proper juncture: make some attempt, at least, to ensure that the user is rating the thing only after having experienced it. YouTube's 5-star mechanism is fixed and always present, encouraging drive-by ratings, premature ratings, or just general sloppiness of assessment.
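As a rough illustration of that last point, a site could hold back the rating control until the user has actually watched most of the video. This is only a sketch under our own assumptions: `ViewingSession`, `can_rate`, and the 80% threshold are hypothetical names and values, not anything YouTube or any other site is known to use.

```python
from dataclasses import dataclass


@dataclass
class ViewingSession:
    """Hypothetical record of how much of a video one user has watched."""
    video_length_s: float
    watched_s: float


def can_rate(session, min_fraction=0.8):
    """Show the rating control only once enough of the video has been watched.

    The 0.8 default is an arbitrary, assumed threshold.
    """
    if session.video_length_s <= 0:
        return False  # no usable length information, so don't invite a rating
    return session.watched_s / session.video_length_s >= min_fraction


# A drive-by visitor versus someone who nearly finished the video.
print(can_rate(ViewingSession(video_length_s=200, watched_s=15)))
print(can_rate(ViewingSession(video_length_s=200, watched_s=190)))
```

The right threshold (or whether to gate at all, rather than simply revealing the control at the end of playback) depends on the context of use; the point is only that the input appears after the experience, not before it.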

So, are stars inappropriate for YouTube, at least in the way that they've designed them? Probably, yes.

To wrap up, some quick links. Check out this elegant and innovative design that the folks at Steepster recently rolled out, and think about the ways it cleverly addresses all four of the concerns listed above.

And to see a really in-depth study of 5-star ratings used effectively, check out Using 5-Star Ratings from Christopher Allen & Shannon Appelcline's excellent series on Systems for Collective Choice.

