
September 30, 2009

First Mover Effects

Reputation Wednesday is an ongoing series of essays about reputation-related matters. This week's essay is concerned with important downstream effects that can arise from the first tentative days & weeks of a community's formation. It is excerpted from Chapter 4: Building Blocks and Reputation Tips.

When an application derives quantitative measures from user input, whether ratings or participation counts such as the number of contributions to a site, several issues arise from the bootstrapping of communities. We group these together under the term first-mover effects.

Early Behavior Modeling and Early-Ratings Bias

The first people to contribute to a site have a disproportionate effect on its character and on future contributions. After all, this is social media, and people usually try to fit into any new environment. For example, if the tone of comments is negative, new contributors will tend to be negative too, which in turn biases any user-generated ratings. See Ratings Bias Effects.

When an operator introduces user-generated content and associated reputation systems, it is important to take explicit steps to model behavior for the earliest users in order to set the pattern for those who follow.

Discouraging New Contributors

Take special care with systems that contain leaderboards, whether they rank content or users. Items displayed on leaderboards tend to stay on them: the more people who see those items and click, rate, and comment on them, the more who follow suit, creating a self-sustaining feedback loop.

This loop not only keeps newer items and users from breaking into the leaderboards, it discourages new users from even making the effort to participate, by giving the impression that they are too late to influence the result in any significant way. Though this phenomenon applies to all reputation scores (even those for digital cameras), it is particularly acute in simple point-based karma systems, which award active users ever more points for activity. Leaders, over years of feverish activity, amass millions of points, making it mathematically impossible for new users to ever catch up.
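A back-of-the-envelope calculation shows why catching up is hopeless in cumulative point systems. All numbers here are hypothetical, chosen only to illustrate the arithmetic:

```python
# A veteran with a large cumulative point total vs. a newcomer
# who is 50% more active than the veteran. (Illustrative numbers.)
veteran_points = 1_000_000
veteran_rate = 100    # points earned per day
newcomer_points = 0
newcomer_rate = 150   # points earned per day

# Days needed for the newcomer to close the gap, given the rate advantage.
gap = veteran_points - newcomer_points
days_to_catch_up = gap / (newcomer_rate - veteran_rate)
years = days_to_catch_up / 365

print(f"{days_to_catch_up:.0f} days (~{years:.0f} years) to catch up")
# → 20000 days (~55 years) to catch up
```

Even a substantially more active newcomer needs decades to overtake an entrenched leader, which is why cumulative karma so effectively discourages new participants.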

September 23, 2009

Party Crashers (or 'Who invited these clowns?')

Reputation Wednesday is an ongoing series of essays about reputation-related matters. This week, we look at some of the possible effects when unanticipated guests enter into your carefully-planned and modeled system. This essay is excerpted from Chapter 5.

Reputation can be a successful motivation for users to contribute large volumes of content and/or high-quality content to your application. At the very least, reputation can provide critical money-saving value to your customer care department by allowing users to prioritize the bad content for attention and likewise flag power users and content to be featured.

But mechanical reputation systems, of necessity, are always subject to unwanted or unanticipated manipulation: they are only algorithms, after all. They cannot account for the many, sometimes conflicting, motivations for users' behavior on a site. One of the strongest motivations of users who invade reputation systems is commercial. Spam invaded email. Marketing firms invade movie review and social media sites. And drop-shippers are omnipresent on eBay.

eBay drop-shippers put the middleman back into the online market: they resell items that they don't even own. The process works roughly like this:

  1. A seller develops a good reputation, gaining a seller feedback karma of at least 25 for selling items that she personally owns.
  2. The seller buys some drop-shipping software, which helps locate items for sale on eBay and elsewhere cheaply, or joins an online drop-shipping service that has the software and presents the items in a web interface.
  3. The seller finds cheap items to sell and lists them on eBay for a higher price than they're available for in stores but lower than other eBay sellers are selling them for. The seller includes an average or above-average shipping and handling charge.
  4. The seller sells an item to a buyer, receives payment, and sends an order for the item, along with a drop-shipping payment, to the drop-shipper, who then delivers the item to the buyer.

This model of doing business was not anticipated by the eBay seller feedback karma model, which only includes buyers and sellers as reputation entities. Drop-shippers are a third party in what was assumed to be a two-party transaction, and they cause the reputation model to break in various ways:

  • The original shippers sometimes fail to deliver the goods as promised to the buyer. The buyer then gets mad and leaves negative feedback: the dreaded red star. That would be fine, except that it is the seller (who never saw or handled the goods) who receives the mark of shame, not the actual shipping party.
  • This arrangement is a big problem for the seller, who cannot afford the negative feedback if she plans to continue selling on eBay.
  • The typical options for rectifying a bungled transaction won't work in a drop-shipper transaction: it is useless for the buyer to return the defective goods to the seller. (They never originated from the seller anyway.) Trying to unwind the shipment (the buyer returns the item to the seller; the seller returns it to the drop-shipper, if that is even possible; the drop-shipper buys or waits for a replacement item and finally ships it) would take too long for the buyer, who expects immediate recompense.

In effect, the seller can't make the order right with the customer without refunding the purchase price in a timely manner. This puts her out of pocket for the price of the goods, along with the hassle of trying to recover the money from the drop-shipper.

But a simple refund alone sometimes isn't enough for the buyer. Depending on the amount of perceived hassle and effort the transaction has cost them, buyers are still likely to rate it negatively overall. (And rightfully so: once it becomes evident that a seller is working through a drop-shipper, many of her excuses and delays start to ring very hollow.) So a seller may, at this point, have laid out a lot of her own time and money to rectify a bad transaction, only to still suffer the penalties of a red star.

What option does the seller have left to maintain her positive reputation? You guessed it: a payoff. Not only will a concerned seller eat the price of the goods (and any shipping involved), but she will also pay an additional cash bounty (typically up to $20.00) to get the buyer to flip a red star to green.

What is the cost of clearing negative feedback on drop-shipped goods? The cost of the item, plus $20.00, plus the time lost negotiating with the buyer. That's the cost that reputation imposes on drop-shipping on eBay.

The lesson here is that a reputation model will be reinterpreted by users as they find new ways to use your site. Site operators need to keep a wary eye on the specific behavior patterns they see emerging and adapt accordingly. Chapter 10 provides more detail and specific recommendations for prospective reputation modelers.

September 16, 2009

Yahoo! Answers Community Moderation

Reputation Wednesday is an ongoing series of essays about reputation-related matters. This week, we pause to highlight a great presentation from Micah Alpern of Yahoo!

Micah Alpern is Director of User Experience for Social Search at Yahoo! and was, at one time, the lead User Experience designer for the first several iterations of Yahoo! Answers. One of the final projects Micah worked on for Answers was a reputation-intensive program to reduce the amount of abusive content that was appearing on that site.

We'll be covering this very project, Yahoo! Answers Community Moderation, as an in-depth case study in our soon-to-be-drafted Chapter 12, and Micah recently gave a fantastic presentation on it at the Wikimania 2009 conference. It covers everything from business goals, community metrics, design, and implementation to insight into how well the project performed, and continues to perform.

If you just can't get enough, we'd also recommend you check out the video of Micah's presentation. Thanks for the presentation, Micah! (And you'll be seeing Micah, Ori Zaltzman, Yvonne French and other key drivers of that project surface in Chapter 12.)

September 14, 2009

Chapter 10: Application Integration, Testing & Tuning is up

We're proud to announce that Chapter 10: Application Integration, Testing & Tuning is now drafted and ready for those of you hardy enough to consume raw, unedited content.

We're starting to come into the home stretch, with only two chapters remaining unwritten and the formal review process ramping up. It's getting pretty exciting.

September 09, 2009

Time Decay in Reputation Systems

Reputation Wednesday is an ongoing series of essays about reputation-related matters. This week's essay is excerpted from Chapter 4: Building Blocks and Reputation Tips.

Time leeches value from reputation. The section called “First Mover Effects” discussed how simple reputation systems disproportionately value early contributions over time, but there's also the simple problem that ratings become stale as their target reputable entities change or fall out of fashion: businesses change ownership, technology becomes obsolete, cultural mores shift.

The key insight to dealing with this problem is to remember the expression “What did you do for me this week?” When you're considering how your reputation system will display reputation and use it indirectly to modify the experience of users, remember to account for time value. A common method for compensating for time in reputation values is to apply a decay function: subtract value from the older reputations as time goes on, at a rate that is appropriate to the context. For example, digital camera ratings for resolution should probably lose half their weight every year, whereas restaurant reviews should only lose 10% of their value in the same interval.
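One simple way to express a context-appropriate decay rate is as a half-life. The sketch below (function name and numbers are ours, not from the book) covers both examples: camera-resolution ratings halve every year, while a 10%-per-year loss for restaurant reviews corresponds to a half-life of roughly six and a half years:

```python
import math

def decayed_weight(original_weight, age_days, half_life_days):
    """Weight of a rating after age_days, halving every half_life_days."""
    return original_weight * 0.5 ** (age_days / half_life_days)

# Digital camera resolution ratings: half their weight every year.
camera_weight = decayed_weight(1.0, age_days=365, half_life_days=365)

# Restaurant reviews: lose 10% per year, so solve 0.5**(365/h) = 0.9 for h.
restaurant_half_life = 365 * math.log(0.5) / math.log(0.9)
restaurant_weight = decayed_weight(1.0, 365, restaurant_half_life)

print(round(camera_weight, 2))      # → 0.5
print(round(restaurant_weight, 2))  # → 0.9
```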

Here are some specific algorithms for decaying a reputation score over time:

• Linear Aggregate Decay: Every score in the corpus is decreased by a fixed percentage per unit of elapsed time, whenever it is recalculated. This is high performance, but rarely updated reputations will retain disproportionately high values. To compensate, a timer input can perform the decay process at regular intervals.
• Dynamic Decay Recalculation: Every time a score is added to the aggregate, recalculate the value of every contributing score. This method provides a smoother curve, but it becomes computationally expensive, O(n²), over time.
• Window-based Decay Recalculation: The Yahoo! spammer IP reputation system has used a time-window-based decay calculation: a fixed-time or fixed-size window of previous contributing claim values is kept with the reputation for dynamic recalculation when needed. New values push old values out of the window, and the aggregate reputation is recalculated from those that remain. This method produces a score based on the most recent information available, though for low-liquidity aggregates that information may still be old.
• Time-limited Recalculation: This is the de facto method most engineers use to present any information in an application: read all of the ratings in a time range from the database and compute the score just in time. This is the most costly method, because it always hits the database to compute an aggregate reputation (say, for a ranked list of hotels), even though 99% of the time the value is exactly the same as the last time it was calculated. It may also throw away reputation that is still contextually valid. We recommend trying some of the higher-performance suggestions above.
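The window-based approach above can be sketched with a fixed-size buffer of contributing claims. This is a toy illustration under our own assumptions (class and method names are ours, and it is not Yahoo!'s actual implementation):

```python
from collections import deque

class WindowedReputation:
    """Fixed-size window of contributing claims; the aggregate is
    recomputed from whatever claims remain in the window."""

    def __init__(self, window_size=10):
        # deque with maxlen: new claims push the oldest ones out.
        self.claims = deque(maxlen=window_size)

    def add_claim(self, value):
        self.claims.append(value)

    def score(self):
        # Simple mean of the surviving claims; None if no claims yet.
        if not self.claims:
            return None
        return sum(self.claims) / len(self.claims)

rep = WindowedReputation(window_size=3)
for value in [1.0, 1.0, 0.0, 0.0, 0.0]:
    rep.add_claim(value)

# Only the three most recent claims remain, so the old 1.0s no
# longer influence the aggregate.
print(rep.score())  # → 0.0
```

Note how the early positive claims decay out of the score entirely once they leave the window, which is exactly the property this method trades for its bounded storage and recalculation cost.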

September 02, 2009

Chapters 3 (Architecture) & 9 (Uses) Are Up

Reputation Wednesday is an ongoing series of essays about reputation-related matters. This week's entry introduces two new chapters in our book.

The wiki for Building Web 2.0 Reputation Systems has been updated with two new chapters that are very different from each other.

Chapter 3 - The Reputation Sandbox is a fairly technical discussion of the execution environment for reputation models and of the product requirements for constructing just such a sandbox. If that last sentence didn't make any sense to you, this technically oriented chapter can safely be skipped. Perhaps you will like something from the next one...

Chapter 9 - Using Reputation: The Good, The Bad and the Ugly presents a whole host of reputation-driven strategies for improving content quality (and the perception of it) on your community-driven site. This chapter will interest UX designers, product and community managers, and social architects of all stripes.

No excerpts this time around, folks. For their respective audiences, we think the chapters themselves are packed with chewy goodness.