
chapter_10 [2009/11/18 18:22]
randy
chapter_10 [2009/12/01 14:53] (current)
randy submitted for publisher review
===== Case Study: Yahoo! Answers Community Content Moderation =====
This chapter is a real-life case study applying many of the theories and practical advice presented in this book. The lessons learned on this project had a significant impact on our thinking about reputation systems, the power of social media moderation, and the need to publish these results in order to share our findings with the greater web application development community.

In summer 2007, Yahoo! tried to address some moderation challenges with one of its flagship community products: Yahoo! Answers ([[http://answers.yahoo.com|answers.yahoo.com]]). The service had fallen victim to its own success and drawn the attention of trolls and spammers in a big way. The Yahoo! Answers team was struggling to keep up with harmful, abusive content that flooded the service, most of which originated with a small number of bad actors on the site.
Yahoo! Answers provides a very simple interface to do, chiefly, two things: pose questions to a large community (potentially, any active, registered Yahoo! user-that's roughly a half-billion people worldwide); or answer questions that others have asked. Yahoo! Answers was modeled, in part, on similar question-and-answer sites like Korea's Naver.com Knowledge Search.
The appeal of this format was undeniable. By June of 2006, according to //Business 2.0//, Yahoo! Answers had already become “the second most popular Internet reference site after Wikipedia and had more than 90% of the domestic question-and-answer market share, as measured by comScore.” Its popularity continues and, owing partly to excellent search engine optimization, Yahoo! Answers pages frequently appear very near the top of search results pages on Google and Yahoo! for a wide variety of topics.
Yahoo! Answers is by far the most active community site on the Yahoo! network. It logs more than 1.2 million user contributions (questions and answers combined) each day.
<html><a name="Figure_10-1"><center></html>// Figure_10-1: The questions asked and answers shared on Yahoo! Answers are often based on experiential knowledge rather than authoritative, fact-based information. //<html></center></a></html>
<html><center><img width="70%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_10-1.png"/></center></html>
Yahoo! Answers is a unique kind of marketplace-one not based on the transfer of goods for monetary reward. No, Yahoo! Answers is a knowledge marketplace, where the currency of exchange is ideas. Furthermore, Yahoo! Answers focuses on a specific kind of knowledge.
Micah Alpern was the user experience lead for early releases of Yahoo! Answers. He refers to the unique focus of Yahoo! Answers as “experiential knowledge”-the exchange of opinions and sharing of common experiences and advice. (See <html><a href="#Figure_10-1">Figure_10-1</a>&nbsp;</html>.) While verifiable, factual information is indeed exchanged on Yahoo! Answers, a lot of the conversations that take place there are intended to be social in nature.
<note tip>Micah has published a detailed presentation that covers this project in some depth. You can find it at [[http://www.slideshare.net/malpern/wikimania-2009-yahoo-answers-community-moderation|http://www.slideshare.net/malpern/wikimania-2009-yahoo-answers-community-moderation]].
</note>
Yahoo! Answers is not a reference site in the sense that Wikipedia is: it is not based on the ambition to provide objective, verifiable information. Rather, its goal is to encourage participation from a wide variety of contributors. That goal is important to keep in mind as we delve further into the problems that Yahoo! Answers was undergoing and the steps needed to solve them. Specifically, keep the following in mind:
  * Answers on Yahoo! Answers are subjective. It is the community that determines what responses are ultimately “right.” It should //not// be a goal of any meta-moderation system to distinguish right answers from wrong or otherwise place any importance on the objective truth of answers.
  * In a marketplace for opinions such as Yahoo! Answers, it's in the best interest of everyone (askers, answerers, and the site operator) to encourage //more// opinions, not fewer. So the designer of a moderation system intended to weed out abusive content should make every attempt to avoid punishing legitimate questions and answers. False positives can't be tolerated, and the system must include an appeals process.
So, what problems, exactly, was Yahoo! Answers suffering from? Two factors-the timeliness with which Yahoo! Answers displayed new content, and the overwhelming number of contributions it received-had combined to create an unfortunate environment that was almost irresistible to trolls. Dealing with offensive and antagonistic user content had become the number one feature request from the Yahoo! Answers community.
The Yahoo! Answers team first attempted a machine-learning approach, developing a black-box abuse classifier (lovingly named the “Junk Detector”) to prefilter abuse reports coming in. It was intended to classify the worst of the worst content and put it into a prioritized queue for the attention of customer care agents.
The Junk Detector was mostly a bust. It was moderately successful at detecting obvious spam, but it failed altogether to identify the subtler, more insidious contributions of trolls.
<box blue 75% round>
** Do Trolls Eat Spam? **
How do you detect for //that//? It's hard for //a// human-and near impossible for a machine-but it's possible with //a number// of humans. Adding consensus and reputation-enabled methods makes it easier to reliably discern trollish behavior from sincere contributions. Because a reputation system to some degree reflects the tastes of a community, it also has a better-than-average chance at catching behavior that transgresses those tastes.
</box>
Engineering manager Ori Zaltzman recalls the exact moment he knew for certain that something had to be done about trolls-when he logged onto Yahoo! Answers to see the following question highlighted on the home page: “What is the best sauce to eat with my fried dead baby?” (And, yes, we apologize for the citation-but it certainly illustrates the distasteful effects of letting trolls go unchallenged in your community.)
That question got through the Junk Detector easily. Even though it's an obviously unwelcome contribution, on the surface-and to a machine-it looked like a perfectly legitimate question: grammatically well formed, no SHOUTING ALL CAPS. So abusive content could sit on the site with impunity for hours before the staff could respond to abuse reports.
== Time Was a Factor ==
Because the currency of Yahoo! Answers is the free exchange of opinions, a critical component of “free” in this context is //timely//. Yahoo! Answers functions best as a near-real-time communication system and, as a design principle, errs on the side of timely delivery of users' questions and answers. User contributions are not subject to any type of editorial approval before being pushed to the site.
<note tip>Early on, the Yahoo! Answers product plan //did// call for editor approval of all questions before publishing. This was an early attempt to influence the content quality level by modeling good user behavior. The almost immediate, skyrocketing popularity of the site quickly rendered that part of the plan moot. There simply was no way that any team of Yahoo! content moderators was going to keep up with the levels of use on Yahoo! Answers.
<html><a name="Figure_10-2"><center></html>// Figure_10-2: Because questions on Yahoo! Answers could appear on the front page of the site with no verification that the content was appropriate, spammers and trolls flocked to this high-value real estate. //<html></center></a></html>
<html><center><img width="70%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_10-2.png"/></center></html>
Any newly asked question could potentially appear in highly trafficked areas, including the following:
  * The index of open (answerable) questions ([[http://answers.yahoo.com/dir/index|http://answers.yahoo.com/dir/index]])
Yahoo! Answers, somewhat famously, already featured a reputation system-a very visible one, designed to encourage and reward ever-greater levels of user participation. On Yahoo! Answers, user activity is rewarded with a detailed point system. (See <html><a href="/doku.php?id=Chapter_7#Chap_7-Points_and_Accumulators">Chap_7-Points_and_Accumulators</a>&nbsp;</html>.)
<note tip>We say “famously” because the Yahoo! Answers point system is somewhat notorious in reputation system circles, and debate continues to rage over its effectiveness.
At the heart of the debate is this question: does the existence of these points-and the incentive of rewarding people for participation-actually improve the experience of using Yahoo! Answers? Does it make the site a better source of information? Or are the system's game-like elements promoted too heavily, turning what could be a valuable, informative site into a game for the easily distracted?
We're mostly steering clear of that discussion here. (We touch on aspects of it in <html><a href="/doku.php?id=Chapter_7">Chapter_7</a>&nbsp;</html>.) This case study deals only with combating obviously abusive content, not with judging good content from bad.
The first motivation for cleaning up abuse on Yahoo! Answers was cost. The existing system for dealing with abuse was expensive, relying as it did on heavy human-operator intervention. Each and every report of abuse had to be verified by a human operator before action could be taken on it.
Randy Farmer, at the time the community strategy analyst for Yahoo!, pointed out the financial foolhardiness of continuing down the path where the system was leading: “The cost of generating abuse is //zero//, while we're spending a million dollars a year on customer care to combat it-and it //isn't even working.//” Any new system would have to fight abuse at a cost that was orders of magnitude lower than that of the manual-intervention approach.
  - Customer care could be removed from the loop - in most cases - by shifting the content removal process into the application and giving it to the users, who were already the source of the abuse reports, and then optimizing it to cut the amount of time an offensive posting is visible by 90%.
  - Customer care could then handle just the exceptions-undoing the removal of content mistakenly identified as abusive. At the time, such false positives made up 10% of all content removal. Even if the exception rate stayed the same, customer care costs would decrease by 90%.
The team would accomplish item 1, removing customer care from the loop, by implementing a new way to remove content from the site-“hiding.” Hiding involved trusting the community members themselves to vote to hide the abusive content. The reputation platform would manage the details of the voting mechanism and any related karma. Because this design required no external authority to remove abusive content from view, it was probably the fastest possible way to cut display time for abusive content.
As for item 2, dealing with exceptions, the team devised an ingenious mechanism-an appeals process. In the new system, when the community voted to hide a user's content, the system sent the author an email explaining why, with an invitation to appeal the decision. Customer care would get involved only if the user appealed. The team predicted that this process would limit abuse of the ability to hide content; it would provide an opportunity to inform users about how to use the feature; and, because trolls often don't give valid email addresses when registering an account, they would simply be unable to appeal-they'd never receive the notices.
<html><a name="Figure_10-3"><center></html>// Figure_10-3: The system would use reputation as a basis for hiding abusive content, leaving staff to handle only appeals. //<html></center></a></html>
<html><center><img width="70%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_10-3.png"/></center></html>
Most of the rest of this chapter details the reputation model designated by just the Hide Content? diamond in <html><a href="#Figure_10-3">Figure_10-3</a>&nbsp;</html>. See the patent application for more details about the other (nonreputation) portions of the diagram, such as the Notify Author and Appeals process boxes.
    * The customer care staff are the target of the model. The goal is to reduce the staff's participation in the content moderation process as much as possible, but not to zero. Any community content moderation process can be abused: trusted users may decide to abuse their power, or they may simply make a mistake. Customer care would still evaluate appeals in those cases, but the number of such cases would be far fewer than the total number of abuses.
Customer care agents also have a reputation-for accuracy-though it isn't calculated by this model. At the start of the Yahoo! Answers community content moderation project, the accuracy of a customer care agent's evaluation of questions was about 90%. That rate meant that 1 in 10 submissions was either incorrectly deleted or incorrectly allowed to remain on the site. An important measure of the model's effectiveness was whether users' evaluations were more accurate than the staff's.
The design included two documents that are worthy of note, though they were not formal objects (that is, they neither provided input nor were reputable entities). The Yahoo! terms of service and the Yahoo! Answers community guidelines (<html><a href="#Figure_10-4">Figure_10-4</a>&nbsp;</html>) are the written standards for questions and answers. Users are supposed to apply these rules in evaluating content.
<html><a name="Figure_10-4"><center></html>// Figure_10-4: Yahoo! Answers Community Guidelines //<html></center></a></html>
<html><center><img width="70%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_10-4.png"/></center></html>
<html><a name='Chap_10-Limiting_Scope'></a></html>
When you develop a reputation model, it's good practice to start simple: focus only on the main objects, inputs, decisions, and uses. Assume a universe in which the model works exactly as intended. Don't focus too much on performance or abuse at first-you'll get to those issues in later iterations. Trying to solve this kind of complex equation in all dimensions simultaneously will just lead to confusion and impede your progress.
For the Yahoo! Answers community content moderation system, the designers started with a very basic model-abuse reports would accumulate against a content item, and when some threshold was reached, the item would be hidden. This model, sometimes called “X-strikes-and-you're-out,” is quite common in social web applications. Craigslist is a well-known example.
Despite the apparent complexity of the final application, the model's simple core design remained unchanged: accumulated abuse reports automatically hide content. Keeping that core design in mind as the key goal helped eliminate complications in the design.
  * **Abuse Report (User Input)**
    * Users could report content that violated the community guidelines or the terms of service. The user interface consisted of a button next to all questions and answers. The button was labeled with an icon of a flag, and sometimes the action of clicking the button was referred to as “flagging an item.” In the case of questions, the button label also included the phrase “Report Abuse.” The interface then led the user through a short series of pages to explain the process and narrow down the reason for the report.
The abuse report was the only input in the first iteration of the model.
At the core of the model was a simple, binary decision: should a content item that has just been reported as abusive be hidden? How does the model make the decision, and, if the result is positive, how should the application be notified?
-In the first iteration, the model for this decision was "three strikes and you're out." (See <html><a href="#Figure_10-1">Figure_10-1</a>&nbsp;</html>.) Abuse reports fed into a simple accumulator (<html><a href="/doku.php?id=Chapter_3#Chap_3-Simple_Accumulator">Chap_3-Simple_Accumulator</a>&nbsp;</html>). Each report about a content item was given equal weight; all reports were added together and stored as ''ContentItemAbuse'' . That score was sent on to a simple evaluator, which tested it against a threshold (3) and either terminated it (if the threshold had not been reached) or alerted the application to hide the item.+In the first iteration, the model for this decision was “three strikes and you're out.(See <html><a href="#Figure_10-5">Figure_10-5</a>&nbsp;</html>.) Abuse reports fed into a simple accumulator (<html><a href="/doku.php?id=Chapter_3#Chap_3-Simple_Accumulator">Chap_3-Simple_Accumulator</a>&nbsp;</html>). Each report about a content item was given equal weight; all reports were added together and stored as ''ContentItemAbuse'' . That score was sent on to a simple evaluator, which tested it against a threshold (3) and either terminated it (if the threshold had not been reached) or alerted the application to hide the item.
Given that performance was a key requirement for this model, the abuse reports were delivered asynchronously, and the outgoing alert to the application used an application-level messaging system. Given that performance was a key requirement for this model, the abuse reports were delivered asynchronously, and the outgoing alert to the application used an application-level messaging system.
Line 245: Line 247:
This iteration of the model did not include karma. This iteration of the model did not include karma.
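The three-strikes core is small enough to sketch directly. The following Python sketch is illustrative only, not the production implementation: the equal-weight accumulation, the ''ContentItemAbuse'' score, and the threshold of 3 come from the text, while the class and method names are invented.

```python
# Illustrative sketch of the iteration-1 model: a simple accumulator
# feeding a simple threshold evaluator. Not Yahoo!'s actual code.

HIDE_THRESHOLD = 3  # three strikes and the content is hidden

class AbuseAccumulator:
    """Every abuse report counts equally; reports simply add up."""

    def __init__(self):
        self.content_item_abuse = {}  # item_id -> accumulated strike count

    def report_abuse(self, item_id):
        """Record one report; return True if the application should hide the item."""
        count = self.content_item_abuse.get(item_id, 0) + 1
        self.content_item_abuse[item_id] = count
        # Simple evaluator: alert the application only at the threshold;
        # otherwise the report terminates here.
        return count >= HIDE_THRESHOLD
```

In the real system the reports arrived asynchronously and the hide alert went out over a messaging system; here the decision is just returned to the caller.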
<html><a name="Figure_10-5"><center></html>// Figure_10-5: Iteration 1: A not-very-forgiving model. Three strikes and your content is out! //<html></center></a></html>
<html><center><img width="70%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_10-5.png"/></center></html>
== Iteration 2: Karma for Abuse Reporters ==
Ideally, the more abuse a user reports accurately, the greater the trust the system should place in that user's reports. In the second iteration of the model, shown in <html><a href="#Figure_10-6">Figure_10-6</a>&nbsp;</html>, when a trusted reporter flagged an item, it was hidden immediately. Trusted reporters had proven, over time, that their motivations were pure, their comprehension of community standards was good, and their word could be taken at face value.
Reports by users who had never previously reported an item, with unknown reputation, were all given equal weight, but it was significantly lower than reports by users with a positive history. In this model, individual unknown reporters had less influence on any one content item, but the votes of different individuals could accrue quickly. (At the same time, the individuals accrued their own reporting histories, so unknown reporters didn't stay unknown for long.)
Though you might think that “bad” reporters (those whose reports were later overturned on appeal) should have less say than unknown users, the model gave equal weight to reports from bad reporters and unknown reporters. See <html><a href="/doku.php?id=Chapter_6#Chap_6-Negative_Public_Karma">Chap_6-Negative_Public_Karma</a>&nbsp;</html>.
  * **Item Hidden (Moderation Model Feedback)**
    * The system sent an “item hidden” input message when the reputation process determined that a question or answer should be hidden, which indicated that all users who reported the content item agreed that it violated either the TOS or the community guidelines.
  * **Appeal Result: Upheld (Customer Care Input)**

** Mechanism and Diagram **
The designers transformed the overly simple “strikes”-based model to account for a user's abuse report history.
Goals: Decrease the time required to hide abusive content. Reduce the risk of inexperienced or bad actors hiding content inappropriately.

Solution: Add ''AbuseReporter'' karma to record the user's accuracy in hiding abusive content. Use ''AbuseReporter'' to give greater weight to reports by users with a history of accurate abuse reporting.
<html><a name="Figure_10-6"><center></html>// Figure_10-6: Iteration 2: A reporter's record of good and bad reports now influences the weight of his opinion on other content items. //<html></center></a></html>
<html><center><img width="70%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_10-6.png"/></center></html>
To accommodate the varying weight of abuse reports, the designers changed the calculation of ''ContentItemAbuse'' from strikes to a normalized value, where 0.0 represented no abuse information known and 1.0 represented the maximum abuse value. The evaluator now compared the ''ContentItemAbuse'' score to a normalized value representing the certainty required before hiding an item.
The designers added a new process to the model, “update report karma,” which maintained the ''AbuseReporter'' reputation claim, a normalized value, where 0.0 represented a user with no history of abuse reporting and 1.0 represented a user with a completely accurate abuse reporting history. A user with a perfect score of 1.0 could hide any item immediately.
The inputs that increased ''AbuseReporter'' were ''Item Hidden'' and ''Appeal Result: Upheld''. The input ''Abuse Result: Overturned'' had a disproportionately large negative effect on ''AbuseReporter'', providing an incentive for reporters not to use their power indiscriminately.
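The asymmetry between the rewards and the overturned-appeal penalty might be expressed as follows. The direction of each adjustment comes from the text; the specific constants are invented for illustration.

```python
# Illustrative sketch of the "update report karma" adjustments.
# Constants are assumptions; only their signs and relative sizes
# (overturned appeals hit disproportionately hard) are from the text.

REWARD_ITEM_HIDDEN = 0.05    # community consensus confirmed the report
REWARD_APPEAL_UPHELD = 0.10  # customer care confirmed the report
PENALTY_OVERTURNED = 0.50    # disproportionately large negative effect

def update_reporter_karma(karma, event):
    """Return the new AbuseReporter karma, normalized to 0.0..1.0."""
    if event == "item_hidden":
        karma += REWARD_ITEM_HIDDEN
    elif event == "appeal_upheld":
        karma += REWARD_APPEAL_UPHELD
    elif event == "appeal_overturned":
        karma -= PENALTY_OVERTURNED
    return min(1.0, max(0.0, karma))
```

With these numbers, one overturned appeal undoes ten confirmed reports, which is the kind of imbalance that discourages indiscriminate flagging.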
Unlike the first process, the new “update item abuse” process did not treat each input the same way. It read the reporter's ''AbuseReporter'' karma, added a small constant to ''ContentItemAbuse'' (so that users with no karma made at least a small contribution to the result), and capped the result at the maximum. If the result was 1.0, the system hid the item but, in addition to alerting the application, it also sent an “item hidden” message for each user who flagged the item as hidden. This message represented community consensus and, since the vast majority of hidden items would never be reviewed by customer care, was often the only opportunity the system had to reinforce the karma of those users. Very few appeals were anticipated, given that trolls were known to give bogus email addresses when registering (and therefore could never appeal). The incentives for both legitimate authors and good abuse reporters discouraged abuse of the community moderation model.
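A minimal sketch of the “update item abuse” process, assuming each report contributes the reporter's karma plus a small constant (the text confirms the small constant and the cap, but not the exact arithmetic or values):

```python
# Illustrative sketch of "update item abuse". SMALL_CONSTANT is an
# assumed value; the text only says it exists so zero-karma reporters
# still make a small contribution.

SMALL_CONSTANT = 0.2  # minimum contribution of any single report (assumed)

def update_item_abuse(content_item_abuse, reporter_karma):
    """Fold one abuse report into a normalized ContentItemAbuse score.

    Returns (new_score, should_hide). A reporter with perfect karma (1.0)
    pushes the score to the cap and hides the item immediately.
    """
    new_score = min(1.0, content_item_abuse + reporter_karma + SMALL_CONSTANT)
    return new_score, new_score >= 1.0
```

Note how the normalized form reproduces both behaviors described above: a trusted reporter hides an item in one report, while several unknown reporters must accrue before the cap is reached.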
The system sent “appeal results” messages asynchronously as part of the customer care application; the messages could come in at any time. After ''AbuseReporter'' was adjusted, the system did not attempt to update other ''ContentItemAbuse'' scores the reporter may have contributed to.

** Mechanism and Diagram **
In the third iteration of the model, the designers created several new reputation scores for questions and answers and a new user role with a karma-that of //author// of the flagged content. Those additions more than doubled the complexity compared to the previous iteration, as illustrated in <html><a href="#Figure_10-7">Figure_10-7</a>&nbsp;</html>. But if you consider each iteration as a separate reputation model (which is logical because each addition stands alone), each one is simple. By integrating separable small models, the combination made up a full-blown reputation system. For example, the karmas introduced by the new models-''QuestionAuthor'' karma, ''AnswerAuthor'' karma, and ''AbusiveContent'' karma-could find uses in contexts other than hiding abusive content.
<html><a name="Figure_10-7"><center></html>// Figure_10-7: Iteration 3: This improved iteration of the model now also accounts for the history of a content author. When users flag a question or answer, the system gives extra consideration to authors with a history of posting good content. //<html></center></a></html>
<html><center><img width="70%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_10-7.png"/></center></html>
In this iteration the designers added two new main karma tracks, represented by the parallel messaging tracks for question karma and answer karma. The calculations are so similar that we'll present the description only once, using “item” to represent either //answer// or //question//.
The system gave each item a quality reputation [''QuestionQuality | AnswerQuality''], which started as the average of the quality reputations of the previously contributed items [''AuthorAverageQuestionQuality | AuthorAverageAnswerQuality''] and a bit of the Junk Detector score. As either positive (stars, ratings, shares) or negative inputs (items hidden by customer care staff) changed the scores, the averages and karmas in turn were immediately affected. Each positive input was restricted by weights and limits; for example, only the first 10 users marking an item as a favorite were considered, and each could contribute a maximum of 0.5 to the final quality score. This meant that increasing the item quality reputation required many different types of positive inputs.
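The weight-and-limit idea can be sketched as a generic capped input channel. All constants below are illustrative, not the production values; only the pattern (seed from the author's average plus a bit of the junk score, then add capped positive channels) follows the description.

```python
# Hypothetical sketch of capped positive inputs: each channel counts
# only the first N events, and each event's contribution is clamped.

def capped_channel(events, max_events, max_per_event):
    """Sum one positive-input channel (e.g. favorites) with both caps applied."""
    total = 0.0
    for value in events[:max_events]:       # only the first N events count
        total += min(value, max_per_event)  # clamp each contribution
    return total

def item_quality(author_average, junk_detector, channels):
    """Seed quality from the author's history, then add capped channels."""
    quality = author_average + 0.1 * junk_detector  # "a bit of" junk score (assumed weight)
    for events, max_events, max_per_event in channels:
        quality += capped_channel(events, max_events, max_per_event)
    return min(1.0, quality)
```

Because every channel is individually capped, no single input type (favorites, stars, shares) can drive the quality score to its maximum alone.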
The system then combined the ''QuestionAuthor'' and ''AnswerAuthor'' karmas into ''ContentAuthor'' karma, using the best (the larger) of the two values. That approach reflected the insight of Yahoo! Answers staff that people who ask good questions are not the same as people who give good answers.
The designers once again changed the “Hide Content?” process, now comparing ''ContentItemAbuse'' to the new ''ContentAuthor'' karma to determine whether the content should be hidden. When an item was hidden, that information was sent as an input into a new process that updated the ''AbusiveContent'' karma.
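One plausible reading of the revised “Hide Content?” decision is below. The max-of-two-karmas combination is from the text; the way author karma raises the evidence threshold is an assumption, since the text only says the two scores were compared.

```python
# Sketch of the iteration-3 hide decision. BASE_THRESHOLD and the
# author-karma scaling are invented for illustration.

BASE_THRESHOLD = 0.6  # assumed certainty needed to hide an unknown author's item

def content_author_karma(question_author, answer_author):
    """Good askers aren't necessarily good answerers: take the best of the two."""
    return max(question_author, answer_author)

def should_hide(content_item_abuse, question_author, answer_author):
    """Compare accumulated abuse evidence against the author's standing."""
    author = content_author_karma(question_author, answer_author)
    # Higher author karma demands proportionally more abuse evidence (assumed).
    threshold = min(1.0, BASE_THRESHOLD + 0.4 * author)
    return content_item_abuse >= threshold
```

Under this reading, an established author's content survives a few reports that would instantly hide a newcomer's post.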
The new process for updating ''AbusiveContent'' karma also incorporated the inputs from customer care staff that were included in iteration 2-appeal results and content removals-which affected the karma either positively or negatively, as appropriate. Whenever an input entered that process, the system sent a message with the updated score to each of the processes for updating question and answer karma.

<html><a name="Figure_10-8"><center></html>// Figure_10-8: Final model: Eliminating the cold-start problem by giving good users an upfront advantage as abuse reporters. //<html></center></a></html>
<html><center><img width="70%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_10-8.png"/></center></html>
We detail each new process in <html><a href="#Figure_10-8">Figure_10-8</a>&nbsp;</html>below.

  * **Process: Calculate Suspicious Connection**
    * When a user took an action of value, such as asking a question, giving an answer, or evaluating content on the site, the application stored the user's connection information. If the user's IP address or browser cookie differed from the one used in a previous session, the application activated this process by sending it the IP- and/or browser-cookie-related inputs. The system updated the ''SuspectedAbuser'' karma using those values and the history of previous values for the user. Then it sent the value in a message to the “generate abuse reporter bootstrap” process.
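The process above might be sketched as follows. The heuristic here (suspicion rises with fresh IP/cookie pairs and relaxes when a known connection reappears) is entirely an assumption; the text says only that the karma was updated from the connection values and their history.

```python
# Illustrative sketch of "calculate suspicious connection".
# The specific scoring rules and constants are invented.

class SuspiciousConnection:
    def __init__(self):
        self.seen = {}   # user_id -> set of (ip, cookie) pairs already seen
        self.karma = {}  # user_id -> SuspectedAbuser karma, 0.0..1.0

    def observe(self, user_id, ip, cookie):
        """Update and return the user's SuspectedAbuser karma."""
        history = self.seen.setdefault(user_id, set())
        score = self.karma.get(user_id, 0.0)
        if (ip, cookie) in history:
            score = max(0.0, score - 0.05)     # familiar connection: relax
        else:
            history.add((ip, cookie))
            if len(history) > 1:
                score = min(1.0, score + 0.2)  # new connection: more suspicious
        self.karma[user_id] = score
        return score
```

The point of the sketch is the shape of the signal: connection churn is weak, circumstantial evidence, which is why the bootstrap later weighs it less heavily than confirmed violations.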
  * **Process: Calculate User Community Investment**

    * Selection of a best answer to a question-whether or not the user wrote the answer that was selected
    * The first time the user flags any item as abusive content
This process generated ''CommunityInvestment'' karma by accounting for the longevity of the user's participation in Yahoo! Answers and the age of the user's Yahoo! account, along with a simple participation value calculation (the user's level) and an approximation of answer quality-the best answer percentage. Each time this value was changed, the system sent the new value to the “generate abuse reporter bootstrap” process.
  * **Process: Calculate Content Abuser Karma**
    * The inputs and calculations for this process were the same as in the third iteration of the model-the process remained a repository for all confirmed and nonappealed user content violations. The only difference was that every time the system executed the process and updated ''AbusiveContent'' karma, it now sent an additional message to the “generate abuse reporter bootstrap” process.
  * **Process: Generate Abuse Reporter Bootstrap**
    * This process was the centerpiece of the final iteration of the model. The ''TrustBootstrap'' reputation represented the system's best guess at the reputation of users without a long history of transactions with the service. It was a weighted mixer process, taking positive input from ''CommunityInvestment'' karma and weighing that against two negative scores: the weaker score was the connection-based ''SuspectedAbuser'' karma, and the stronger score was the user-history-based ''AbusiveContent'' karma. Even though a high value for ''AbusiveContent'' karma implied a high level of certainty that a user would violate the rules, it made up only a share of the bootstrap and not all of it. The reason was that the context for the score was content quality, and the context of the bootstrap was reporter reliability: someone who is great at evaluating content might suck at creating it.
Each time the bootstrap process was updated, it was passed along to the final process in the model: “update abuse reporter karma.”
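The weighted mixer described above can be sketched as below. Only the relative ordering of the weights (connection-based suspicion weaker, confirmed violations stronger) comes from the text; the numbers themselves are assumptions.

```python
# Sketch of the "generate abuse reporter bootstrap" weighted mixer.
# Weights are invented; their relative ordering follows the text.

W_INVESTMENT = 1.0  # positive input: community investment
W_SUSPECTED = 0.3   # weaker negative: connection-based guesswork
W_ABUSIVE = 0.7     # stronger negative: confirmed content violations

def trust_bootstrap(community_investment, suspected_abuser, abusive_content):
    """Best-guess starting trust for a reporter without much history."""
    raw = (W_INVESTMENT * community_investment
           - W_SUSPECTED * suspected_abuser
           - W_ABUSIVE * abusive_content)
    return min(1.0, max(0.0, raw))
```

Because ''AbusiveContent'' carries less than full weight, a confirmed content violator is heavily discounted as a reporter but not automatically zeroed out.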
  * **Process: Update Confirmed Reporter Karma**
    * The input and calculations for this process were the same as in the second iteration of the model-the process updated ''ConfirmedReporter'' karma to reflect the accuracy of the user's abuse reports. The only difference was that the system now sent a message for each reporter to the “update abuse reporter karma” process, where the claim value was incorporated into the bootstrap reputation.
  * **Process: Update Abuse Reporter Karma**

=== Application Integration ===
The full model as shown in <html><a href="#Figure_10-8">Figure_10-8</a>&nbsp;</html>has dozens of possible inputs, and many different programmers managed the different sections of the application. The designers had to perform a comprehensive review of all of the pages to determine where the new “Report Abuse” buttons should appear. More important, the application had to account for a new internal database status-“hidden”-for every question and answer on every page that displayed content. Hiding an item had important side effects on the application: it had to adjust total counts and revoke points granted, and a policy had to be devised and followed for handling any answers (and associated points) attached to hidden questions.
Integrating the new model required entirely new flows on the site for reporting abuse and handling appeals. The appeals part of the model required that the application send email to users, functionality previously reserved for opt-in watch lists and marketing-related mailings; appeals mailings were neither. Last, the customer care management application would need to be altered.

=== Testing Is Harder Than You Think ===
Just as the design was iterative, so too were the implementation and testing. In <html><a href="/doku.php?id=Chapter_9#Chap_9-Testing">Chap_9-Testing</a>&nbsp;</html>, we suggested building and testing a model in pieces. The Yahoo! Answers team did just that, using constant values for the missing processes and inputs. The most important thing to get working was the basic input flow: when a user clicked “Report Abuse,” that action was tested against a threshold (initially a constant), and when it was exceeded, the reputation system sent a message back to the application to hide the item-effectively removing it from the site.
Once the basic input flow had been stabilized, the engineers added other features and connected additional inputs.

100 times the baseline<br/>
Saved >$990,000 per year<br/>
</td></tr></tbody></table></html>Every goal was shattered, and over time the results improved even further. As Yahoo! Answers product designer Micah Alpern put it: “Things got better because things were getting better!”
That phenomenon was perhaps best illustrated by another unexpected result about a month after the full system was deployed: both the number of abuse reports and requests for appeal dropped drastically over a few weeks. At first the team wondered if something was broken-but it didn't appear so, since a recent quality audit of the service showed that overall quality was still on the rise. User abuse reports resulted in hiding hundreds of items each day, but the total appeals dropped to a single-digit number, usually just 1 or 2, per day. What had happened?

  * Explain what abuse means in your application.
-In the case of Yahoo! Answers, content must obey two different sets of rules: the terms of service and the community guidelines. Clearly describing each category and teaching the community what is (and isn't) reportable is critical to getting users to succeed as reporters as well as content creators.+In the case of Yahoo! Answers, content must obey two different sets of rules: the terms of service and the community guidelines. Clearly describing each category and teaching the community what is (and isn't) reportable is critical to getting users to succeed as reporters as well as content creators. (<html><a href="#Figure_10-9">Figure_10-9</a>&nbsp;</html>)
<html><a name="Figure_10-9"><center></html>// Figure 10-9: Reporting abuse: distinguish the terms of service from the community guidelines. //<html></center></a></html>
<html><center><img width="70%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_10-9.png"/></center></html>
  * Explain the reputation effects of an abuse report.
Abuse reporter reputation was not displayed; reporters didn't even know their own reputation score. But active users knew the effects of having a good abuse reporter reputation: most content that they reported was hidden instantly. What they didn't understand was which specific actions would increase or decrease it. As shown in <html><a href="#Figure_10-10">Figure 10-10</a></html>, the Yahoo! Answers site clearly explained that it rewarded the accuracy of reports, not their volume. That was an important distinction, because Yahoo! Answers points (and levels) were based mostly on participation karma, where doing more things earns more karma. Active users understood that relationship. The new abuse reporter karma didn't work that way; in fact, reporting abuse was one of the few actions a user could take on the site that //didn't// generate Yahoo! Answers points.
<html><a name="Figure_10-10"><center></html>// Figure 10-10: Reporting abuse: explain reputation effects to abuse reporters. //<html></center></a></html>
<html><center><img width="70%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_10-10.png"/></center></html>
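The accuracy-over-volume principle described above can be sketched in code. This is a minimal illustrative model, not Yahoo!'s actual formula: the class name, the smoothing prior, and the ''HIDE_THRESHOLD'' value are all assumptions chosen to show the shape of the idea, namely that filing more reports never raises a reporter's karma unless those reports are confirmed.

```python
# Hypothetical sketch of an accuracy-based abuse-reporter karma model.
# All names and thresholds are illustrative assumptions, not the real system.

class ReporterKarma:
    """Tracks a reporter's accuracy: confirmed reports vs. total reports filed."""

    HIDE_THRESHOLD = 0.90   # assumed cutoff: reports above this hide content instantly
    PRIOR_GOOD = 1          # Laplace-style smoothing so brand-new reporters
    PRIOR_TOTAL = 2         # start at a neutral 0.5 rather than 0 or 1

    def __init__(self):
        self.confirmed = 0   # reports later upheld (e.g., by staff review)
        self.total = 0       # every report filed

    def record_report(self, upheld: bool) -> None:
        self.total += 1
        if upheld:
            self.confirmed += 1

    @property
    def score(self) -> float:
        # Accuracy, not volume: more reports only help if they are confirmed.
        return (self.confirmed + self.PRIOR_GOOD) / (self.total + self.PRIOR_TOTAL)

    def hides_instantly(self) -> bool:
        return self.score >= self.HIDE_THRESHOLD


# A reporter with a long accurate history earns instant hides;
# a high-volume but only half-accurate reporter does not.
accurate = ReporterKarma()
for _ in range(50):
    accurate.record_report(upheld=True)

noisy = ReporterKarma()
for i in range(50):
    noisy.record_report(upheld=(i % 2 == 0))

print(accurate.hides_instantly())  # True  (score ~0.98)
print(noisy.hides_instantly())     # False (score 0.50)
```

Note how this mirrors the incentive described above: under a pure participation-karma scheme, the "noisy" reporter's 50 reports would earn as much as the accurate reporter's 50, which is exactly the relationship the abuse reporter karma broke.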
chapter_10.txt · Last modified: 2009/12/01 14:53 by randy