This shows you the differences between two versions of the page.
chapter_3 [2009/11/01 17:05] randy created |
chapter_3 [2023/03/12 12:11] (current) |
||
---|---|---|---|
Line 1: | Line 1: | ||
===== Building Blocks and Reputation Tips ===== | ===== Building Blocks and Reputation Tips ===== | ||
- | By now you should feel fairly conversant in the lingua franca of reputation systems, and you've had some exposure to their constituent bits and pieces. We've gone over reputation statements, messages, and processes, and you've even become familiar some fairly rudimentary but serviceable models. | + | By now you should feel fairly conversant in the //lingua franca// (the graphical grammar presented in <html><a href="/doku.php?id=Chapter_2">Chapter_2</a> </html>) of reputation systems, and you've had some exposure to their constituent bits and pieces. We've gone over reputation statements, messages, and processes, and you've become familiar with some rudimentary-but-serviceable models. |
- | In this chapter, we'll level upand explore reputation claims in greater detail. We'll describe a taxonomy of claim types, exploring reputation roll-ups: actual computations on incoming messages that effect a particular output. Functionally, different types of roll-ups yield very different types of reputations, so we'll offer guidance on when to use which roll-ups. We'll end the chapter with practical advice in a section of practitioner's tricks. | + | In this chapter, we'll “level up” and explore reputation claims in greater detail. We'll describe a taxonomy of claim types, exploring reputation roll-ups: actual computations on incoming messages that effect a particular output. Functionally, different types of roll-ups yield very different types of reputations, so we'll offer guidance on when to use which roll-ups. We'll end the chapter with practical advice in a section of “practitioner's tricks.” |
Line 10: | Line 10: | ||
=== The Data: Claim Types === | === The Data: Claim Types === | ||
- | Remember, a fundamental component of a reputation statement is the //claim//-the assertion of quality that a //source// makes about a //target//. In <html><a href="/doku.php?id=Chapter_1#Chap_1-The_Reputation_Statement">Chap_1-The_Reputation_Statement</a> </html>, we discussed how claims can be either explicit (a direct statement of quality, intended by the statement's source to act as such) or implicit (an activity related to an object, from which we infer the source's interest in the target object). These fundamentally different approaches are important because the //combination// of implicit and explicit claims can yield some very nuanced and robust reputation models. In other words: We should pay attention to what people say, but we should give equal weight to what they //do// to determine which entities hold the community's interest. | + | Remember, a fundamental component of a reputation statement is the //claim//-the assertion of quality that a //source// makes about a //target//. In <html><a href="/doku.php?id=Chapter_1#Chap_1-The_Reputation_Statement">Chap_1-The_Reputation_Statement</a> </html>, we discussed how claims can be either explicit (a direct statement of quality, intended by the statement's source to act as such) or implicit (representing a source-user's concrete actions associated with a target entity). These fundamentally different approaches are important because the //combination// of implicit and explicit claims can yield some very nuanced and robust reputation models. In other words: We should pay attention to what people say, but we should give equal weight to what they //do// to determine which entities hold the community's interest. |
Claims also can be of different //types//. One helpful distinction is between qualitative claims (claims that describe one or more qualities which may or may not be easily //measured//) and quantitative claims-claims that can be measured (and, in fact, are largely generated, communicated and read back as //numbers// of some kind.) | Claims also can be of different //types//. One helpful distinction is between qualitative claims (claims that describe one or more qualities which may or may not be easily //measured//) and quantitative claims-claims that can be measured (and, in fact, are largely generated, communicated and read back as //numbers// of some kind.) | ||
- | Reputation statements have claim values, which you can generally think of as "what you get back when you ask for a reputation's current state." So, for instance, we can always query the system for Movies.Review.Overall.Average and get back a normalized score within the range of 0-1. | + | Reputation statements have claim values, which you can generally think of as “what you get back when you ask for a reputation's current state.” So, for instance, we can always query the system for Movies.Review.Overall.Average and get back a normalized score within the range of 0-1. |
Note that the format of a claim does not always map exactly to the format in which you may wish to display (or, for that matter, gather) that claim. It's more likely that you'd want to translate the Movies.Review.Overall.Average to show your users 3 colored stars (out of 5) instead of a normalized score or percentage. | Note that the format of a claim does not always map exactly to the format in which you may wish to display (or, for that matter, gather) that claim. It's more likely that you'd want to translate the Movies.Review.Overall.Average to show your users 3 colored stars (out of 5) instead of a normalized score or percentage. | ||
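As a rough illustration of that translation step, here is a minimal sketch that maps a normalized claim value onto a whole number of stars. The function name `to_stars` and the round-half-up behavior are our own assumptions for the example, not part of any particular reputation framework.

```python
def to_stars(normalized_score: float, max_stars: int = 5) -> int:
    """Translate a normalized claim value (0.0-1.0) into whole stars for display.

    Hypothetical helper: the name and the rounding rule are assumptions.
    """
    if not 0.0 <= normalized_score <= 1.0:
        raise ValueError("normalized score must be within the range 0-1")
    # Round half up, so a score exactly midway between two star counts
    # displays the higher count.
    return int(normalized_score * max_stars + 0.5)


# A stored Movies.Review.Overall.Average of 0.6 would display as 3 of 5 stars.
print(to_stars(0.6))
```

The stored claim stays normalized; only the display layer knows about stars, so the same value could just as easily be shown as a percentage.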
Line 20: | Line 20: | ||
== Qualitative Claim Types == | == Qualitative Claim Types == | ||
- | Qualitative claims attempt to describe some quality of a reputable object. This quality may be as general as the object's overall "quality" ("This is an excellent restaurant!") or as specific as some particular dimension or aspect of the entity. ("The cinematography was stunning!") Generally, qualitative claim types are //fuzzier// than hard quantitative claims, so qualitative claims quite often end up being useful implicit claims. | + | Qualitative claims attempt to describe some quality of a reputable object. This quality may be as general as the object's overall “quality” (“This is an excellent restaurant!” ) or as specific as some particular dimension or aspect of the entity. (“The cinematography was stunning!” ) Generally, qualitative claim types are //fuzzier// than hard quantitative claims, so qualitative claims quite often end up being useful implicit claims. |
- | This is not to say, however, that qualitative claims can't have a qualitative value when considered //en masse//: almost any claim type can at least be counted and displayed in the form of some simple cumulative score (or "aggregator"-we discuss the various reputation roll-ups below in <html><a href="/doku.php?id=Chapter_3#Chap_3-Roll-Ups">Chap_3-Roll-Ups</a> </html>.) So while we can't necessarily assign an evaluative score to a user-contributed text comment, for instance (at least not without the rest of the community involved), it's quite common on the Web to see a //count// of the number of comments left about an entity, as a crude indicator of that item's popularity or the level of interest it draws. | + | This is not to say, however, that qualitative claims can't have a quantitative value when considered //en masse//: almost any claim type can at least be counted and displayed in the form of some simple cumulative score (or “aggregator” -we discuss the various reputation roll-ups below in <html><a href="/doku.php?id=Chapter_3#Chap_3-Roll-Ups">Chap_3-Roll-Ups</a> </html>.) So while we can't necessarily assign an evaluative score to a user-contributed text comment, for instance (at least not without the rest of the community involved), it's quite common on the Web to see a //count// of the number of comments left about an entity, as a crude indicator of that item's popularity or the level of interest it draws. |
Here are some common types of qualitative claims. | Here are some common types of qualitative claims. | ||
Line 33: | Line 33: | ||
Text comment fields typically are provided as a freeform means of expression: a little white box that users can fill in any way they choose. However, better social sites will attempt to direct comments by providing guidelines or suggestions on what may be considered on- or off-topic. | Text comment fields typically are provided as a freeform means of expression: a little white box that users can fill in any way they choose. However, better social sites will attempt to direct comments by providing guidelines or suggestions on what may be considered on- or off-topic. | ||
- | Users' comments are usually freeform (unstructured) textual data. They typically are character-constrained in some way, though the constraints vary depending on the context: the character allowance for a message board posting is generally much greater than Twitter's famous 140-character limit. | + | Users' comments are usually freeform (unstructured) textual data. They typically are character-constrained in some way; however, the constraints vary depending on the context: the character allowance for a message board posting is generally much greater than Twitter's famous 140-character limit. |
In comment fields, you can choose whether to accommodate rich-text entry and display, and you can apply certain content filters to comments up front (for instance, you can choose to prohibit profanity or disallow fully formed URLs). | In comment fields, you can choose whether to accommodate rich-text entry and display, and you can apply certain content filters to comments up front (for instance, you can choose to prohibit profanity or disallow fully formed URLs). | ||
Line 39: | Line 39: | ||
Comments are often just one component of a larger compound reputation statement. Movie reviews, for instance, typically are a combination of 5-star quantitative claims (and perhaps different ones for particular aspects of the film) and one or more freeform comment-type claims. | Comments are often just one component of a larger compound reputation statement. Movie reviews, for instance, typically are a combination of 5-star quantitative claims (and perhaps different ones for particular aspects of the film) and one or more freeform comment-type claims. | ||
- | Comments are powerful reputation claims when interpreted by humans, but they may not be easy for automated systems to evaluate. The best way to evaluate text comments varies depending on the context. If a comment is just one component of a user review, the comment can contribute to a "completeness" score for that review: reviews with comments are deemed more complete than those without (and, in fact, the comment field may be required for the review to be accepted at all). | + | Comments are powerful reputation claims when interpreted by humans, but they may not be easy for automated systems to evaluate. The best way to evaluate text comments varies depending on the context. If a comment is just one component of a user review, the comment can contribute to a “completeness” score for that review: reviews with comments are deemed more complete than those without (and, in fact, the comment field may be required for the review to be accepted at all). |
If the comments in your system are directed at another contributor's content (for example, user comments about a photo album or message board replies to a thread), consider evaluating comments as a measure of interest or activity around that reputable entity. | If the comments in your system are directed at another contributor's content (for example, user comments about a photo album or message board replies to a thread), consider evaluating comments as a measure of interest or activity around that reputable entity. | ||
Line 49: | Line 49: | ||
<note tip>In our research at Yahoo! we often probed notions of authenticity to look at how readers interpret the veracity of a claim or evaluate the authority or competence of a claimant. | <note tip>In our research at Yahoo! we often probed notions of authenticity to look at how readers interpret the veracity of a claim or evaluate the authority or competence of a claimant. | ||
- | We wanted to know: when people read reviews online (or blog entries, or tweets), what are the specific cues that make them more likely to accept what they're reading as accurate? Is there something about the presentation of material that makes it more trustworthy? Or is it the way the content author is presented? (Does an "expert" badge convince anyone?) | + | We wanted to know: when people read reviews online (or blog entries, or tweets), what are the specific cues that make them more likely to accept what they're reading as accurate? Is there something about the presentation of material that makes it more trustworthy? Or is it the way the content author is presented? (Does an “expert” badge convince anyone?) |
- | Overwhelmingly, time and again, we found that it's the content itself-the review, entry, or comment being evaluated-that makes readers' minds up. If an argument is well stated, if it seems reasonable, and if readers can agree with some aspect of it, then readers are more likely to trust the content-no matter what meta-embellishment or framing it's given. | + | Time and again, we found that it's the content itself-the review, entry, or comment being evaluated-that makes readers' minds up. If an argument is well stated, if it seems reasonable, and if readers can agree with some aspect of it, then readers are more likely to trust the content-no matter what meta-embellishment or framing it's given. |
Conversely, research shows that users don't see poorly written reviews with typos or shoddy logic as coming from legitimate or trustworthy sources. People really do pay attention to content. | Conversely, research shows that users don't see poorly written reviews with typos or shoddy logic as coming from legitimate or trustworthy sources. People really do pay attention to content. | ||
Line 59: | Line 59: | ||
** Media Uploads ** | ** Media Uploads ** | ||
- | Reputation value can be derived from other types of qualitative claim types besides just freeform textual data. Any time a user uploads media-either in response to another piece of content, or as a subcomponent of the primary contribution itself-that activity is worth noting as a claim type. | + | Reputation value can be derived from other types of qualitative claim types besides just freeform textual data. Any time a user uploads media-either in response to another piece of content (see <html><a href="#Figure_3-1">Figure_3-1</a> </html>), or as a subcomponent of the primary contribution itself-that activity is worth noting as a claim type. |
We distinguish textual claims from other media for two reasons: | We distinguish textual claims from other media for two reasons: | ||
- | - While text comments typically are entered in context (users type them right into the browser as they interact with your site), media uploads usually require a slightly deeper level of commitment and planning on the user's part. For example, a user might need to use an external device of some kind and edit the media in some way before uploading it. Therefore . . . | + | - While text comments typically are entered in context (users type them right into the browser as they interact with your site), media uploads usually require a slightly deeper level of commitment and planning on the user's part. For example, a user might need to use an external device of some kind and edit the media in some way before uploading it. Therefore… |
- You may want to weight these types of contributions differently from text comments (or not, depending on the context) to reflect their increased contribution value. | - You may want to weight these types of contributions differently from text comments (or not, depending on the context) to reflect their increased contribution value. | ||
- | <html><a name="Figure_3-1"><center></html>// Figure_3-1: YouTube video responses contribute to interest reputation. //<html></center></a></html> | + | <html><a name="Figure_3-1"><center></html>// Figure_3-1: Video Responses to a YouTube Video may boost its interest reputation. //<html></center></a></html> |
- | <html><center><img width="70%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Ch04-MediaUploads.jpg"/></center></html> | + | <html><center><img width="70%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_3-1.png"/></center></html> |
A media upload is a qualitative claim type that is not textual in nature. | A media upload is a qualitative claim type that is not textual in nature. | ||
Line 97: | Line 97: | ||
* Some shopping review sites encourage cross-linking to other products or offsite resources as an indicator of review completeness: cross-linking demonstrates that the review author has done her homework and fully considered all options. | * Some shopping review sites encourage cross-linking to other products or offsite resources as an indicator of review completeness: cross-linking demonstrates that the review author has done her homework and fully considered all options. | ||
- | * On blogs, the trackback feature originally had some value as an externally verifiable indicator of a post's quality or interest level. (Sadly, though, trackbacks have been a highly gamed spam mechanism for years.) | + | * On blogs, the trackback feature originally had some value as an externally verifiable indicator of a post's quality or interest level. (Sadly, however, trackbacks have been a highly gamed spam mechanism for years.) |
== Quantitative Claim Types == | == Quantitative Claim Types == | ||
Line 109: | Line 109: | ||
One strength of normalized values is their general flexibility. They are the easiest of all quantitative types to perform math operations on; they are the only quantitative claim type that is finitely bounded; and they allow reputation inputs gathered in a number of different formats to be normalized with ease (and then de-normalized back to a display-specific form suitable for the context you want to display in). | One strength of normalized values is their general flexibility. They are the easiest of all quantitative types to perform math operations on; they are the only quantitative claim type that is finitely bounded; and they allow reputation inputs gathered in a number of different formats to be normalized with ease (and then de-normalized back to a display-specific form suitable for the context you want to display in). | ||
- | Another strength of normalized value is the general utility of the format: normalizing data is the only way to perform cross-object and cross-reputation comparisons with any certainty. (Do you want your application to display "5-star restaurants" alongside "4-star hotels"? If so, you'd better normalize those scores somewhere.) | + | Another strength of normalized value is the general utility of the format: normalizing data is the only way to perform cross-object and cross-reputation comparisons with any certainty. (Do you want your application to display “5-star restaurants” alongside “4-star hotels” ? If so, you'd better normalize those scores somewhere.) |
Normalized values are also highly readable: because the bounds of a normalized score are already known, they are very easy (for you, the system architect, or others with access to the data) to read at a glance. With normalized scores, you do not need to understand the context of a score to be able to understand its value as a claim. Very little interpretation is needed. | Normalized values are also highly readable: because the bounds of a normalized score are already known, they are very easy (for you, the system architect, or others with access to the data) to read at a glance. With normalized scores, you do not need to understand the context of a score to be able to understand its value as a claim. Very little interpretation is needed. | ||
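To make the normalize and de-normalize round trip concrete, here is a small sketch using linear scaling. The function names and the choice of scale bounds (for example, treating a star scale as running from 1 to 5) are assumptions made for illustration; your own inputs may call for different bounds or a nonlinear mapping.

```python
def normalize(value: float, low: float, high: float) -> float:
    """Map a rating gathered on the scale [low, high] onto the range 0-1."""
    if high <= low:
        raise ValueError("high must be greater than low")
    if not low <= value <= high:
        raise ValueError("value lies outside the input scale")
    return (value - low) / (high - low)


def denormalize(score: float, low: float, high: float) -> float:
    """Map a normalized score back onto a display scale [low, high]."""
    return low + score * (high - low)


# Assumed scales: restaurants rated 1-5 stars, hotels rated 1-4 stars.
restaurant = normalize(5, 1, 5)   # top of a 5-star scale
hotel = normalize(4, 1, 4)        # top of a 4-star scale
print(restaurant == hotel)        # both normalize to 1.0, so they compare equal
```

Once both ratings live in the same 0-1 range, comparing or combining them no longer depends on how each one was originally gathered.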
Line 116: | Line 116: | ||
** Rank Value ** | ** Rank Value ** | ||
- | A rank value is a unique positive integer. A set of rank values is limited to the number of targets in a bounded set of targets. For example, given a data set of "100 Movies from the Summer of 2009," it is possible to have a ranked list in which each movie has exactly one value. | + | A rank value is a unique positive integer. A set of rank values is limited to the number of targets in a bounded set of targets. For example, given a data set of “100 Movies from the Summer of 2009,” it is possible to have a ranked list in which each movie has exactly one value. |
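As a sketch of how rank values might be derived, the following assigns each target in a bounded set exactly one unique rank from an existing score. The function name `rank_targets` and the tie-breaking rule (alphabetical by target name) are assumptions; the essential property is only that no two targets share a rank.

```python
def rank_targets(scores: dict[str, float]) -> dict[str, int]:
    """Assign each target a unique positive-integer rank, 1 being the best.

    Hypothetical helper: ties are broken by target name so ranks stay unique.
    """
    ordered = sorted(scores, key=lambda target: (-scores[target], target))
    return {target: rank for rank, target in enumerate(ordered, start=1)}


# Three targets yield exactly the ranks 1, 2, and 3, with no rank repeated.
print(rank_targets({"Movie A": 0.9, "Movie B": 0.4, "Movie C": 0.9}))
```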
Here are some examples of uses for rank values: | Here are some examples of uses for rank values: | ||
Line 128: | Line 128: | ||
When you think of scalar rating systems, we'd be surprised if-in your mind-you're not seeing stars. Rating systems of 3, 4, and 5 stars abound on the Web and have achieved a level of semipermanence in reputation systems. Perhaps that's because of the ease with which users can engage with star ratings-choosing a number of stars is a nice way to express an opinion beyond simple like or dislike. | When you think of scalar rating systems, we'd be surprised if-in your mind-you're not seeing stars. Rating systems of 3, 4, and 5 stars abound on the Web and have achieved a level of semipermanence in reputation systems. Perhaps that's because of the ease with which users can engage with star ratings-choosing a number of stars is a nice way to express an opinion beyond simple like or dislike. | ||
- | More generally, a scalar value is a type of reputation claim in which a user gives an entity a "grade" somewhere along a bounded spectrum. The spectrum may be finely delineated and allow for many gradations of opinion (10-star ratings are not unheard of), or it may be binary (for example, thumbs-up, thumbs-down.) | + | More generally, a scalar value is a type of reputation claim in which a user gives an entity a “grade” somewhere along a bounded spectrum. The spectrum may be finely delineated and allow for many gradations of opinion (10-star ratings are not unheard of), or it may be binary (for example, thumbs-up, thumbs-down.) |
* Star ratings (3-, 4-, and 5-star scales are common) | * Star ratings (3-, 4-, and 5-star scales are common) | ||
* Letter grade (A, B, C, D, F) | * Letter grade (A, B, C, D, F) | ||
- | * Novelty-type themes ("4 out of 5 cupcakes") | + | * Novelty-type themes (“4 out of 5 cupcakes” ) |
Yahoo! Movies features letter grades for reviews. The overall grades are calculated using a combination of professional reviewers' scores (which are transformed from a whole host of different claim types, from the //New York Times// letter-grade style to the classic Siskel and Ebert thumbs-up, thumbs-down format) and Yahoo! user reviews, which are gathered on a 5-star system. | Yahoo! Movies features letter grades for reviews. The overall grades are calculated using a combination of professional reviewers' scores (which are transformed from a whole host of different claim types, from the //New York Times// letter-grade style to the classic Siskel and Ebert thumbs-up, thumbs-down format) and Yahoo! user reviews, which are gathered on a 5-star system. | ||
Line 150: | Line 150: | ||
** Simple Counter ** | ** Simple Counter ** | ||
- | A simple counter roll-up adds one to a stored numeric claim representing all the times that the process received any input. | + | A simple counter roll-up, <html><a href="#Figure_3-2">Figure_3-2</a> </html>, adds one to a stored numeric claim representing all the times that the process received any input. |
- | <html><a name="Figure_3-2"><center></html>// Figure_3-2: A Simple Counter process just counts inputs //<html></center></a></html> | + | <html><a name="Figure_3-2"><center></html>// Figure_3-2: A Simple Counter process does just what you'd expect-as inputs come in, it counts them and stores the result. //<html></center></a></html> |
- | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Ch04-SimpleCounter.png"/></center></html> | + | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_3-2.png"/></center></html> |
A simple counter roll-up ignores any supplied claim value. Once it receives the input message, it reads (or creates) and adds one to the ''CountOfInputs'' , which is stored as the claim value for this process. | A simple counter roll-up ignores any supplied claim value. Once it receives the input message, it reads (or creates) and adds one to the ''CountOfInputs'' , which is stored as the claim value for this process. | ||
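A minimal sketch of that behavior follows. The class and method names are hypothetical; only the `CountOfInputs` claim (here `count_of_inputs`) comes from the description above.

```python
class SimpleCounter:
    """Counts every input message, ignoring any claim value it carries."""

    def __init__(self) -> None:
        # Created on first use in a real store; here it simply starts at zero.
        self.count_of_inputs = 0

    def receive(self, claim_value: object = None) -> int:
        # The supplied claim value is deliberately unused.
        self.count_of_inputs += 1
        return self.count_of_inputs


counter = SimpleCounter()
counter.receive("any value")
counter.receive()
print(counter.count_of_inputs)  # two inputs received
```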
Line 170: | Line 170: | ||
Like a simple counter roll-up, a reversible counter roll-up ignores any supplied claim value. Once it receives the input message, it either adds one to or subtracts one from a stored numeric claim, depending on whether there is already a stored claim for this source and target. | Like a simple counter roll-up, a reversible counter roll-up ignores any supplied claim value. Once it receives the input message, it either adds one to or subtracts one from a stored numeric claim, depending on whether there is already a stored claim for this source and target. | ||
- | Reversible counters are useful when there is a high probability of abuse (because of commercial incentive benefits such as contests - See <html><a href="/doku.php?id=Chapter_5#Chap_5-Commercial_Incentives">Chap_5-Commercial_Incentives</a> </html>) or when you anticipate the need to rescind inputs by users or the application for other reasons. | + | Reversible counters, <html><a href="#Figure_3-3">Figure_3-3</a> </html>, are useful when there is a high probability of abuse (perhaps because of commercial incentive benefits such as contests-see <html><a href="/doku.php?id=Chapter_5#Chap_5-Commercial_Incentives">Chap_5-Commercial_Incentives</a> </html>) or when you anticipate the need to rescind inputs by users or the application for other reasons. |
- | <html><a name="Figure_3-3"><center></html>// Figure_3-3: A Reversible Counter process remembers inputs so they me be undone //<html></center></a></html> | + | <html><a name="Figure_3-3"><center></html>// Figure_3-3: A Reversible Counter also counts incoming inputs, but it remembers them too, so that they (and their effects) may be undone later. Trust us, this can be very useful. //<html></center></a></html> |
- | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Ch04-ReversibleCounter.png"/></center></html> | + | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_3-3.png"/></center></html> |
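One way to sketch a reversible counter for a single target is shown below. The names are hypothetical, and an in-memory set stands in for the stored per-source reputation statements that a real system would keep in a database.

```python
class ReversibleCounter:
    """Counts inputs for one target and remembers each source so it can be undone."""

    def __init__(self) -> None:
        self.count_of_inputs = 0
        self._sources: set[str] = set()  # stand-in for stored reputation statements

    def receive(self, source: str) -> int:
        if source in self._sources:
            # A claim already exists for this source and target: reverse it.
            self._sources.remove(source)
            self.count_of_inputs -= 1
        else:
            # No prior claim: remember the source and count the input.
            self._sources.add(source)
            self.count_of_inputs += 1
        return self.count_of_inputs


votes = ReversibleCounter()
votes.receive("alice")
votes.receive("bob")
votes.receive("alice")        # alice rescinds her earlier input
print(votes.count_of_inputs)  # only bob's input remains
```

The cost of reversibility is visible even in this toy: the process must hold one record per source, not just a single number.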
Here are pros and cons of using a reversible counter roll-up: | Here are pros and cons of using a reversible counter roll-up: | ||
Line 189: | Line 189: | ||
** Simple Accumulator ** | ** Simple Accumulator ** | ||
- | A simple accumulator roll-up adds a single numeric input value to a running sum that is stored in a reputation statement. | + | A simple accumulator roll-up, <html><a href="#Figure_3-4">Figure_3-4</a> </html>, adds a single numeric input value to a running sum that is stored in a reputation statement. |
- | <html><a name="Figure_3-4"><center></html>// Figure_3-4: A Simple Accumulator process adds arbitrary amounts //<html></center></a></html> | + | <html><a name="Figure_3-4"><center></html>// Figure_3-4: A Simple Accumulator process adds arbitrary amounts and stores the sum. //<html></center></a></html> |
- | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Ch04-SimpleAccumulator.png"/></center></html> | + | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_3-4.png"/></center></html> |
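In code, a simple accumulator can be as small as the following sketch. The class and method names are assumptions chosen for the example.

```python
class SimpleAccumulator:
    """Adds each numeric input value to a running sum."""

    def __init__(self) -> None:
        self.sum = 0.0

    def receive(self, value: float) -> float:
        # Unlike a counter, the size of the input matters.
        self.sum += value
        return self.sum


points = SimpleAccumulator()
points.receive(10)
points.receive(2.5)
print(points.sum)  # running total of both inputs
```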
Here are pros and cons of using a simple accumulator roll-up: | Here are pros and cons of using a simple accumulator roll-up: | ||
Line 206: | Line 206: | ||
** Reversible Accumulator ** | ** Reversible Accumulator ** | ||
- | A reversible accumulator roll-up either (1) stores and adds a new input value to a running sum or (2) undoes the effects of a previous addition. Consider using a <html><a href="/doku.php?id=Chapter_3#Chap_3-Reversible_Accumulator">Chap_3-Reversible_Accumulator</a> </html>if you would otherwise use a <html><a href="/doku.php?id=Chapter_3#Chap_3-Simple_Accumulator">Chap_3-Simple_Accumulator</a> </html>but you want the option either to review how individual sources are contributing to the ''Sum'' or to be able to undo the effects of buggy software or abusive use. However, if you expect a very large amount of traffic, you may want to stick with a <html><a href="/doku.php?id=Chapter_3#Chap_3-Simple_Accumulator">Chap_3-Simple_Accumulator</a> </html>: storing a reputation statement for every contribution can be prohibitively database intensive if traffic is high. | + | A reversible accumulator roll-up, <html><a href="#Figure_3-5">Figure_3-5</a> </html>, either (1) stores and adds a new input value to a running sum or (2) undoes the effects of a previous addition. Consider using a <html><a href="/doku.php?id=Chapter_3#Chap_3-Reversible_Accumulator">Chap_3-Reversible_Accumulator</a> </html>if you would otherwise use a <html><a href="/doku.php?id=Chapter_3#Chap_3-Simple_Accumulator">Chap_3-Simple_Accumulator</a> </html>but you want the option either to review how individual sources are contributing to the ''Sum'' or to be able to undo the effects of buggy software or abusive use. However, if you expect a very large amount of traffic, you may want to stick with a <html><a href="/doku.php?id=Chapter_3#Chap_3-Simple_Accumulator">Chap_3-Simple_Accumulator</a> </html>: storing a reputation statement for every contribution can be prohibitively database intensive if traffic is high. |
- | <html><a name="Figure_3-5"><center></html>// Figure_3-5: A Reversible Accumulator process remembers inputs so they me be undone //<html></center></a></html> | + | <html><a name="Figure_3-5"><center></html>// Figure_3-5: A Reversible Accumulator process improves on the Simple model-it remembers inputs so they may be undone. //<html></center></a></html> |
- | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Ch04-ReversibleAccumulator.png"/></center></html> | + | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_3-5.png"/></center></html> |
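The sketch below illustrates the two operations of a reversible accumulator. The names `add` and `undo` are hypothetical, and the dictionary stands in for the per-contribution reputation statements whose storage cost is the trade-off discussed above.

```python
class ReversibleAccumulator:
    """Keeps a running sum plus each source's contribution so it can be undone."""

    def __init__(self) -> None:
        self.sum = 0.0
        self._statements: dict[str, float] = {}  # source -> stored input value

    def add(self, source: str, value: float) -> float:
        if source in self._statements:
            raise ValueError("source already has a stored contribution")
        self._statements[source] = value
        self.sum += value
        return self.sum

    def undo(self, source: str) -> float:
        # Remove the stored statement and subtract its effect from the sum.
        value = self._statements.pop(source)
        self.sum -= value
        return self.sum

    def contribution_of(self, source: str) -> float:
        """Review how an individual source is contributing to the sum."""
        return self._statements[source]


karma = ReversibleAccumulator()
karma.add("alice", 5)
karma.add("spammer", 100)
karma.undo("spammer")  # roll back an abusive contribution
print(karma.sum)
```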
Here are pros and cons of using a reversible accumulator roll-up: | Here are pros and cons of using a reversible accumulator roll-up: | ||
Line 223: | Line 223: | ||
** Simple Average ** | ** Simple Average ** | ||
- | A simple average roll-up calculates and stores a running average including new input. | + | A simple average roll-up, <html><a href="#Figure_3-6">Figure_3-6</a> </html>, calculates and stores a running average including new input. |
- | <html><a name="Figure_3-6"><center></html>// Figure_3-6: A Simple Average process keeps a running total and count for incremental calculations //<html></center></a></html> | + | <html><a name="Figure_3-6"><center></html>// Figure_3-6: A Simple Average process keeps a running total and count for incremental calculations. //<html></center></a></html> |
- | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Ch04-SimpleAverage.png"/></center></html> | + | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_3-6.png"/></center></html> |
The simple average roll-up is probably the most common reputation score basis. It calculates the mathematical mean of the history of inputs. Its components are a ''SumOfInputs'' , a ''CountOfInputs'' , and the process claim value: ''AvgOfInputs'' . | The simple average roll-up is probably the most common reputation score basis. It calculates the mathematical mean of the history of inputs. Its components are a ''SumOfInputs'' , a ''CountOfInputs'' , and the process claim value: ''AvgOfInputs'' . | ||
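Those three components translate directly into a small sketch. The class and method names are hypothetical; the attributes mirror ''SumOfInputs'', ''CountOfInputs'', and ''AvgOfInputs''.

```python
from typing import Optional


class SimpleAverage:
    """Keeps a running sum and count so the mean can be updated incrementally."""

    def __init__(self) -> None:
        self.sum_of_inputs = 0.0
        self.count_of_inputs = 0
        self.avg_of_inputs: Optional[float] = None  # undefined until the first input

    def receive(self, value: float) -> float:
        self.sum_of_inputs += value
        self.count_of_inputs += 1
        # No need to revisit the input history: sum and count are enough.
        self.avg_of_inputs = self.sum_of_inputs / self.count_of_inputs
        return self.avg_of_inputs


rating = SimpleAverage()
rating.receive(0.5)
rating.receive(1.0)
print(rating.avg_of_inputs)  # mean of the two inputs
```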
Line 240: | Line 240: | ||
** Reversible Average ** | ** Reversible Average ** | ||
- | A reversible average is a reversible version of simple average-it keeps a reputation statement for each input and optionally uses it to reverse the effects of the input. | + | A reversible average, <html><a href="#Figure_3-7">Figure_3-7</a> </html>, is a reversible version of simple average-it keeps a reputation statement for each input and optionally uses it to reverse the effects of the input. |
- | <html><a name="Figure_3-7"><center></html>// Figure_3-7: A Reversible Average process remembers inputs so they me be undone //<html></center></a></html> | + | <html><a name="Figure_3-7"><center></html>// Figure_3-7: A Reversible Average process remembers inputs so they may be undone. //<html></center></a></html> |
- | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Ch04-ReversibleAverage.png"/></center></html> | + | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_3-7.png"/></center></html> |
If a previous input exists for this context, the reversible average operation reverses it: the previously stored claim value is removed from the AverageOfInputs, the CountOfInputs is decremented, and the source's reputation statement is destroyed. If there is no previous input for this context, compute a simple average (see <html><a href="/doku.php?id=Chapter_3#Chap_3-Simple_Average">Chap_3-Simple_Average</a> </html>) and store the input claim value in a reputation statement by this source for the target with this context. | If a previous input exists for this context, the reversible average operation reverses it: the previously stored claim value is removed from the AverageOfInputs, the CountOfInputs is decremented, and the source's reputation statement is destroyed. If there is no previous input for this context, compute a simple average (see <html><a href="/doku.php?id=Chapter_3#Chap_3-Simple_Average">Chap_3-Simple_Average</a> </html>) and store the input claim value in a reputation statement by this source for the target with this context. | ||
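The reversal logic can be sketched as follows. This is an illustration under stated assumptions: a plain dictionary stands in for the stored reputation statements, a single target and context are implied, and a new input from a source that has already rated is treated as a replacement (reverse first, then apply). The names are hypothetical.

```python
class ReversibleAverage:
    """Average that remembers each source's input so it can be undone."""

    def __init__(self):
        self.sum_of_inputs = 0.0
        self.count_of_inputs = 0
        # Assumption: one stored claim value per source stands in for
        # the reputation statement kept for this target and context.
        self.statements = {}

    def reverse(self, source):
        # Undo the source's previous input, if any, and destroy its statement.
        previous = self.statements.pop(source, None)
        if previous is not None:
            self.sum_of_inputs -= previous
            self.count_of_inputs -= 1

    def rate(self, source, value):
        # Assumption: a repeat rating replaces the earlier one.
        self.reverse(source)
        self.statements[source] = value
        self.sum_of_inputs += value
        self.count_of_inputs += 1
        return self.average

    @property
    def average(self):
        if self.count_of_inputs == 0:
            return 0.0
        return self.sum_of_inputs / self.count_of_inputs
```

The price of reversibility is visible in the `statements` dictionary: storage grows with the number of sources rather than staying constant.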
Line 259: | Line 259: | ||
** Mixer ** | ** Mixer ** | ||
- | A mixer roll-up combines two or more inputs or read values into a single score according to a weighting or mixing formula. It's preferable, but not required, to normalize the input and output values. Mixers perform most of the custom calculations in complex reputation models. | + | A mixer roll-up, <html><a href="#Figure_3-8">Figure_3-8</a> </html>, combines two or more inputs or read values into a single score according to a weighting or mixing formula. It's preferable, but not required, to normalize the input and output values. Mixers perform most of the custom calculations in complex reputation models. |
- | <html><a name="Figure_3-8"><center></html>// Figure_3-8: A Mixer process combines multiple inputs by storing previous inputs //<html></center></a></html> | + | <html><a name="Figure_3-8"><center></html>// Figure_3-8: A Mixer combines multiple inputs together and weights each. //<html></center></a></html> |
- | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Ch04-Mixer.png"/></center></html> | + | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_3-8.png"/></center></html> |
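As one possible mixing formula, here is a sketch of a weighted average over normalized inputs. The function name and the decision to divide by the total weight (so that the output stays in the same 0.0-1.0 range as the inputs) are our assumptions; real mixers are often custom formulas.

```python
def mix(inputs, weights):
    """Weighted average of inputs; weights need not sum to 1."""
    if not inputs or len(inputs) != len(weights):
        raise ValueError("inputs and weights must be non-empty and the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    # Dividing by the total weight keeps normalized inputs normalized.
    return sum(v * w for v, w in zip(inputs, weights)) / total_weight
```

For example, `mix([1.0, 0.0], [3, 1])` weights the first input three times as heavily as the second.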
<html><a name='Chap_3-Simple_Ratio'></a></html> | <html><a name='Chap_3-Simple_Ratio'></a></html> | ||
** Simple Ratio ** | ** Simple Ratio ** | ||
- | A simple ratio roll-up counts the number of inputs (the total), separately counts the number of times the input has a value of exactly 1.0 (for example, hits), and stores the result as a text claim value of "(hits) out of (total)". | + | A simple ratio roll-up, <html><a href="#Figure_3-9">Figure_3-9</a> </html>, counts the number of inputs (the total), separately counts the number of times the input has a value of exactly 1.0 (for example, hits), and stores the result as a text claim with the value of “(hits) out of (total)” . |
- | <html><a name="Figure_3-9"><center></html>// Figure_3-9: A Simple Ratio process keeps running sums and counts //<html></center></a></html> | + | <html><a name="Figure_3-9"><center></html>// Figure_3-9: A Simple Ratio process keeps running sums and counts. //<html></center></a></html> |
- | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Ch04-SimpleRatio.png"/></center></html> | + | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_3-9.png"/></center></html> |
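A simple ratio is small enough to sketch in a few lines. The class name is hypothetical; the counting rule (a hit is an input of exactly 1.0) and the text claim format follow the description above.

```python
class SimpleRatio:
    """Counts hits (inputs of exactly 1.0) against the total number of inputs."""

    def __init__(self):
        self.hits = 0
        self.total = 0

    def add(self, value):
        self.total += 1
        if value == 1.0:
            self.hits += 1
        return self.claim

    @property
    def claim(self):
        # The result is stored as a text claim, not a number.
        return f"{self.hits} out of {self.total}"
```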
<html><a name='Chap_3-Reversible_Ratio'></a></html> | <html><a name='Chap_3-Reversible_Ratio'></a></html> | ||
** Reversible Ratio ** | ** Reversible Ratio ** | ||
- | If the source already has a stored input value for a target, a reversible ratio roll-up reverses the effect of the previous hit. Otherwise, this roll-up counts the total number of inputs (the total) and separately counts the number of times the input has a value of exactly 1.0 (hits). It stores the result as a text claim value of "(hits) out of (total)" and also stores the source's input value as a reputation statement for possible reversal and retrieval. | + | If the source already has a stored input value for a target, a reversible ratio roll-up, <html><a href="#Figure_3-10">Figure_3-10</a> </html>, reverses the effect of the previous hit. Otherwise, this roll-up counts the total number of inputs (the total) and separately counts the number of times the input has a value of exactly 1.0 (hits). It stores the result as a text claim value of “(hits) out of (total)” and also stores the source's input value as a reputation statement for possible reversal and retrieval. |
- | <html><a name="Figure_3-10"><center></html>// Figure_3-10: A Reversible Ratio process remembers inputs so they me be undone //<html></center></a></html> | + | <html><a name="Figure_3-10"><center></html>// Figure_3-10: A Reversible Ratio process remembers inputs so they may be undone. //<html></center></a></html> |
- | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Ch04-ReversibleRatio.png"/></center></html> | + | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_3-10.png"/></center></html> |
== Transformers: Data Normalization == | == Transformers: Data Normalization == | ||
- | Data transformation is essential in complex reputation systems, in which information enters a model in many different forms. For example, consider an IP address reputation model that accepts this-email-is-spam votes from users along side incoming traffic rates to the mail server as well as a historical karma score for the user submitting the vote, each value must be transformed into a common numerical range before being combined. It may be useful to represent the result in a discrete Spammer/DoNotKnowIfSpammer/NotSpammer category. In this example, transformation processes do both the normalization and de-normalization. | + | Data transformation is essential in complex reputation systems, in which information enters a model in many different forms. For example, consider an IP address reputation model for a mail system: perhaps it accepts this-email-is-spam votes from users; alongside incoming traffic rates to the mail server; as well as a historical karma score for the user submitting the vote. Each of these values must be transformed into a common numerical range before being combined. |
- | <html><a name="Figure_3-11"><center></html>// Figure_3-11: Transformers normalized/denormalize data and are not usually independent processes //<html></center></a></html> | + | Furthermore, it may be useful to represent the result in a discrete ''Spammer/DoNotKnowIfSpammer/NotSpammer'' category. In this example, transformation processes, <html><a href="#Figure_3-11">Figure_3-11</a> </html>, do both the normalization and de-normalization. |
- | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Ch04-Normalization.png"/></center></html> | + | |
+ | <html><a name="Figure_3-11"><center></html>// Figure_3-11: Transformers normalize and de-normalize data. They are not usually independent processes. //<html></center></a></html> | ||
+ | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_3-11.png"/></center></html> | ||
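To make the mail example concrete, here is a sketch of both directions of transformation. The function names, the linear mapping, and the category thresholds (0.33 and 0.66) are assumptions chosen for illustration; a real model would tune them.

```python
def normalize(value, low, high):
    """Linearly map value from [low, high] onto 0.0-1.0, clamping out-of-range input."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return min(max((value - low) / (high - low), 0.0), 1.0)


def to_spam_category(score, not_spam_below=0.33, spam_above=0.66):
    """De-normalize a 0.0-1.0 score into a discrete category (hypothetical thresholds)."""
    if score > spam_above:
        return "Spammer"
    if score < not_spam_below:
        return "NotSpammer"
    return "DoNotKnowIfSpammer"
```

A traffic rate of 50 messages per minute on an assumed scale of 0 to 200 normalizes to 0.25, which can then be mixed with vote and karma scores that share the same range.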
Line 320: | Line 322: | ||
** Simple Evaluator ** | ** Simple Evaluator ** | ||
- | A simple evaluator process provides the basic "If . . . then . . ." statement of reputation models. Usually comparing two inputs and sends a message on to another process(es). Remember that the inputs may arrive asynchronously and separately, so the evaluator may need to have its own state. | + | A simple evaluator process provides the basic “If … then …” statement of reputation models. It usually compares two inputs and sends a message on to one or more other processes. Remember that the inputs may arrive asynchronously and separately, so the evaluator may need to keep its own state. | ||
Line 330: | Line 332: | ||
** Message Splitter ** | ** Message Splitter ** | ||
- | A message splitter replicates a message and forwards it to more than one model event process. This operation starts multiple simultaneous execution paths for one reputation model, depending on the specific characteristics of the reputation sandbox implementation. See <html><a href="/doku.php?id=Appendix_A">Appendix_A</a> </html>. | + | A message splitter, <html><a href="#Figure_3-12">Figure_3-12</a> </html>, replicates a message and forwards it to more than one model event process. This operation starts multiple simultaneous execution paths for one reputation model, depending on the specific characteristics of the reputation framework implementation. See <html><a href="/doku.php?id=Appendix_A">Appendix_A</a> </html>. |
- | <html><a name="Figure_3-12"><center></html>// Figure_3-12: Message Splitters are represented by forked lines //<html></center></a></html> | + | <html><a name="Figure_3-12"><center></html>// Figure_3-12: A message coming from a process may split and feed into two or more downstream processes. //<html></center></a></html> |
- | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Ch04-Splitter.png"/></center></html> | + | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_3-12.png"/></center></html> |
** Conjoint Message Delivery ** | ** Conjoint Message Delivery ** | ||
- | Conjoint message delivery describes the pattern of messages from multiple different input sources being delivered to one process which treats them as if they all have the exact same meaning. For example, in a very large-scale system, multiple servers may send reputation input messages to a shared reputation system environment reporting on user actions: it doesn't matter which server sent the message; the reputation model treats them all the same way. This is drawn as two message lines joining into one input on the left side of the process box. | + | Conjoint message delivery, <html><a href="#Figure_3-13">Figure_3-13</a> </html>, describes the pattern of messages from multiple different input sources being delivered to one process which treats them as if they all have the exact same meaning. For example, in a very large-scale system, multiple servers may send reputation input messages to a shared reputation system environment reporting on user actions: it doesn't matter which server sent the message; the reputation model treats them all the same way. This is drawn as two message lines joining into one input on the left side of the process box. |
- | <html><a name="Figure_3-13"><center></html>// Figure_3-13: Conjoined message paths are represented by merging lines //<html></center></a></html> | + | <html><a name="Figure_3-13"><center></html>// Figure_3-13: Conjoint message paths are represented by merging lines. These two different kinds of inputs will be evaluated in exactly the same way. //<html></center></a></html> |
- | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Ch04-Conjoint.png"/></center></html> | + | <html><center><img width="80%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_3-13.png"/></center></html> |
Line 360: | Line 362: | ||
** Periodic Inputs ** | ** Periodic Inputs ** | ||
- | Sometimes reputation models are activated on the basis of an input that's not reputation based, such as a timer that will perform an external data transform. At present, this grammar provides no explicit mechanism for reputation models to spontaneously wake up and begin executing, and this has an effect on mechanisms such as those detailed in <html><a href="/doku.php?id=Chapter_3#Chap_3-Decay">Chap_3-Decay</a> </html>. So far, in the authors' experience, spontaneous reputation model activation is not necessary and keeping this constraint out has simplified high-performance implementations. Though there is no particular universal requirement for this limitation. | + | Sometimes reputation models are activated on the basis of an input that's not reputation based, such as a timer that will perform an external data transform. At present, this grammar provides no explicit mechanism for reputation models to spontaneously wake up and begin executing, and this has an effect on mechanisms such as those detailed in <html><a href="/doku.php?id=Chapter_3#Chap_3-Decay">Chap_3-Decay</a> </html>. So far, in the authors' experience, spontaneous reputation model activation has not been necessary, and leaving it out has simplified high-performance implementations. However, nothing universally requires this limitation. | ||
Line 369: | Line 371: | ||
** Return Values ** | ** Return Values ** | ||
- | Simple reputation environments, in which all the model is implemented as serially executed in-line with the actual input actions, are usually implemented using on request-reply semantics: The reputation model runs for exactly one input at a time and runs until it terminates by returning a copy of the roll-up value that it calculated. Large-scale, asynchronous reputation sandboxes, such as that described in <html><a href="/doku.php?id=Appendix_A">Appendix_A</a> </html>, don't return results in this way. Instead, they terminate silently and sometimes send signals (see below). | + | Simple reputation environments, in which the whole model is executed serially, in-line with the actual input actions, are usually implemented using request-reply semantics: the reputation model runs for exactly one input at a time and runs until it terminates by returning a copy of the roll-up value that it calculated. Large-scale, asynchronous reputation frameworks, such as that described in <html><a href="/doku.php?id=Appendix_A">Appendix_A</a> </html>, don't return results in this way. Instead, they terminate silently and sometimes send signals (see below). | ||
<html><a name='Chap_3-Signals'></a></html> | <html><a name='Chap_3-Signals'></a></html> | ||
- | ** Signals: Breaking Out of the Reputation Sandbox ** | + | ** Signals: Breaking Out of the Reputation Framework ** |
- | Sometimes a reputation model needs to notify the application environment that something significant has happened and special handling is required. To accomplish this, the process sends a //signal//: a message that breaks out of the reputation sandbox. The mechanism of signaling is specific to each sandbox implementation, but in our diagramming grammar, signaling is always represented by an arrow leaving the box. | + | Sometimes a reputation model needs to notify the application environment that something significant has happened and special handling is required. To accomplish this, the process sends a //signal//: a message that breaks out of the reputation framework. The mechanism of signaling is specific to each framework implementation, but in our diagramming grammar, signaling is always represented by an arrow leaving the box. |
Line 383: | Line 385: | ||
<html><a name='Chap_3-Craftsman_Tips'></a></html> | <html><a name='Chap_3-Craftsman_Tips'></a></html> | ||
==== Practitioner's Tips: Reputation Is Tricky ==== | ==== Practitioner's Tips: Reputation Is Tricky ==== | ||
- | When you begin designing a reputation model and system using our graphical grammar, it may be tempting to take elements of the grammar and just plug them together in the simplest possible combinations to create an Amazon-like rating and review system, or a Digg-like voting model, or even a points-based karma incentive model as on StackOverflow. In practice-"in the wild," where people with myriad personal incentives interact with them both as sources of reputation and as consumers-the implementation of reputation systems is fraught with peril. In this section, we describe several pitfalls to avoid in designing reputation models. | + | When you begin designing a reputation model and system using our graphical grammar, it may be tempting to take elements of the grammar and just plug them together in the simplest possible combinations to create an Amazon-like rating and review system, or a Digg-like voting model, or even a points-based karma incentive model as on StackOverflow. In practice-“in the wild,” where people with myriad personal incentives interact with them both as sources of reputation and as consumers-the implementation of reputation systems is fraught with peril. In this section, we describe several pitfalls to avoid in designing reputation models. |
<html><a name='Chap_3-Power_and_Costs_of_Normalization'></a></html> | <html><a name='Chap_3-Power_and_Costs_of_Normalization'></a></html> | ||
Line 394: | Line 396: | ||
* **Normalized Values Are Portable (Messages and Data Sharing)** | * **Normalized Values Are Portable (Messages and Data Sharing)** | ||
- | * Probably the most compelling reason to normalize the claim values in your reputation statements and messages is that normalized data is portable across various display contexts (see <html><a href="/doku.php?id=Chapter_7">Chapter_7</a> </html>) and can reuse any of the roll-up process code in your reputation sandbox that accepts and outputs normalized values. Other applications will not require special understanding of your claim values to interpret them. | + | * Probably the most compelling reason to normalize the claim values in your reputation statements and messages is that normalized data is portable across various display contexts (see <html><a href="/doku.php?id=Chapter_7">Chapter_7</a> </html>) and can reuse any of the roll-up process code in your reputation framework that accepts and outputs normalized values. Other applications will not require special understanding of your claim values to interpret them. |
* **Normalized Values Are Easy to Transform (Denormalize)** | * **Normalized Values Are Easy to Transform (Denormalize)** | ||
- | * The most common representation of the average of scalar inputs is a percentage - and this denormalization is accomplished by multiplying the normalized it by 100. Any normalized score may be scalar value by using a table or, if the conversion is linear, by performing a simple multiplication. For example, converting to a 5-star rating system could be as simple as multiplying by 20. Normalization also allows input of just one type, such as thumbs-up (1) or thumbs-down (0), but the normalized aggregate result may end up being represented as a percentage (0%-100%) or turned into a 3-point scale of thumbs-up (0.66-1.0), thumbs-down (0.0-0.33), or thumb-to-side (0.33-0.66). Using a normalized score allows this conversion to take place at display time without committing the converted value to the database. Also, the exact same values can be denormalized by different applications with completely different needs. | + | * The most common representation of the average of scalar inputs is a percentage - and this denormalization is accomplished trivially by multiplying the normalized value by 100. Any normalized score may be transformed into a scalar value by using a table or, if the conversion is linear, by performing a simple multiplication. For example, converting to a 5-star rating system could be as simple as multiplying rating by 0.20 to get the normalized score. To get the stars back, just multiply by 5.0. |
+ | Normalization also allows the values of any claim type, such as thumbs-up (1.0)/thumbs-down (0.0), to be denormalized as a different claim type, such as a percentage (0%-100%), or turned into a 3-point scale of thumbs-up (0.66-1.0), thumbs-down (0.0-0.33), or thumb-to-side (0.33-0.66). Using a normalized score allows this conversion to take place at display time without committing the converted value to the database. Also, the exact same values can be denormalized by different applications with completely different needs. | ||
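These conversions are simple enough to sketch directly. The function names are our own, and the handling of the boundary values 0.33 and 0.66 (which the ranges above leave ambiguous) is an assumption.

```python
def to_percent(score):
    """Denormalize a 0.0-1.0 score to a whole-number percentage."""
    return round(score * 100)


def to_stars(score, max_stars=5):
    """Denormalize a 0.0-1.0 score to a star value by simple multiplication."""
    return score * max_stars


def from_stars(stars, max_stars=5):
    """Normalize a star rating; for 5 stars this is multiplying by 0.20."""
    return stars / max_stars


def to_thumb(score):
    """Denormalize to a 3-point scale (boundary handling is an assumption)."""
    if score >= 0.66:
        return "thumbs-up"
    if score <= 0.33:
        return "thumbs-down"
    return "thumb-to-side"
```

All four functions read the same stored value, so a display layer can call whichever one suits its context without writing anything back to the database.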
As with all things, the power of normalization comes with some costs. | As with all things, the power of normalization comes with some costs. | ||
Line 404: | Line 407: | ||
* Using different normalized numbers in large reputation systems can cause unexpected biases when the original claim types were scalar values with slightly different ranges. Averaging normalized maximum 4-star ratings (25% each) with maximum 5-star ratings (20% each) leads to rounding errors that cause the scores to clump up if the average is denormalized back to 5 stars. See the example table <html><a href="#Table_4-1">Table_4-1</a> </html>. | * Using different normalized numbers in large reputation systems can cause unexpected biases when the original claim types were scalar values with slightly different ranges. Averaging normalized maximum 4-star ratings (25% each) with maximum 5-star ratings (20% each) leads to rounding errors that cause the scores to clump up if the average is denormalized back to 5 stars. See the example table <html><a href="#Table_4-1">Table_4-1</a> </html>. | ||
<html><a name="Table_4-1"><center></html>// Table_4-1: An example of ugly side-effects when normalizing/denormalizing across different scales //<html></center></a><table align="center" border="1" class="inline"><thead><tr><td align="center">Scale</td><td align="center">1 Stars Normalized</td><td align="center">2 Stars Normalized</td><td align="center">3 Stars Normalized</td><td align="center">4 Stars Normalized</td><td align="center">5 Stars Normalized</td></tr></thead><tbody><tr><td align="center">4 stars</td><td align="center">25</td><td align="center">50</td><td align="center">51-75</td><td align="center">76-100</td><td align="center"> | <html><a name="Table_4-1"><center></html>// Table_4-1: An example of ugly side-effects when normalizing/denormalizing across different scales //<html></center></a><table align="center" border="1" class="inline"><thead><tr><td align="center">Scale</td><td align="center">1 Stars Normalized</td><td align="center">2 Stars Normalized</td><td align="center">3 Stars Normalized</td><td align="center">4 Stars Normalized</td><td align="center">5 Stars Normalized</td></tr></thead><tbody><tr><td align="center">4 stars</td><td align="center">25</td><td align="center">50</td><td align="center">51-75</td><td align="center">76-100</td><td align="center"> | ||
- | //N/A// | + | N/A |
- | </td></tr><tr><td align="center">5 stars</td><td align="center">20</td><td align="center">40</td><td align="center">41-60</td><td align="center">61-80</td><td align="center">81-100</td></tr><tr><td align="center">Averaged range / Denormalized</td><td align="center">0-22 / {{:full_star.gif}}{{:empty_star.gif}}{{:empty_star.gif}}{{:empty_star.gif}}{{:empty_star.gif}}</td><td align="center">23-45 / {{:full_star.gif}}{{:full_star.gif}}{{:empty_star.gif}}{{:empty_star.gif}}{{:empty_star.gif}}</td><td align="center">46-67 / {{:full_star.gif}}{{:full_star.gif}}{{:full_star.gif}}{{:empty_star.gif}}{{:empty_star.gif}}</td><td align="center">68-90 / {{:full_star.gif}}{{:full_star.gif}}{{:full_star.gif}}{{:full_star.gif}}{{:empty_star.gif}}</td><td align="center">78-100 / {{:full_star.gif}}{{:full_star.gif}}{{:full_star.gif}}{{:full_star.gif}}{{:full_star.gif}}</td></tr></tbody></table></html> | + | </td></tr><tr><td align="center">5 stars</td><td align="center">20</td><td align="center">40</td><td align="center">41-60</td><td align="center">61-80</td><td align="center">81-100</td></tr><tr><td align="center">Averaged range / Denormalized</td><td align="center">0-22 /<br><img src="/figs/incoming/full_star.gif" alt=""><img src="/figs/incoming/empty_star.gif" alt=""><img src="/figs/incoming/empty_star.gif" alt=""><img src="/figs/incoming/empty_star.gif" alt=""><img src="/figs/incoming/empty_star.gif" alt=""></td><td align="center">23-45 /<br><img src="/figs/incoming/full_star.gif" alt=""><img src="/figs/incoming/full_star.gif" alt=""><img src="/figs/incoming/empty_star.gif" alt=""><img src="/figs/incoming/empty_star.gif" alt=""><img src="/figs/incoming/empty_star.gif" alt=""></td><td align="center">46-67 /<br><img src="/figs/incoming/full_star.gif" alt=""><img src="/figs/incoming/full_star.gif" alt=""><img src="/figs/incoming/full_star.gif" alt=""><img src="/figs/incoming/empty_star.gif" alt=""><img src="/figs/incoming/empty_star.gif" alt=""></td><td align="center">68-90 
/<br><img src="/figs/incoming/full_star.gif" alt=""><img src="/figs/incoming/full_star.gif" alt=""><img src="/figs/incoming/full_star.gif" alt=""><img src="/figs/incoming/full_star.gif" alt=""><img src="/figs/incoming/empty_star.gif" alt=""></td><td align="center">78-100 /<br><img src="/figs/incoming/full_star.gif" alt=""><img src="/figs/incoming/full_star.gif" alt=""><img src="/figs/incoming/full_star.gif" alt=""><img src="/figs/incoming/full_star.gif" alt=""><img src="/figs/incoming/full_star.gif" alt=""></td></tr></tbody></table></html> |
Line 430: | Line 433: | ||
''RankMean'' | ''RankMean'' | ||
** | ** | ||
- | * ''r = SimpleMean m - AdjustmentFactor a + LiquidityWeight l * Adjustment Factor a'' | + | * |
+ | ''r = SimpleMean m - AdjustmentFactor a + LiquidityWeight l * Adjustment Factor a'' | ||
* ** | * ** | ||
''LiquidityWeight'' | ''LiquidityWeight'' | ||
** | ** | ||
- | * ''l = min(max((NumRatings n - LiquidityFloor f) / LiquidityCeiling c, 0), 1) * 2'' | + | * |
+ | ''l = min(max((NumRatings n - LiquidityFloor f) / LiquidityCeiling c, 0), 1) * 2'' | ||
* **Or** | * **Or** | ||
- | * ''r = m - a + min(max((n - f) / c, 0.00), 1.00) * 2.00 * a'' | + | * |
- | This formula produces a curve seen in <html><a href="#Figure_3-1">Figure_3-1</a> </html>. Though a more mathematically continuous curve might seem appropriate, this linear approximation can be done with simple nonrecursive calculations and requires no knowledge of previous individual inputs. | + | ''r = m - a + min(max((n - f) / c, 0.00), 1.00) * 2.00 * a'' |
+ | This formula produces a curve like that in <html><a href="#Figure_3-14">Figure_3-14</a> </html>. Though a more mathematically continuous curve might seem appropriate, this linear approximation can be done with simple non-recursive calculations and requires no knowledge of previous individual inputs. | ||
- | <html><a name="Figure_3-14"><center></html>// Figure_3-14: The effects of the liquidity compensation algorithm //<html></center></a></html> | + | <html><a name="Figure_3-14"><center></html>// Figure_3-14: The effects of the liquidity compensation algorithm. //<html></center></a></html> |
- | <html><center><img width="50%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Ch04-ReputationCurve.png"/></center></html> | + | <html><center><img width="50%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_3-14.png"/></center></html> |
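The liquidity compensation formula translates directly into code. The function names are ours, the single-letter parameters mirror the formula, and the values used in the usage note are purely illustrative assumptions, not recommended settings.

```python
def liquidity_weight(n, f, c):
    """l = min(max((n - f) / c, 0), 1) * 2, where n is the number of ratings."""
    return min(max((n - f) / c, 0.0), 1.0) * 2.0


def rank_mean(m, n, a, f, c):
    """r = m - a + l * a: shift the simple mean m by up to +/- a based on liquidity."""
    return m - a + liquidity_weight(n, f, c) * a
```

With the illustrative settings `a=0.1`, `f=10`, `c=60`, an item whose simple mean is 0.8 ranks at about 0.7 with 10 ratings, about 0.8 with 40 ratings, and about 0.9 with 100 or more. No individual input is consulted; only the mean and the count are needed.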
Suggested initial values for ''a'' , ''c'' , and ''f'' (assuming normalized inputs): | Suggested initial values for ''a'' , ''c'' , and ''f'' (assuming normalized inputs): | ||
Line 472: | Line 478: | ||
<html><a name='Chap_3-Ratings_Bias_Effects'></a></html> | <html><a name='Chap_3-Ratings_Bias_Effects'></a></html> | ||
== Ratings Bias Effects == | == Ratings Bias Effects == | ||
- | <html><a name="Figure_3-15"><center></html>// Figure_3-15: Some ratings distributions on Yahoo! sites: "One of these things is not like the other. One of these things just doesn't belong." //<html></center></a></html> | + | <html><a name="Figure_3-15"><center></html>// Figure_3-15: Some real ratings distributions on Yahoo! sites. Only one of these distributions suggests a healthy, useful spread of ratings within a community. Can you spot it? //<html></center></a></html> |
- | <html><center><img width="70%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Ch04-RatingsBiasCurves.png"/></center></html> | + | <html><center><img width="70%" src="http://buildingreputation.com/lib/exe/fetch.php?media=Figure_3-15.png"/></center></html> |
<html><a href="#Figure_3-15">Figure_3-15</a> </html>shows the graphs of 5-star ratings from nine different Yahoo! sites with all the volume numbers redacted. We don't need them, since we only want to talk about the shapes of the curves. | <html><a href="#Figure_3-15">Figure_3-15</a> </html>shows the graphs of 5-star ratings from nine different Yahoo! sites with all the volume numbers redacted. We don't need them, since we only want to talk about the shapes of the curves. | ||
Line 485: | Line 491: | ||
Most likely, the biggest difference was that Autos Custom users were rating //one another's// content. The other sites had users evaluating static, unchanging, or feed-based content in which they didn't have a vested interest. | Most likely, the biggest difference was that Autos Custom users were rating //one another's// content. The other sites had users evaluating static, unchanging, or feed-based content in which they didn't have a vested interest. | ||
- | In fact, if you look at the curves for Shopping and Local, they are practically identical, and have the flattest J-hook, giving the lowest share of 1-star ratings. This similarity was a direct result of the overwhelming use pattern for those sites. Users come to find a great place to eat or the best vacuum to buy. When they search, the results with the //highest ratings appear first//. If a user has experienced that place or thing, he may well also rate it-if it's easy to do so-and most likely will give it 5 stars (see <html><a href="/doku.php?id=Chapter_3#Chap_3-First_Mover_Effects">Chap_3-First_Mover_Effects</a> </html>). If the user sees an object that isn't rated but that he likes, he may also rate and/or review it, usually giving it 5 stars so that others can share his discovery-otherwise, why bother? People don't think that it's worth the bother to seek out and create Internet ratings for mediocre places or things. The curves, then, are the direct result of a product design intersecting with users' goals. This pattern-"I'm looking for good things, so I'll help others find good things" -- is a prevalent form of ratings bias. An even stronger example happens when users are asked to rate episodes of TV shows. They rate every episode 4.5 stars plus or minus .5 star because //only the fans bother to rate the episodes//, and no fan is ever going to rate an episode below a 3. Look at any popular current TV show on Yahoo! TV or [another site]. | + | In fact, if you look at the curves for Shopping and Local, they are practically identical, and have the flattest J-hook, giving the lowest share of 1-star ratings. This similarity was a direct result of the overwhelming use pattern for those sites. Users come to find a great place to eat or the best vacuum to buy. When they search, the results with the //highest ratings appear first//. 
If a user has experienced that place or thing, he may well also rate it-if it's easy to do so-and most likely will give it 5 stars (see <html><a href="/doku.php?id=Chapter_3#Chap_3-First_Mover_Effects">Chap_3-First_Mover_Effects</a> </html>). If the user sees an object that isn't rated but that he likes, he may also rate and/or review it, usually giving it 5 stars so that others can share his discovery-otherwise, why bother? People don't think that it's worth the bother to seek out and create Internet ratings for mediocre places or things. The curves, then, are the direct result of a product design intersecting with users' goals. This pattern-“I'm looking for good things, so I'll help others find good things” -is a prevalent form of ratings bias. An even stronger example happens when users are asked to rate episodes of TV shows. They rate every episode 4.5 stars plus or minus .5 star because //only the fans bother to rate the episodes//, and no fan is ever going to rate an episode below a 3. Look at any popular current TV show on Yahoo! TV or [another site]. |
Our closer look at how Yahoo! Autos Custom ratings worked and how users were evaluating the content showed why 1-star ratings were given out so often: users gave feedback to other users to get them to change their behavior. Specifically, you would get one star if you (1) didn't upload a picture of your ride, or (2) uploaded a dealer stock photo of your ride. The site is Autos Custom, after all! Users reserved 5-star ratings for the best of the best. Ratings of 2 through 4 stars were actually used to evaluate the quality and completeness of the car's profile. Unlike all the sites graphed here, the 5-star scale truly represented a broad sentiment, and people worked to improve their scores. | Our closer look at how Yahoo! Autos Custom ratings worked and how users were evaluating the content showed why 1-star ratings were given out so often: users gave feedback to other users to get them to change their behavior. Specifically, you would get one star if you (1) didn't upload a picture of your ride, or (2) uploaded a dealer stock photo of your ride. The site is Autos Custom, after all! Users reserved 5-star ratings for the best of the best. Ratings of 2 through 4 stars were actually used to evaluate the quality and completeness of the car's profile. Unlike all the sites graphed here, the 5-star scale truly represented a broad sentiment, and people worked to improve their scores. | ||
- | One ratings curve isn't shown here: the U-curve, in which 1 star and 5 stars are disproportionately selected. Some highly controversial objects on Amazon are targets of this rating curve. Yahoo's now-defunct personal music service also saw this kind of curve when new music was introduced to established users: 1 star came to mean "Never play this song again" and 5 meant "More like this one, please." If you're seeing U-curves, consider that (1) users are telling you that something other than what you wanted to measure is important, and/or (2) you might need a different rating scale. | + | One ratings curve isn't shown here: the U-curve, in which 1 star and 5 stars are disproportionately selected. Some highly controversial objects on Amazon are targets of this rating curve. Yahoo's now-defunct personal music service also saw this kind of curve when new music was introduced to established users: 1 star came to mean “Never play this song again” and 5 meant “More like this one, please.” If you're seeing U-curves, consider that users may be telling you that something other than what you wanted to measure is important, or that you might need a different rating scale. |
<html><a name='Chap_3-First_Mover_Effects'></a></html> | <html><a name='Chap_3-First_Mover_Effects'></a></html> | ||
Line 501: | Line 507: | ||
* **Discouraging New Contributors** | * **Discouraging New Contributors** | ||
- | * Take special care with systems that contain leaderboards (see <html><a href="/doku.php?id=Chapter_7#Chap_7-Leaderboard">Chap_7-Leaderboard</a> </html>) when they're used either for content or for users. Items displayed on leaderboards //tend to stay// on the leaderboards, because the more people who see those items and click, rate, and comment on them, the more who will follow suit, creating a self-sustaining feedback loop. This loop not only keeps newer items and users from breaking into the leaderboards, it discourages new users from even making the effort to participate by giving the impression that they are too late to influence the result in any significant way. Though this phenomenon applies to all reputation scores, even for digital cameras, it's particularly acute in the case of simple point-based karma systems, which give active users ever more points for activity so that leaders, over years of feverish activity, amass millions of points, making it mathematically impossible for new users to ever catch up. | + | * Take special care with systems that contain leaderboards (see <html><a href="/doku.php?id=Chapter_7#Chap_7-Leaderboard">Chap_7-Leaderboard</a> </html>) when they're used either for content or for users. Items displayed on leaderboards //tend to stay// on the leaderboards, because the more people who see those items and click, rate, and comment on them, the more who will follow suit, creating a self-sustaining feedback loop. |
+ | This loop not only keeps newer items and users from breaking into the leaderboards, it discourages new users from even making the effort to participate by giving the impression that they are too late to influence the result in any significant way. | ||
+ | Though this phenomenon applies to all reputation scores, even for digital cameras, it's particularly acute in the case of simple point-based karma systems, which give active users ever more points for activity so that leaders, over years of feverish activity, amass millions of points, making it mathematically impossible for new users to ever catch up. | ||
Line 509: | Line 517: | ||
As the <html><a href="/doku.php?id=Chapter_3#Chap_3-First_Mover_Effects">Chap_3-First_Mover_Effects</a> </html>shows, time leaches value from reputation, but there's also the simple problem of ratings becoming stale over time as their target reputable entities change or become unfashionable. Businesses change ownership, technology becomes obsolete, cultural mores shift. | As the <html><a href="/doku.php?id=Chapter_3#Chap_3-First_Mover_Effects">Chap_3-First_Mover_Effects</a> </html>shows, time leaches value from reputation, but there's also the simple problem of ratings becoming stale over time as their target reputable entities change or become unfashionable. Businesses change ownership, technology becomes obsolete, cultural mores shift. | ||
- | The key insight to dealing with this problem is to remember the expression "What did you do for me //this// week?" When you're considering how your reputation system will display reputation and use it indirectly to modify the experience of users, remember to account for time value. A common method for compensating for time in reputation values is to apply a decay function: subtract value from the older reputations as time goes on, at a rate that is appropriate to the context. For example, digital camera ratings for resolution should probably lose half their weight every year, whereas restaurant reviews should only lose 10% of their value in the same interval. | + | The key insight to dealing with this problem is to remember the expression “What did you do for me //this// week?” When you're considering how your reputation system will display reputation and use it indirectly to modify the experience of users, remember to account for time value. A common method for compensating for time in reputation values is to apply a decay function: subtract value from the older reputations as time goes on, at a rate that is appropriate to the context. For example, digital camera ratings for resolution should probably lose half their weight every year, whereas restaurant reviews should only lose 10% of their value in the same interval. |
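The two decay rates mentioned above can be illustrated with a small sketch. It assumes a multiplicative (exponential) decay, in which a rating keeps a fixed fraction of its weight each year; the function names ''decay_weight'' and ''decayed_average'' and the ''(value, age_in_years)'' tuple layout are hypothetical choices made for this example, not part of any system described in the text.

```python
def decay_weight(age_years: float, retained_per_year: float) -> float:
    """Weight of a rating that is age_years old.

    retained_per_year is the fraction of weight kept after one year
    (assumed values: 0.5 for camera resolution ratings, 0.9 for restaurants).
    """
    return retained_per_year ** age_years


def decayed_average(ratings, retained_per_year: float):
    """Weighted mean of (value, age_years) pairs; newer ratings count more.

    Returns None when there are no ratings to average.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for value, age_years in ratings:
        weight = decay_weight(age_years, retained_per_year)
        weighted_sum += weight * value
        total_weight += weight
    if total_weight == 0:
        return None
    return weighted_sum / total_weight


# A two-year-old camera rating keeps a quarter of its weight,
# while a restaurant review of the same age keeps 81%.
camera_weight = decay_weight(2, 0.5)
restaurant_weight = decay_weight(2, 0.9)
```

With these assumptions, the same pair of ratings (a fresh 5 and a year-old 1) averages noticeably higher for cameras than for restaurants, because the older low rating loses weight faster. The right rate is a judgment call for each context.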
Here are some specific algorithms for decaying a reputation score over time: | Here are some specific algorithms for decaying a reputation score over time: | ||
Line 524: | Line 532: | ||
* **Time-limited Recalculation** | * **Time-limited Recalculation** | ||
- | * This is the de facto method that most engineers use to present any information in an application: use all of the ratings in a time range from the database and compute the score just in time. This is the most costly method, because it involves always hitting the database to consider an aggregate reputation (say, for a ranked list of hotels), when 99% of the time the value is exactly the same as it was the last time it was calculated. This method also may throw away still contextually valid reputation. We recommend trying some of the higher-performance suggestions above. | + | * This is the de facto method that most engineers use to present any information in an application: use all of the ratings in a time range from the database and compute the score just in time. This is the most costly method, because it involves always hitting the database to recalculate an aggregate reputation (say, for a ranked list of hotels), when 99% of the time the resulting value is exactly the same as it was in the previous calculation. This method may also discard reputation that is still contextually valid. Performance and reliability are usually best served by the alternative approaches described above. |
=== Implementer's Notes === | === Implementer's Notes === | ||
- | The massive-scale Yahoo! Reputation Platform, detailed in <html><a href="/doku.php?id=Appendix_A">Appendix_A</a> </html>implemented the reputation building blocks, such as the accumulator, sum, and even rolling average, in both the reputation model execution engine and in the database layer. This division of labor provided important performance improvements because the read-modify-write lodging for stored reputation values are kept as close to the data store as possible. For small systems, it may be reasonable to keep the entire reputation system in memory at once, thus avoiding this complication. But be careful. If your site is as successful as you hope it might someday be, making an all-memory-based design may well come back to bite you, hard. | + | The massive-scale Yahoo! Reputation Platform, detailed in <html><a href="/doku.php?id=Appendix_A">Appendix_A</a> </html>, implemented the reputation building blocks, such as the accumulator, sum, and even rolling average, in both the reputation model execution engine and the database layer. This division of labor provided important performance improvements because the read-modify-write logic for stored reputation values is kept as close to the data store as possible. For small systems, it may be reasonable to keep the entire reputation system in memory at once, thus avoiding this complication. But be careful. If your site is as successful as you hope it might someday be, an all-memory-based design may well come back to bite you, hard. |
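To make the read-modify-write idea concrete, here is a minimal sketch of an incrementally updated average: the stored value is read, adjusted by one new rating, and written back, so no pass over the full rating history is needed. This illustrates the general technique only and is not the Yahoo! platform's implementation; ''StoredAverage'' and ''apply_rating'' are hypothetical names, and a real deployment would perform the update atomically inside or next to the data store.

```python
from dataclasses import dataclass


@dataclass
class StoredAverage:
    """The persisted state for one target: how many ratings, and their mean."""
    count: int = 0
    mean: float = 0.0


def apply_rating(stored: StoredAverage, rating: float) -> StoredAverage:
    """Read the stored state, fold in one rating, return the state to write back."""
    new_count = stored.count + 1
    # Incremental mean: shift the old mean toward the new rating by 1/new_count.
    new_mean = stored.mean + (rating - stored.mean) / new_count
    return StoredAverage(count=new_count, mean=new_mean)


# Example: three ratings arrive one at a time.
state = StoredAverage()
for incoming in (5, 3, 4):
    state = apply_rating(state, incoming)
```

Because each update touches only two stored numbers, the cost per rating stays constant no matter how many ratings a target has accumulated.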
+ | |||
+ | |||
+ | === Making Buildings From Blocks === | ||
+ | In this chapter, we extended the grammar by defining various reputation building blocks out of which hundreds of currently deployed reputation systems are built. We also shared tips about a few surprises we've encountered that emerge when these processes interact with real human beings. | ||
+ | |||
+ | In <html><a href="/doku.php?id=Chapter_4">Chapter_4</a> </html>we'll combine and customize these blocks to describe full-fledged reputation models and systems that are available on the web today. We look at a selection of common patterns, including voting, points, and karma. We also review complex reputations, such as those at eBay and Flickr, in considerable detail. Diagramming these currently operational examples demonstrates the expressiveness of the grammar, and the lessons learned from their challenges provide important experience to draw on when designing new models. | ||