The expression reputation system describes a wide array of practices, technologies, and user interface elements. This chapter provides a comprehensive lexicon of attributes, processes, and presentation that we will use going forward, both to describe current systems and to define new ones.
Much of the terminology surrounding reputation in the current marketplace is inconsistent, confusing, and even contradictory, depending on what site you visit or what expert you read. After evaluating and developing scores of online and offline reputation systems for more than 30 years, the authors have identified many concepts and attributes these systems have in common - enough similarity to venture a common lexicon and graphical grammar and build a shared foundation of understanding. The hope is that this work will raise the bar of quality for all future reputation systems and help prevent foreseeable errors that might sabotage otherwise successful deployments.
There are many terms for the formal tools used to describe and analyze the relations between concepts and attributes in a specific domain of knowledge, including modeling languages, meta-modeling, ontologies, taxonomies, grammars, and patterns. Though we like to call ours a graphical grammar, this excerpt from the Wikipedia entry for meta-modeling describes what we're hoping to accomplish with this section, and indeed this entire book: a consistent and standard framework for describing reputation models and systems.
Meta-modeling
… a formalized specification of domain-specific notations … following a strict rule set.
When describing this reputation systems grammar - the common concepts, attributes, and methods involved - we will borrow metaphors from basic chemistry: atoms [reputation statements] and their constituent particles [sources, claims, targets] are bound with forces [messages and processes] to make up molecules [reputation models], which comprise the core useful substances in the universe. Sometimes these molecules are mixed with others in solutions [reputation systems] to create stronger, lighter or otherwise more useful compounds than would normally occur randomly.
This metaphor will only be used in this chapter as a kind of intellectual scaffolding as we introduce the grammar. We opted to use it because we're not going to start at the top or the bottom of the conceptual space, but somewhere in the middle.
We will start with our atom - the reputation statement - and describe a grammar sufficient for understanding and diagramming common reputation systems deployed today as well as provide you with the tools to design your own.
<note>The reputation systems graphical grammar is a work that is constantly evolving, so be sure to visit this book's companion wiki, http://buildingreputation.com, for up-to-date information and to participate in its evolution.</note>
When attempting to deconstruct the known universe of online reputation systems, we observed that there was a single common element of reputation that was always the result of these calculations: the reputation statement. As we proceed with diagramming, you will notice that these systems compute many different reputation values that turn out to possess this same elemental form. In practice, many of the most prevalent inputs are also in the same format as well. Just as matter is made up of atoms, reputation is made up of reputation statements.
These atoms are always of the same basic shape, but vary in their specific details. Some are about people, some are about products. Some are numeric, some are votes, some are comments. Many are created directly by users, but a surprising share are created by software.
In chemistry, a single atom of matter always has certain particles (electrons, protons, and neutrons), but in different configurations and numbers depending on its elemental formulation. The exact formulation of particles in an element produces specific properties when observed en masse in nature: an element may be stable or volatile, gaseous or solid, radioactive or inert, among many other properties. But every object with mass is made of these atoms.
The reputation statement is like an atom in that it, too, has constituent particles: a source, a claim, and a target. See Figure_2-1 . The exact characteristics (type and value) of each of these particles determine the statement's element type and its utility to your application.
Every reputation statement is made by someone or something; otherwise, it is impossible to evaluate the claim. “Some people say product X is great.” is meaningless, or at least it should be. Who are “some people”? Are they like me? Do they work for the company that makes product X? Without knowing something about who or what made a claim, it has little use.
The claim is the value that was assigned in a statement by the source to the target. Each claim is of a particular score class and has a score value. Figure_2-1 shows a 5-star rating as a score class, and this particular reputation statement has the score value of 3 (stars).
Reputation statements are always focused on some unique identifiable entity: the target of the claim. When queries are made of the reputation database, the target identifier is usually specified when the detail for the target entity in question, say a new eatery, is displayed: “Yahoo! users rated Chipotle Restaurant 4 out of 5 stars for service”. The target is left unspecified (or only partially specified) in database requests based on claims or sources: “What is the best Mexican restaurant near here?” or “What are the ratings that Lara gave for restaurants?”
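To make the source-claim-target structure concrete, here is a minimal sketch in Python. The class and field names, and the sample identifiers, are our own illustration, not part of the grammar itself:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class ReputationStatement:
    """A single 'atom' of reputation: a source makes a claim about a target."""
    source: str       # who or what made the claim, e.g. a user ID or "aggregate"
    claim_type: str   # the score class, e.g. "5-star rating"
    claim_value: Any  # the score value, e.g. 3
    target: str       # the entity the claim is about, e.g. a restaurant ID

# The situation from Figure 2-1: a user gives a restaurant 3 of 5 stars.
stmt = ReputationStatement(
    source="user:Zary22",
    claim_type="5-star rating",
    claim_value=3,
    target="restaurant:DessertHut",
)
```

Every reputation value computed later in this chapter, however it is derived, fits this same shape.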
Just as molecules are made up of many different atoms in various combinations to produce materials with unique and valuable qualities, reputation models draw their power from aggregating reputation statements from many sources, often of differing types. Instead of concerning ourselves with valence and Van der Waals forces, in reputation models we bind the atomic units - the reputation statements - together with messages and processes. Messages are represented by arrows and flow in the direction indicated. The boxes contain descriptions of the processes that interpret an activating message to update a reputation statement and/or send one or more messages on to other processes. As in chemistry, the entire process is simultaneous: messages can arrive at any time and may take different paths through a complex reputation model at the same time.
Yahoo! Local, Travel, Movies, TV, and the like are all examples of the Ratings and Reviews reputation model. eBay's seller feedback model, in which users rate transactions and those ratings are reflected on the seller's profile, is a Karma reputation model. The example in Figure_2-3 is one of the simplest models possible and was inspired by the "digg it" vote-to-promote reputation model (see Chapter_7 ) made popular at Digg.com.
Figure_2-3 is the first of many reputation model diagrams that appear in this book. Each will have an accompanying descriptive section explaining each element in detail.
This model is called the Simple Accumulator: It counts votes for a target object. It could be counting click-throughs, thumbs-ups, or marking an item as a favorite.
Though most models have multiple input messages, this one has only one, in the form of a single reputation statement:
Likewise, though models typically have many processes, this example has only one:

Each time a vote arrives, the CountOfVotes counter is incremented and stored in a reputation statement with the claim type of Counter, set to the value of CountOfVotes, and with the same target as the originating Vote. The source for this statement is said to be aggregate, because it is a roll-up: the product of many inputs from many different sources.

See Chapter_4 for a detailed list of common reputation process patterns, and Chapter_7 and Chapter_8 for the UI effects of various score classes.
Figure_2-4 shows a fuller representation of a Digg.com-like vote-to-promote reputation model. This example determines community interest in an article by tabulating the number of votes it receives as well as its level of activity, measured by counting the number of comments users leave about the article.
The input messages are in the form of two reputation statements:
The reputation processes in this model are:
CountOfVotes: Each time an Endorsement (vote) arrives, this counter is incremented. It stores the new value back into the statement and sends it in a message to the Community Interest Rank process.

ActivityLevel: Each time a Comment arrives, this counter is incremented. It stores the new value back into the statement and sends it in a message to the Community Interest Rank process.

Community Interest Rank: This process applies a Weighting to combine the values of the CountOfVotes and ActivityLevel scores disproportionately to each other; in this example an Endorsement is worth 10x the interest score of a single comment. The resulting Interest score is stored in a typical roll-up reputation statement: aggregate source, numeric score, and the target shared by all of the inputs.
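The weighting step can be sketched as follows. The function and constant names are ours; the 10x weight comes from the example in the text:

```python
ENDORSEMENT_WEIGHT = 10  # one endorsement is worth 10x a comment, per the example

def community_interest_rank(count_of_votes: int, activity_level: int) -> int:
    """Weighting process: combine the two counters disproportionately
    into a single Interest score for the target article."""
    return ENDORSEMENT_WEIGHT * count_of_votes + activity_level

# 5 endorsements and 12 comments yield an Interest score of 62.
print(community_interest_rank(5, 12))
```

Tuning the relative weight is a design decision for your application; the structure of the process stays the same.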
Just as there are interesting-looking molecules in nature, and much as hydrogen bonds are especially strong, there are special types of reputation statements called containers that join multiple closely related statements into one super-statement. A common example appears on most product and service websites with user-written ratings and comments: a number of different star ratings for a restaurant are clumped together with a text comment into an object formally called a review. See Figure_2-5 for a typical example.
Containers are useful devices for ordering reputation statements. While it's technically true that each individual component of the container could be represented and addressed as a statement of its own, doing so would be semantically sloppy and would lead to unnecessary complexity in your model. Containers also map well to real life. We typically don't think of Zary22's series of statements about Dessert Hut as a rapid-fire stream of individual opinions; rather, we consider them related, and influenced by each other. Taken as a whole, they formulate Zary22's review of his experience.
<note>A container is a compound reputation statement with multiple claims, all for the same source and target.</note>
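A container can be sketched as a single statement carrying several named claims. The class, claim names, and sample values below are our own illustration of the pattern:

```python
from dataclasses import dataclass, field

@dataclass
class ReviewContainer:
    """A compound reputation statement: one source, one target,
    and multiple named claims bundled together as a review."""
    source: str
    target: str
    claims: dict = field(default_factory=dict)

review = ReviewContainer(
    source="user:Zary22",
    target="restaurant:DessertHut",
    claims={
        "overall": 4,                            # stars
        "service": 3,                            # stars
        "ambiance": 5,                           # stars
        "comment": "Great pie, slow service.",   # free-text claim
    },
)
```

Because the source and target are shared, the claims can be stored, retrieved, and displayed as one unit.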
Once a reputation statement exists in your system, you might opt to make it a reputable entity itself, as in Figure_2-6 . This should seem perfectly natural to you; in real life, people don't just formulate opinions, right? Quite often, other people form opinions about those opinions! (“Well, Jack hated The Dark Knight, but he and I never see eye-to-eye anyway.”)
Review-based reputation systems often share another feature: a form of built-in user feedback on reviews written by other users. We'll call this the Was This Helpful? pattern; see Figure_2-7 . When a reader clicks the thumbs-up or thumbs-down stating that a review was or wasn't helpful, the target is a review (container) written earlier by a different user.
The input message is in the form of a single reputation statement:
There is only one reputation process in this model:
Each time a vote arrives, the TotalVotes counter is incremented. If the Vote is non-zero, the process also increments HelpfulVotes. Finally, HelpfulScore is set to a text representation of the score suitable for display: “HelpfulVotes out of TotalVotes”. This representation is usually stored in the very same review container that the voter was judging (had targeted) as helpful. This is done to simplify indexing and retrieval, as in “Retrieve a list of the most helpful movie reviews by MuvyLuvr” and “Sort the list of movie reviews of Aliens by helpful score.” Though the original review writer isn't the author of his helpful votes, his review is responsible for, and should contain, them.

You'll see variations on this simple pattern of reputation-statements-as-targets repeated throughout the book. It allows us to build some fairly advanced meta-moderation capabilities into our reputation systems. Now, not only can you ask the community “What's good?”, you can also ask it “… and who do you believe?”
For our last chemistry metaphor, consider that physical materials are rarely made of a single type of molecule. We combine molecules into solutions, compounds, and mixtures to get the exact properties we want. But not all substances mix well - oil and water, for example - and the same is true for combining multiple reputation model contexts into a single complex reputation system. Care must be taken that the reputation contexts of the models are compatible.
Figure_2-8 shows a simple abuse reporting system that integrates two different reputation models: a Weighted Voting model that leverages an external karma system, which provides weights for the IP addresses of the abuse reporters. This example also illustrates an explicit output, common in many implementations. In this case, the output is an event sent to the application environment suggesting that the target comment should be dealt with.
For the Reporter Trustworthiness context, the entire reputation model is opaque to our system because it lives on a foreign service, namely TrustedSource.org by McAfee, Inc. There is only one input, and it is a bit different from those in previous examples:
The model takes TrustedSourceReputation as input using the web service API, here represented as a URL. The result is one of these categories: Trusted, Neutral, Unverified, Suspicious, or Malicious, which is passed to the Normalize IPTrustScore process.

That process transforms the external IP reputation into the reputation system's normalized range: it converts TrustedSourceReputation into WebReputation, with a range from 0.0 (no trust) to 1.0 (maximum trust). The result is stored in a reputation statement with TrustedSource.org as the source, a claim type of Simple Karma with the score WebReputation, and the IP address as the target.

The main context of this reputation system is Content Quality, which is designed to collect flags from users whenever they think that the comment in question violates the site's Terms of Service. When enough users with a high enough reputation for their web provider flag the content, a special event is sent out of the reputation system. This is a Weighted Voting reputation model.
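The normalization step might be sketched like this. Note that TrustedSource supplies only the category names; the numeric values in the table below are our own assumed mapping onto the 0.0 to 1.0 range, not published scores:

```python
# Assumed normalization table: external category -> WebReputation in [0.0, 1.0].
TRUST_SCORES = {
    "Trusted": 1.0,
    "Neutral": 0.75,
    "Unverified": 0.5,
    "Suspicious": 0.25,
    "Malicious": 0.0,
}

def normalize_ip_trust_score(category: str) -> float:
    """Normalize IPTrustScore: map a TrustedSource category onto this
    system's range of 0.0 (no trust) to 1.0 (maximum trust)."""
    return TRUST_SCORES.get(category, 0.0)  # treat unknown categories as untrusted

print(normalize_ip_trust_score("Suspicious"))  # 0.25
```

Defaulting unknown categories to 0.0 is a conservative design choice: a reporter we can't vouch for contributes no weight.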
There is one input for this model:
There are two processes for this model, one to accumulate the total abuse score, and another to decide when to alert the outer application:
The first process retrieves the WebReputation stored in the reputation statement with the same target IP address as was provided with the flag input message. The AbuseScore starts at 0 and is increased by the value of Flag multiplied by WebReputation, and the result is stored back into a reputation statement with an aggregate source, a numeric score type, and the comment identifier as the target. This statement is passed in a message to the Is Abusive? process.

The Is Abusive? process compares AbuseScore against an internal constant, AbuseThreshold, in order to decide whether it needs to inform the application that the target comment requires special attention. In reputation sandbox implementations that wait for execution to complete, the result is returned as TRUE or FALSE, indicating whether the comment is considered to be abusive. For high-performance optimistic, or fire-and-forget, reputation platforms, like the one described in Chapter_3 , an asynchronous alert is triggered only when the result is TRUE.

With the introduction of multiple models, external variables, and results handling, our basic modeling grammar is complete.
If you'd like to dive right in and see real-world examples of models, peek ahead to Chapter_5 . If you're a bit more technical and want to know more about how to implement reputation sandboxes and how to deal with issues such as reliability, reversibility, and scale, continue on to Chapter_3 . Otherwise, move on to Chapter_4 , where we cover in more detail how to choose the reputation components and models that are right for your specific application.