A (Graphical) Grammar for Reputation

The phrase reputation system describes a wide array of practices, technologies, and user interface elements. In this chapter, we'll build a visual “grammar” to describe the attributes, processes, and presentation of reputation systems. We'll use this grammar throughout subsequent chapters to describe existing reputation systems and to define new ones. You should be able to use this grammar as well, both to understand and diagram common reputation systems and to design systems of your own.

Meta-modeling

A formalized specification of domain-specific notations … following a strict rule set.

Much reputation-related terminology is inconsistent, confusing, and even contradictory, depending on what site you visit or which expert opinion you read. Over the last 30 years, we've evaluated and developed scores of online and offline reputation systems. We've identified many concepts and attributes common to them all: enough similarity that we're proposing a common lexicon and a “graphical grammar” (the common concepts, attributes, and methods involved) to build a foundation for a shared understanding of reputation systems.

Why propose a graphical grammar? Reputation is an abstract concept, and deploying it usually requires the participation of many people. In practice, we've consistently seen that having a two-dimensional drawing of the reputation model facilitates the design and implementation phases of a project. Capturing the relationships between the inputs, messages, processes, and outputs in a compact, simple, and accessible format is indispensable. Think of it like a screen mockup, but for a critical, normally invisible, part of the application.

In describing this grammar, we'll borrow a metaphor from basic chemistry: atoms (reputation statements) and their constituent particles (sources, claims, and targets) are bound with forces (messages and processes) to make up molecules (reputation models), which constitute the core useful substances in the universe. Sometimes different molecules are mixed in solutions (reputation systems) to catalyze the creation of stronger, lighter, or otherwise more useful compounds.

The graphical grammar of reputation systems is continually evolving as the result of changing markets and technologies. Visit this book's companion wiki at http://buildingreputation.com for up-to-date information and to participate in this grammar's evolution.

The Reputation Statement and Its Components

Figure 2-1: Much like in archery, anyone can fire a claim at anything. It doesn't necessarily mean the claim is accurate. Throughout this book, claims will be represented by this stylized arrow shape.

As we proceed with our grammar, you'll notice that reputation systems compute many different reputation values that turn out to possess a single common element: the reputation statement. In practice, most inputs to a reputation model are either already in the form of reputation statements or are quickly transformed into them for easy processing.

Just as matter is made up of atoms, reputation is made up of reputation statements.

Like atoms, reputation statements always have the same basic components (electrons, protons, neutrons), but they vary in specific details. Some are about people, some are about products. Some are numeric, some are votes, some are comments. Many are created directly by users, but a surprising number are created by software.

Any single atom always has certain particles (protons, neutrons, and electrons). The configurations and numbers of those particles determine the specific properties of an element when it occurs en masse in nature: for example, an element may be stable or volatile, gaseous or solid, and radioactive or inert. But every object with mass is made up of atoms.

The reputation statement is like an atom in that it, too, has constituent particles: a source, a claim, and a target. See Figure 2-1. The exact characteristics (type and value) of each particle determine what type of element it is and its use in your application.
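To make the structure concrete, here is a minimal sketch in Python of a reputation statement and its three particles. The class and field names are our own illustration, not taken from any particular reputation platform:

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class ReputationStatement:
    """One 'atom' of reputation: a source makes a claim about a target."""
    source: str                          # who or what is making the claim
    claim_type: str                      # e.g., "5-star rating", "thumbs-up"
    claim_value: Union[int, float, str]  # e.g., 4 (stars), 1.0 (a vote), or text
    target: str                          # the entity the claim is about

# The statement illustrated in Figure 2-1: a 4-out-of-5-star rating.
statement = ReputationStatement(
    source="user:lara",
    claim_type="5-star rating",
    claim_value=4,
    target="restaurant:dessert-hut",
)
```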

Reputation Sources: Who (or What) Is Making a Claim?

Every reputation statement is made by someone or something. A claim whose author is unknown is impossible to evaluate: the statement “Some people say product X is great” is meaningless. Who are “some people”? Are they like me? Do they work for the company that makes product X? Without knowing something about who or what made a claim, you can make little use of it.

To start building our grammar from the ground up, we need a few primitive objects: Entities, Sources, Users, and a special group object, the Aggregate Source.

This sounds like a special exception, but it isn't; this is the very nature of reputation systems, even in life. Claims that a movie is number one at the box office don't give you a detailed list of everyone who bought a ticket, nor should they. Such a claim always comes with the name of an aggregation source, such as “… according to Billboard Magazine.”

Reputation Claims: What Is the Target's Value to the Source? On What Scale?

The claim is the value that the source assigned to the target in the reputation statement. Each claim is of a particular claim type and has a claim value. Figure 2-1 shows a claim with a 5-star rating claim type, and this particular reputation statement has a claim value of 4 (stars).

Figure 2-2: A number of common claim types, targeted at a variety of reputable entities.

A normalized score is often easier to read than trying to guess what 3 stars means, since we're trained to understand the 0-100 scale early in life, and transforming a normalized number to 0-100 is trivial to do in one's head. For example, if the community indicated that it was 0.9841 (normalized) in support of your product, you instantly know that this is a very good thing.
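As a quick sketch of that head math: the functions below assume one common normalization choice, mapping 1 star to 0.0 and the maximum star rating to 1.0 (other mappings are possible):

```python
def normalize_stars(stars: int, max_stars: int = 5) -> float:
    """Map a 1..max_stars rating onto the normalized 0.0-1.0 range.
    (Mapping 1 star to 0.0 and max_stars to 1.0 is an assumption.)"""
    return (stars - 1) / (max_stars - 1)

def as_percent(normalized: float) -> int:
    """The trivial transformation onto the familiar 0-100 scale."""
    return round(normalized * 100)

print(normalize_stars(3))   # 0.5 -- what "3 stars" means, normalized
print(as_percent(0.9841))   # 98  -- instantly readable as "very good"
```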

Reputation Targets: What (or Who) Is the Focus of a Claim?

A reputation statement is always focused on some unique identifiable entity: the target of the claim. Reputations are assigned to targets, for example, a new eatery. Later, the application queries the reputation database, supplying the same eatery's entity identifier to retrieve its reputation for display: “Yahoo! users rated Chipotle Restaurant 4 out of 5 stars for service.” The target is left unspecified (or only partially specified) in database requests based on claims or sources: “What is the best Mexican restaurant near here?” or “What ratings has Lara given for restaurants?”
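A minimal sketch of these two query directions, using a hypothetical in-memory list of (source, claim type, claim value, target) tuples; the store and names are ours for illustration:

```python
# Hypothetical store of (source, claim_type, claim_value, target) statements.
statements = [
    ("user:lara",   "5-star rating", 4, "restaurant:chipotle"),
    ("user:lara",   "5-star rating", 5, "restaurant:dessert-hut"),
    ("user:zary22", "5-star rating", 3, "restaurant:chipotle"),
]

def reputation_of(target: str):
    """Fully specified target: 'How is this eatery rated?'"""
    return [s for s in statements if s[3] == target]

def ratings_by(source: str):
    """Target left unspecified: 'What ratings has Lara given?'"""
    return [s for s in statements if s[0] == source]

print(reputation_of("restaurant:chipotle"))
print(ratings_by("user:lara"))
```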

Molecules: Constructing Reputation Models Using Messages and Processes

Figure 2-3: This is almost the simplest reputation model you'll find. Users endorse articles, and the sum of their votes is displayed by that article.

Just as molecules combine many different atoms in various configurations to produce materials with unique and valuable qualities, reputation models derive their power from aggregating reputation statements from many sources, often statements of different types. Instead of concerning ourselves with valence and van der Waals forces, in reputation models we bind the atomic units, the reputation statements, together with messages and processes.

In the simple reputation model presented in Figure 2-3, messages are represented by arrows and flow in the direction indicated. The boxes are the processes; they contain descriptions of the processes that interpret the activating message to update a reputation statement and/or send one or more messages on to other processes. As in chemistry, the entire process is simultaneous: messages may come in at any time, and multiple messages may take different paths through a complex reputation model at the same time.

As we proceed, people often become confused about the scope of a reputation and where to draw the lines between multiple reputations, so we need a few definitions:

Yahoo! Local, Travel, Movies, TV, etc., are all examples of ratings-and-reviews reputation models. eBay's seller feedback model, in which users' ratings of transactions are reflected in sellers' profiles, is a karma reputation model. The example in Figure 2-3 is one of the simplest possible models and was inspired by the “Digg it” vote-to-promote reputation model (see Chapter 6) made popular by Digg.com.

Messages and Processes

Again, look at the simplest reputation model diagram, Figure 2-3: the input reputation statement appears on the left and is delivered as a message to the reputation process box. Messages and processes make up the working mechanics of the reputation model.

If, on the other hand, an event may need to be displayed or reversed in the future (e.g., if a user abuses the reputation model), it is said to be reversible and must be stored either in an external file, such as a log, or as a reputation statement. Most rating-and-review models have reversible inputs. But in very large-scale systems, such as IP address reputations that identify mail spammers, it's too costly to store a separate input event for every email received. For those reputation models, the transient input method is appropriate.

Many reputation processes use message input to transform a reputation statement. In our example in Figure 2-3, when a user clicks “I Digg this URL,” the application sends the input event to a reputation process that is a simple counter: CountOfVotes. This counter is a stored reputation value: it is read, incremented by one, and stored again. This brings the reputation database up to date, and the application may use the target identifier (in Digg's case, a URL) to get the claim value.
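Here is a sketch of that counter process, assuming a simple in-memory store keyed by target; the store and function names are ours, not Digg's:

```python
# Stored roll-up reputation values, keyed by target (for Digg, a URL).
vote_counts: dict = {}

def count_of_votes(target: str) -> int:
    """Reputation process: handle one vote input message by reading
    the stored value, incrementing it, and storing it again."""
    vote_counts[target] = vote_counts.get(target, 0) + 1
    return vote_counts[target]

count_of_votes("http://example.com/article")
count_of_votes("http://example.com/article")
print(vote_counts["http://example.com/article"])  # 2
```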

Reputation Model Explained: Vote-To-Promote

Figure 2-3 is the first of many reputation model diagrams in this book. We follow each diagram with a detailed explanation.

This model is a simple accumulator: it counts votes for a target object. It can count click-throughs or thumbs-ups, or it can mark an item as a favorite.

Even though most models allow multiple input messages, for clarity we're presenting a simplified model that only has one, in the form of a single reputation statement.

  1. As users take actions, they cause votes to be recorded and start the reputation model by sending them as messages (represented by the arrows) to the raw sum of votes process.

Likewise, whereas models typically have many processes, this example has only one.

  1. Raw sum of votes: When vote messages arrive, the CountOfVotes counter is incremented and stored in a reputation statement of the claim type Counter, set to the value of CountOfVotes and with the same target as the originating vote. The source of this statement is said to be aggregate because it is a roll-up: the product of many inputs from many different sources.

See Chapter 3 for a detailed list of common reputation process patterns and Chapter 6 and Chapter 7 for a discussion of the effects of various score classes on a user interface.

Building on the Simplest Model

Figure 2-4: A slightly more evolved model. Now articles are ranked not only according to endorsements but also by the amount of discussion they generate.

Figure 2-4 shows a fuller representation of a Digg.com-like vote-to-promote reputation model. This example adds a new element for determining community interest in an article: a reputation for the level of user activity, measured by comments left on the target entity. These two scores are weighted and combined to produce a combined rating.

The input messages take the form of two reputation statements:

  1. When a user endorses an article, the thumbs-up vote is represented as a 1.0 and sent as a message to the raw sum of votes process. If a previous thumbs-up vote is withdrawn, a score of 0.0 is sent instead.
  2. When a user comments on an article, the comment is counted as activity, represented as a 1.0, and sent as a message to the level of activity process. If the user later deletes the comment, a score of 0.0 is sent to undo the earlier count.

This model involves the following reputation processes:

  1. Raw sum of votes: This process either increments (if the input is 1.0) or decrements a roll-up reputation statement containing a simple accumulator called CountOfVotes. The process stores the new value and sends it in a message to another process, community interest rank.
  2. Level of activity: This process either increments (if the input is 1.0) or decrements a roll-up reputation statement containing a simple accumulator called ActivityLevel. It stores the new value back into the statement and sends it in a message to another process, community interest rank.
  3. Community interest rank: This process always recalculates a roll-up reputation statement containing a weighted sum called Interest, which is the value that the application uses to rank the target article in search results and in other page displays. The calculation uses a local constant, Weighting, to combine the values of the CountOfVotes and ActivityLevel scores disproportionately; in this example, an endorsement is worth 10 times the interest score of a single comment. The resulting Interest score is stored in a typical roll-up reputation statement: aggregate source, numeric score, and the target shared by all of the inputs. (A sketch of all three processes in code follows this list.)
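The sketch below assumes the weighted sum takes the form Interest = Weighting * CountOfVotes + ActivityLevel; the 10x weighting matches the example above, while the storage and function names are our own illustration:

```python
WEIGHTING = 10.0  # an endorsement is worth 10 comments' worth of interest

count_of_votes: dict = {}   # roll-up: endorsements per article
activity_level: dict = {}   # roll-up: comments per article
interest: dict = {}         # roll-up: combined rank per article

def raw_sum_of_votes(target: str, value: float) -> None:
    """Input 1.0 adds an endorsement; 0.0 withdraws a previous one."""
    count_of_votes[target] = count_of_votes.get(target, 0) + (1 if value == 1.0 else -1)
    community_interest_rank(target)

def level_of_activity(target: str, value: float) -> None:
    """Input 1.0 counts a new comment; 0.0 removes a deleted one."""
    activity_level[target] = activity_level.get(target, 0) + (1 if value == 1.0 else -1)
    community_interest_rank(target)

def community_interest_rank(target: str) -> None:
    """Always recalculate the weighted sum when either input changes."""
    interest[target] = (WEIGHTING * count_of_votes.get(target, 0)
                        + activity_level.get(target, 0))

raw_sum_of_votes("article:42", 1.0)   # one endorsement...
level_of_activity("article:42", 1.0)  # ...and one comment
print(interest["article:42"])         # 11.0
```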

Complex Behavior: Containers and Reputation Statements as Targets

Figure 2-5: A reputation container: some claims make more sense when considered together than standing alone.

Just as some interesting molecules in nature are held together by especially strong bonds, certain types of reputation statements, called containers, join multiple closely related statements into one super-statement. Most websites with user-generated ratings and comments for products or services provide examples of this kind of reputation statement: they clump together different star ratings with a text comment into an object formally called a review. See Figure 2-5 for a typical example: restaurant reviews.

Containers are useful devices for organizing closely related reputation statements. While it's technically true that each individual component of a container could be represented and addressed as a statement of its own, that arrangement would be semantically sloppy and lead to unnecessary complexity in your model. The container model maps well to real life. You probably wouldn't think of the series of statements about Dessert Hut made by user Zary22 as a rapid-fire stream of individual opinions; you'd more likely consider them related, and influenced by one another. Taken as a whole, the statements express Zary22's experience at Dessert Hut.

A container is a compound reputation statement with multiple claims for the same source and target.
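A sketch of such a container, with several claims bound to one source and one target; the particular claim fields are our own illustration of a restaurant review:

```python
from dataclasses import dataclass

@dataclass
class Review:
    """A container: one compound reputation statement holding multiple
    claims that share the same source and target."""
    source: str   # e.g., "user:zary22"
    target: str   # e.g., "restaurant:dessert-hut"
    overall: int  # a star-rating claim
    service: int  # another star-rating claim
    comment: str  # a freeform text claim

review = Review(source="user:zary22", target="restaurant:dessert-hut",
                overall=4, service=5, comment="Great pie, slow service.")
```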

Figure 2-6: A reputation statement can itself be the target of a helpful vote. Here, MuvyLuvr has written a review that others can then rate.

Once a reputation statement exists in your system, consider how you might make it a reputable entity itself, as in Figure 2-6. This indirection provides for subtle and powerful feedback. For example, people regularly form their own opinions about the opinions of others based on some external criteria or context. (“Jack hated The Dark Knight, but he and I never see eye to eye anyway.”)

Figure 2-7: Helpful votes for a review are rolled up into an aggregate HelpfulScore. It's often more efficient to just store this score back to the review container for easy retrieval.

Another feature of review-based reputation systems is that they often incorporate a form of built-in user feedback about reviews written by other users. We'll call this feature the was this helpful? pattern. (See Figure 2-7.) When a user indicates whether a review was helpful or not, the target is a review (container) written earlier by a different user.

The input message takes the form of a single reputation statement:

  1. A user votes on the quality of another reputation statement, in this case a review: a thumbs-up vote is represented by a 1.0 value, and a thumbs-down vote by a 0.0 value.

This model includes only one reputation process:

  1. Calculate helpful score: When the message arrives, load the stored TotalVotes reputation value, increment it, and store it again. If the vote is not 0.0, also increment HelpfulVotes. Finally, set HelpfulScore to a text representation of the score suitable for display: “HelpfulVotes out of TotalVotes.” This representation is usually stored in the very same review container that the voter was judging (i.e., had targeted) as helpful. This configuration simplifies indexing and retrieval, e.g., “Retrieve a list of the most helpful movie reviews by MuvyLuvr” and “Sort the list of movie reviews of Aliens by helpful score.” Though the original review writer isn't the author of his helpful votes, his review is responsible for them and should contain them. (A code sketch of this process follows.)
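A minimal sketch of the calculate helpful score process, storing the roll-up back in the judged review container as described; field and function names are illustrative:

```python
# Review containers, keyed by a review identifier. The helpful-score
# roll-up lives inside the same container it judges.
reviews = {
    "review:MuvyLuvr:aliens": {"total_votes": 0, "helpful_votes": 0,
                               "helpful_score": ""},
}

def calculate_helpful_score(review_id: str, vote: float) -> None:
    """vote is 1.0 for thumbs-up ('helpful'), 0.0 for thumbs-down."""
    r = reviews[review_id]
    r["total_votes"] += 1          # every vote counts toward the total
    if vote != 0.0:
        r["helpful_votes"] += 1    # only thumbs-up counts as helpful
    # Display form, stored back into the same review container:
    r["helpful_score"] = f"{r['helpful_votes']} out of {r['total_votes']}"

calculate_helpful_score("review:MuvyLuvr:aliens", 1.0)
calculate_helpful_score("review:MuvyLuvr:aliens", 0.0)
print(reviews["review:MuvyLuvr:aliens"]["helpful_score"])  # "1 out of 2"
```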

You'll see variations on this simple pattern of reputation-statements-as-targets repeated throughout this book. It makes it possible to build fairly advanced meta-moderation capabilities into a reputation system. Not only can you ask a community “What's good?”; you can also ask “…and whom do you believe?”

Solutions: Mixing Models to Make Systems

Figure 2-8: Two or more separate models can work in concert to create a larger, more robust reputation system.

In one more invocation of our chemistry metaphor, consider that physical materials are rarely made of a single type of molecule. Chemists combine molecules into solutions, compounds, and mixtures to get the exact properties they want. But not all substances mix well; consider oil and water. The same is true when combining multiple reputation model contexts in a single reputation system: it's important to combine only models with compatible reputation contexts.

Figure 2-8 shows a simple abuse reporting system that integrates two different reputation models: a weighted voting model that takes its weights, for the IP addresses of the abuse reporters, from an external karma system. This example also illustrates an explicit output, common in many implementations. In this case, the output is an event sent to the application environment suggesting that the target comment be dealt with: the content would need to be reviewed and either hidden or destroyed. Many services also consider punitive action against the content creator's account, such as suspension of access.

For the reporter trustworthiness context, the inputs and mechanism of this external reputation model are opaque (we don't know how the model works) because it runs on a foreign service, namely TrustedSource.org by McAfee, Inc. Their service provides us with one input, and it is different from previous examples:

  1. When the reputation system is prompted to request a new trust score for a particular IP address (perhaps by a periodic timer or on demand by external means), it retrieves the TrustedSourceReputation as input using the web service API, here represented as a URL. The result is one of the following categories: Trusted, Neutral, Unverified, Suspicious, or Malicious, which the system passes to the Normalize IPTrustScore process.

This message arrives at a reputation process that transforms the external IP reputation, in a published API format, into the reputation system's normalized range:

  1. Normalize IPTrustScore: Using a transformation table (sketched below), the system normalizes the TrustedSourceReputation into WebReputation, with a range from 0.0 (no trust) to 1.0 (maximum trust). The system stores the normalized score in a reputation statement with a source of TrustedSource.org, a claim type of simple karma, a claim value equal to the normalized WebReputation, and a target of the IP address being evaluated.
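One plausible transformation table is sketched below. The five category names come from the API described above, but the numeric values assigned to them are our own illustrative guesses, not McAfee's published mapping:

```python
# Assumed normalization table; category names are from the
# TrustedSource API above, the numbers are illustrative.
TRUST_TABLE = {
    "Trusted":    1.0,
    "Neutral":    0.75,
    "Unverified": 0.5,
    "Suspicious": 0.25,
    "Malicious":  0.0,
}

web_reputation: dict = {}   # target IP address -> normalized simple karma

def normalize_ip_trust_score(ip_address: str, category: str) -> float:
    """Store and return WebReputation in the 0.0 (no trust)
    to 1.0 (maximum trust) range."""
    web_reputation[ip_address] = TRUST_TABLE[category]
    return web_reputation[ip_address]

print(normalize_ip_trust_score("192.0.2.7", "Suspicious"))  # 0.25
```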

The main context of this reputation system is content quality: the model is designed to collect flags from users whenever they think that the comment in question violates the site's terms of service. When enough users whose IP addresses carry a high enough reputation flag the content, the reputation system sends out a special event. This reputation model is a weighted voting model.

This model has one input:

  1. Input: A user, connected using a specific IP address, flags a target comment as violating the site's terms of service. The value of the flag is always 1.0 and is sent to the abusive content score process.

This model involves two processes, both sketched in code after the list: one to accumulate the total abuse score, and another to decide when to alert the outer application.

  1. The abusive content score process uses one external variable: WebReputation, stored in a reputation statement whose target is the same IP address as was provided with the flag input message. The AbuseScore starts at 0 and is increased by the value of Flag multiplied by WebReputation. The system stores the score in a reputation statement with an aggregate source, a numeric score type, and the comment identifier as the target, then passes the statement in a message to the is abusive? process.
  2. Is abusive? then tests the AbuseScore against an internal constant, AbuseThreshold, to determine whether to highlight the target comment for special attention by the application. In a simple (request-reply) framework implementation, the reputation system returns the result of the is abusive? process as TRUE or FALSE to indicate whether the comment is considered abusive. In high-performance asynchronous (fire-and-forget) reputation platforms, like the one described in Appendix A, an alert is triggered only when the result is TRUE.
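Here is a sketch of both processes in a simple request-reply style. The AbuseThreshold value and all names are illustrative, and WebReputation is assumed to come from the normalization step sketched earlier:

```python
ABUSE_THRESHOLD = 2.0   # illustrative value; tune for your application
web_reputation = {"192.0.2.7": 0.25, "198.51.100.4": 1.0}  # from the karma model
abuse_scores: dict = {}  # target comment ID -> AbuseScore

def abusive_content_score(comment_id: str, reporter_ip: str,
                          flag: float = 1.0) -> bool:
    """AbuseScore += Flag * WebReputation, then test the threshold."""
    weight = web_reputation.get(reporter_ip, 0.0)
    abuse_scores[comment_id] = abuse_scores.get(comment_id, 0.0) + flag * weight
    return is_abusive(comment_id)

def is_abusive(comment_id: str) -> bool:
    """TRUE tells the application to review, hide, or destroy the comment."""
    return abuse_scores.get(comment_id, 0.0) >= ABUSE_THRESHOLD

print(abusive_content_score("comment:99", "192.0.2.7"))     # False (0.25)
print(abusive_content_score("comment:99", "198.51.100.4"))  # False (1.25)
print(abusive_content_score("comment:99", "198.51.100.4"))  # True  (2.25)
```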

From Reputation Grammar To…

This chapter defined the graphical reputation grammar, from bottom (reputation statements made of sources, claims, and targets) to top (models, systems, and frameworks). All members of any team defining a reputation-enhanced application should find this reference material helpful in understanding their reputation systems.

But at this point, different functional roles on the team might want to look at different parts of the book next: