## Reconciling TunkRank and PageRank

March 13, 2009 at 12:28 am | Posted in research, socialmedia | 1 Comment
Tags: , , ,

These days I encountered some twitter.com pages (incl. tweets and twitter user profile pages) in some Google search results. This strikes me on the basic fact that Google already assigns a PageRank score to each tweet and twitter user when it crawls the site twitter.com.  Can we just use the search giant to rank tweets and twitter users?

TunkRank, designed by Daniel Tunkelang, measures the influence of a twitter user by the possible number of times his tweet is read, and then retweeted by his followers, the followers of these followers, and so on. In his algorithm, Daniel recognizes that a user following too many people cannot pay much attention to tweets of his friends. The Influence(X) of a user X is calculated as follows, where p is the probability that an arbitrary user retweets.

$Influence(X) = \sum_{Y\in Follower(X)} (1+p\times Influence(Y) ) / | Following(Y)|$

Daniel also acknowledged that Influence() resembles PageRank. To make the similarity easier to see, let’s write $Y \rightarrow X$ to mean that user Y follows X, and denote TunkRank Influence(X) measure using a shorter notation TR(X):

$TR(X) = \sum_{ Y\rightarrow X} (1+p\times TR(Y)) / | \{ z: Y\rightarrow z \} |$,

i.e.,

$TR(X) = A(X) + p \sum_{ Y\rightarrow X} TR(Y) / | \{ z: Y\rightarrow z \} |$

where $A(X) = \sum_{ Y\rightarrow X} 1/| \{ z: Y\rightarrow z \} |$ measures the degree of attention the user X receives in the twittersphere. $| \{ z: Y\rightarrow z \} |$ is the number of users Y follows.

Compare this with PageRank PR(X) for a web page X, with $X \rightarrow Y$ meaning that page X links to page Y, d the dumping factor, and N the total number of pages. (from Wikipedia)

$PR(X) = \frac{1-d}{N} + d \sum_{Y \rightarrow X} PR(Y) / | \{ z: Y\rightarrow z\}|$

This formulation reveals the similarity between TunkRank and PageRank. We can say that TunkRank exhibits a ‘random meme back-tracker’ model, in the same spirit as ‘random surfer’ model in PageRank. The back-tracker traces a tweet from a follower to a followee. Occasionally, the back-tracker randomly jumps to a user with probability proportional to the attention A(X) that user enjoys.

We may ask Google to approximately calculate TR(X) by making one page for each user, and linking followers’ page to followees’ pages. Notice that the non-uniform attention estimate A(X) in Daniel’s algorithm might be an important insight to rank twitter users.