Reconciling TunkRank and PageRank

March 13, 2009 at 12:28 am | Posted in research, socialmedia | 1 Comment
Tags: , , ,

These days I encountered some twitter.com pages (incl. tweets and twitter user profile pages) in some Google search results. This strikes me on the basic fact that Google already assigns a PageRank score to each tweet and twitter user when it crawls the site twitter.com.  Can we just use the search giant to rank tweets and twitter users?

TunkRank, designed by Daniel Tunkelang, measures the influence of a twitter user by the possible number of times his tweet is read, and then retweeted by his followers, the followers of these followers, and so on. In his algorithm, Daniel recognizes that a user following too many people cannot pay much attention to tweets of his friends. The Influence(X) of a user X is calculated as follows, where p is the probability that an arbitrary user retweets.

Influence(X) = \sum_{Y\in Follower(X)} (1+p\times Influence(Y) ) / | Following(Y)|

Daniel also acknowledged that Influence() resembles PageRank. To make the similarity easier to see, let’s write Y \rightarrow X to mean that user Y follows X, and denote TunkRank Influence(X) measure using a shorter notation TR(X):

TR(X) = \sum_{ Y\rightarrow X} (1+p\times TR(Y)) / | \{ z: Y\rightarrow z \} | ,

i.e.,

TR(X) = A(X) + p \sum_{ Y\rightarrow X} TR(Y) / | \{ z: Y\rightarrow z \} |

where A(X) = \sum_{ Y\rightarrow X} 1/| \{ z: Y\rightarrow z \} | measures the degree of attention the user X receives in the twittersphere. | \{ z: Y\rightarrow z \} | is the number of users Y follows.

Compare this with PageRank PR(X) for a web page X, with X \rightarrow Y meaning that page X links to page Y, d the dumping factor, and N the total number of pages. (from Wikipedia)

PR(X) = \frac{1-d}{N} + d \sum_{Y \rightarrow X} PR(Y) / | \{ z: Y\rightarrow z\}|

This formulation reveals the similarity between TunkRank and PageRank. We can say that TunkRank exhibits a ‘random meme back-tracker’ model, in the same spirit as ‘random surfer’ model in PageRank. The back-tracker traces a tweet from a follower to a followee. Occasionally, the back-tracker randomly jumps to a user with probability proportional to the attention A(X) that user enjoys.

We may ask Google to approximately calculate TR(X) by making one page for each user, and linking followers’ page to followees’ pages. Notice that the non-uniform attention estimate A(X) in Daniel’s algorithm might be an important insight to rank twitter users.

Advertisements

1 Comment »

RSS feed for comments on this post. TrackBack URI

  1. Thanks for the analysis–it’s nice to see it laid out formally. And yes, the on-uniform attention estimate is what I feel is my main contribution to this discussion.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Create a free website or blog at WordPress.com.
Entries and comments feeds.

%d bloggers like this: