# 社交网络分析与挖掘 第六课：网页排序

The History of PageRank PageRank is a link analysis algorithm which assigns a numerical weighting to each Web page，with the purpose of "measuring"relative importance. Based on the hyperlinks map An ex

### The History of PageRank

• PageRank is a link analysis algorithm which assigns a numerical weighting to each Web page，with the purpose of "measuring"relative importance.
• Based on the hyperlinks map
• An excellent way to prioritize the results of web keyword searches
• PageRank was proposed by Larry Page（hencethe name Page-Rank）and Sergey Brin in 1998.
• Shortly after，Page and Brin founded Google.

### PageRank:the intuitive idea

• Although page E hasmore"point to"links than page C, page C is more important than page E.
• One reason is that page C is linked by page B whose is the most important page in this network.

### A Simple Version of PageRank

$R(u)=c \sum_{v \in B_{u}} \frac{R(v)}{N_{v}}$

• u：a web page
• N(v)：the number of forward links of page v
• c：the normalization factor to make

$\|\mathrm{R}\|_{\mathrm{L} 1}=\left|\mathrm{R}_{1}+\ldots+\mathrm{R}_{\mathrm{n}}\right|$

### A Problem with Simplified PageRank

A loop:

During each iteration,the loop accumulates rank but never distributes rank to other pages!

### Fix the problem:what PageRank does

• Randomly walking and also randomly jumping
• the“random surfer”simply keeps clicking successive links at random，but periodically “gets bored”and jumps to a random page based on the distribution

### The actual PageRank

$R^{\prime}(u)=c_{1} \sum_{v \in B_{u}} \frac{R^{\prime}(v)}{N_{v}}+\mathrm{c}_{2} E(u)$

• E(u)：a distribution of ranks of web pages that "users"jump to when they"gets bored"after successive links at random.

### Convergence Experiments

• Scaling factor is roughly linear in logn

### Conclusion

• PageRank rank web pages based on their locations in the web graph structure
• PageRank uses information which is external to the web pages-backlinks
• Backlinks from important pages are more significant than backlinks from common pages
• The structure of the web graph is very useful for information retrieval tasks.