Summary: Internet Mathematics Vol. 1, No. 3: 335-380
Deeper Inside PageRank
Amy N. Langville and Carl D. Meyer
Abstract. This paper serves as a companion or extension to the "Inside PageRank"
paper by Bianchini et al. [Bianchini et al. 03]. It is a comprehensive survey of all
issues associated with PageRank, covering the basic PageRank model, available and
recommended solution methods, storage issues, existence, uniqueness, and convergence
properties, possible alterations to the basic model, suggested alternatives to the traditional
solution methods, sensitivity and conditioning, and finally the updating problem.
We introduce a few new results, provide an extensive reference list, and speculate about
exciting areas of future research.
Many of today's search engines use a two-step process to retrieve pages related
to a user's query. In the first step, traditional text processing is done to find
all documents using the query terms, or related to the query terms by semantic
meaning. This can be done by a look-up into an inverted file, with a vector space
method, or with a query expander that uses a thesaurus. With the massive size
of the web, this first step can result in thousands of retrieved pages related to
the query. To make this list manageable for a user, many search engines sort
this list by some ranking criterion. One popular way to create this ranking is