U.S. Department of Energy
Office of Scientific and Technical Information

Higher-order web link analysis using multilinear algebra.

Technical Report · DOI: https://doi.org/10.2172/974401 · OSTI ID: 974401

Linear algebra is a powerful and proven tool in web search. Techniques such as the PageRank algorithm of Brin and Page and the HITS algorithm of Kleinberg score web pages based on the principal eigenvector (or singular vector) of a particular non-negative matrix that captures the hyperlink structure of the web graph. We propose and test a new methodology that uses multilinear algebra to elicit more information from a higher-order representation of the hyperlink graph. We start by labeling the edges of the graph with the anchor text of the hyperlinks, so that the associated linear algebra representation is a sparse three-way tensor. The first two dimensions of the tensor represent the web pages, while the third dimension represents the anchor text. We then use the rank-1 factors of a multilinear PARAFAC tensor decomposition, which are akin to the singular vectors of the SVD, to automatically identify topics in the collection along with the associated authoritative web pages.
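To make the tensor representation concrete, the following is a minimal sketch in pure Python, not the report's implementation. It stores a sparse three-way tensor as a dictionary keyed by (source page, target page, anchor term) and extracts a single rank-1 factor with the higher-order power method, a simple stand-in for a full multi-factor PARAFAC fit. The page names, the anchor terms, and the function names `normalize` and `rank1_factor` are all hypothetical and chosen only for illustration.

```python
from math import sqrt

# Hypothetical toy data: one (source page, target page, anchor term) triple per hyperlink.
links = [
    ("a", "c", "tensor"),
    ("b", "c", "tensor"),
    ("a", "c", "algebra"),
    ("d", "e", "cooking"),
]


def normalize(vec):
    """Scale a sparse vector (dict) to unit Euclidean length."""
    norm = sqrt(sum(v * v for v in vec.values()))
    return {k: v / norm for k, v in vec.items()} if norm > 0 else vec


def rank1_factor(triples, iterations=50):
    """Approximate the dominant rank-1 factor of the link tensor.

    Returns (weight, hubs, authorities, terms). This is the higher-order
    power method, an assumption made for this sketch rather than the
    algorithm used in the report.
    """
    # Sparse tensor: entry (i, j, k) counts links from page i to page j with term k.
    tensor = {}
    for triple in triples:
        tensor[triple] = tensor.get(triple, 0.0) + 1.0

    hubs = normalize({i: 1.0 for i, _, _ in tensor})
    auths = normalize({j: 1.0 for _, j, _ in tensor})
    terms = normalize({k: 1.0 for _, _, k in tensor})

    for _ in range(iterations):
        # Update each mode in turn by contracting the tensor with the other two.
        new_hubs = dict.fromkeys(hubs, 0.0)
        for (i, j, k), x in tensor.items():
            new_hubs[i] += x * auths[j] * terms[k]
        hubs = normalize(new_hubs)

        new_auths = dict.fromkeys(auths, 0.0)
        for (i, j, k), x in tensor.items():
            new_auths[j] += x * hubs[i] * terms[k]
        auths = normalize(new_auths)

        new_terms = dict.fromkeys(terms, 0.0)
        for (i, j, k), x in tensor.items():
            new_terms[k] += x * hubs[i] * auths[j]
        terms = normalize(new_terms)

    # The factor weight plays the role of a singular value in the matrix case.
    weight = sum(x * hubs[i] * auths[j] * terms[k] for (i, j, k), x in tensor.items())
    return weight, hubs, auths, terms


weight, hubs, authorities, terms = rank1_factor(links)
print("top authority:", max(authorities, key=authorities.get))
print("top term:", max(terms, key=terms.get))
```

In this toy example the dominant factor groups the pages linked with "tensor" and "algebra" into one topic, with the highest-scoring target page acting as its authority. A full PARAFAC decomposition would fit several such factors, one per topic.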

Research Organization:
Sandia National Laboratories
Sponsoring Organization:
USDOE
DOE Contract Number:
AC04-94AL85000
Report Number(s):
SAND2005-4548
Country of Publication:
United States
Language:
English
