Summary: Policies for Caching OLAP Queries
in Internet Proxies
Thanasis Loukopoulos and Ishfaq Ahmad, Senior Member, IEEE
Abstract--The Internet now offers users more than just simple information. Decision makers can now issue analytical, as
opposed to transactional, queries that involve massive data (such as aggregations of millions of rows in a relational database) in order
to identify useful trends and patterns. Such queries are often referred to as On-Line Analytical Processing (OLAP) queries. Typically, pages
carrying query results do not exhibit temporal locality and, therefore, are not considered for caching at Internet proxies. In OLAP
processing, this is a major problem, as the cost of these queries is significantly higher than that of transactional queries. This paper
proposes a technique to reduce the response time for OLAP queries originating from geographically distributed private LANs and
issued through the Web toward a central data warehouse (DW) of an enterprise. An active caching scheme is introduced that enables
the LAN proxies to cache some parts of the data, together with the semantics of the DW, in order to process queries and construct the
resulting pages. OLAP queries arriving at the proxy are either satisfied locally or from the DW, depending on the relative access costs.
We formulate a cost model for characterizing the respective latencies, taking into consideration the combined effects of both common
Web access and query processing. We propose a cache admittance and replacement algorithm that operates on a hybrid Web-OLAP
input, outperforming both pure-Web and pure-OLAP caching schemes.
Index Terms--Distributed systems, data communication aspects, Internet applications databases, Web caching, OLAP.
CACHING has emerged as a primary technique for coping
with the high latencies experienced by Internet users.