Worcester Polytechnic Institute Electronic Theses and Dissertations Collection

Title page for ETD etd-122005-193617


Document Type dissertation
Author Name Chen, Songting
Email Address chenst at cs.wpi.edu
URN etd-122005-193617
Title Efficient Incremental View Maintenance for Data Warehousing
Degree PhD
Department Computer Science
Advisors
  • Elke A. Rundensteiner, Advisor
  • Carolina Ruiz, Committee Member
  • Murali Mani, Committee Member
  • Latha S. Colby, Committee Member
Keywords
  • View Matching
  • View Maintenance
  • Materialized View
  • Data Warehouse
  • Information Integration
Date of Presentation/Defense 2005-09-06
Availability unrestricted

Abstract

    Data warehousing and on-line analytical processing (OLAP) are essential
    elements of decision support applications. Since most OLAP queries are
    complex and are often executed over huge volumes of data, the solution in
    practice is to employ materialized views to improve query performance.
    An important issue in using materialized views is maintaining view
    consistency when the sources change. However, most prior work has focused
    on simple SQL views with distributive aggregate functions, such as SUM
    and COUNT.

    This dissertation considers broader classes of views than previous
    work. First, we study views with complex aggregate functions such as variance
    and regression. Such statistical functions are of great importance in practice.
    We propose a workarea function model and design a generic framework for both
    incremental view maintenance and answering queries using views for such functions.
    We have implemented this approach in an IBM DB2 prototype system. An
    extensive performance study shows significant performance gains from our techniques.
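
    To illustrate why a function such as variance can be maintained
    incrementally, here is a minimal sketch. It assumes a workarea that holds
    a count, a sum, and a sum of squares per group; the class
    `VarianceWorkarea` and its methods are hypothetical names chosen for this
    illustration and are not taken from the dissertation or from DB2.

```python
from dataclasses import dataclass


@dataclass
class VarianceWorkarea:
    """Hypothetical per-group state: count, sum, and sum of squares.

    Each component is distributive, so source inserts and deletes can be
    applied without rescanning the base data.
    """
    n: int = 0
    s: float = 0.0
    ss: float = 0.0

    def insert(self, x: float) -> None:
        # Propagate one inserted source value into the workarea.
        self.n += 1
        self.s += x
        self.ss += x * x

    def delete(self, x: float) -> None:
        # Propagate one deleted source value out of the workarea.
        self.n -= 1
        self.s -= x
        self.ss -= x * x

    def merge(self, other: "VarianceWorkarea") -> "VarianceWorkarea":
        # Combine two finer-grained groups, e.g. to answer a coarser
        # query from a materialized view.
        return VarianceWorkarea(self.n + other.n, self.s + other.s,
                                self.ss + other.ss)

    def variance(self):
        # Population variance derived from the workarea; None if empty.
        if self.n == 0:
            return None
        mean = self.s / self.n
        return self.ss / self.n - mean * mean
```

    The variance itself is never stored as the maintained state; it is
    derived from the workarea on demand. The sum-of-squares form can lose
    precision when the variance is small relative to the mean, so a
    production system might prefer a numerically more stable representation.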

    Second, we consider materialized views with PIVOT and UNPIVOT operators.
    Such operators are widely used for OLAP applications and for querying
    sparse datasets. We demonstrate that the efficient maintenance of views
    with PIVOT and UNPIVOT operators requires more generalized
    operators, called GPIVOT and GUNPIVOT. We formally define and prove the
    query rewriting rules and propagation rules for such
    operators. We also design a novel view maintenance framework for applying
    these rules to obtain an efficient maintenance plan. Extensive
    performance evaluations confirm the effectiveness of our techniques.
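
    As a rough illustration of what maintaining a pivoted view involves, the
    sketch below pivots (key, attribute, value) rows into one wide row per key
    and then applies a source delta directly to the view. It covers only a
    plain pivot, not the GPIVOT and GUNPIVOT operators defined in the
    dissertation. The function names are hypothetical, and the sketch assumes
    that each (key, attribute) pair occurs at most once in the source and that
    source values are never NULL.

```python
def pivot(rows, attrs):
    """Build the pivoted view from scratch: key -> {attr: value or None}."""
    view = {}
    for key, attr, value in rows:
        if attr in attrs:
            view.setdefault(key, dict.fromkeys(attrs))[attr] = value
    return view


def apply_delta(view, attrs, inserted=(), deleted=()):
    """Propagate source inserts and deletes into the pivoted view in place."""
    for key, attr, _ in deleted:
        row = view.get(key)
        if row is not None and attr in attrs:
            row[attr] = None
            # A key with no remaining attribute values disappears from
            # the view entirely.
            if all(v is None for v in row.values()):
                del view[key]
    for key, attr, value in inserted:
        if attr in attrs:
            view.setdefault(key, dict.fromkeys(attrs))[attr] = value
    return view
```

    A source insert may fill a cell of an existing view row or create a new
    row, and a source delete may empty a cell or remove a whole row. The
    incrementally maintained view should equal the view recomputed from the
    updated source.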

    Third, materialized views are often integrated from multiple data sources.
    Due to source autonomy and dynamicity, concurrent source updates may occur
    during view maintenance and cause maintenance anomalies. We propose a
    generic concurrency control framework to resolve such anomalies.
    This solution extends previous work in that it resolves the anomalies under
    both source data changes and source schema changes and thus achieves full
    source autonomy. We have implemented this technique in a data warehouse
    prototype developed at WPI. An extensive performance study shows that
    our techniques add little overhead to existing concurrent data
    update processing techniques while allowing for this new functionality.

Files
  • schen.pdf
