Worcester Polytechnic Institute Electronic Theses and Dissertations Collection

Title page for ETD etd-081905-093754


Document Type dissertation
Author Name Liu, Bin
Email Address binliu at cs.wpi.edu
URN etd-081905-093754
Title Scalable Integration View Computation and Maintenance with Parallel, Adaptive and Grouping Techniques
Degree PhD
Department Computer Science
Advisors
  • Elke A. Rundensteiner, Advisor
  • David Finkel, Committee Member
  • Murali Mani, Committee Member
  • Paul Larson, Committee Member
  • Michael Gennert, Department Head
Keywords
  • parallel multi-join computation
  • state level adaptation
  • materialized view maintenance
  • grouping maintenance
  • cyclic join views
  • distributed data sources
Date of Presentation/Defense 2005-08-19
Availability unrestricted

    Abstract

    Materialized integration views constructed by integrating data from multiple distributed data sources help to achieve better access, reliable performance, and high availability for a wide range of applications. In this dissertation, we propose parallel, adaptive, and grouping techniques to address scalability challenges in high-performance integration view computation and maintenance due to increasingly large data sources and high rates of source updates.

    State-of-the-art parallel integration view computation commonly assumes that maximal pipelined parallelism leads to superior performance. We instead propose segmented bushy parallel processing, which combines pipelined parallelism with alternate forms of parallelism to achieve a more effective overall strategy. Experimental studies conducted over a cluster of high-performance PCs confirm that the proposed strategy improves total processing time by 50% on average in comparison to existing solutions.

    Run-time adaptation becomes critical for parallel integration view computation due to its long-running and memory-intensive nature. We investigate two types of state-level adaptation, namely state spill and state relocation, to address run-time memory shortages. We propose lazy-disk and active-disk approaches that integrate both adaptations to maximize run-time query throughput in a memory-constrained environment. We also propose global throughput-oriented state adaptation strategies for computation plans with multiple state-intensive operators. Extensive experiments confirm the effectiveness of our proposed adaptation solutions.

    Once results have been computed and materialized, it is typically more efficient to maintain them incrementally than to recompute them in full. However, state-of-the-art incremental view maintenance techniques require $O(n^2)$ maintenance queries, where $n$ is the number of data sources over which the view is defined. Moreover, they do not exploit view definitions and data source processing capabilities to further improve view maintenance performance. We propose novel grouping maintenance algorithms that dramatically reduce the number of maintenance queries to $O(n)$. We also propose a cost-based view maintenance framework that generates optimized maintenance plans tuned to particular environmental settings. Extensive experimental studies verify the effectiveness of our maintenance algorithms as well as the maintenance framework.

    Files
  • bliu.pdf
