Document Type: thesis
Author Name: Deschler, Kurt W
URN: etd-0506102-113510
Title: MASS: A Multi-Axis Storage Structure for Large XML Documents
Degree: MS
Department: Computer Science
Advisors: Elke A. Rundensteiner, Advisor; Carolina Ruiz, Reader; Micha Hofri, Department Head
Keywords: XML path expression axis order indexing inlined compression XPath lossless
Date of Presentation/Defense: 2002-04-29
Availability: unrestricted
Due to the wide acceptance of the World Wide Web Consortium (W3C) XPath language specification, native indexing for XML is needed to support path expression queries efficiently. XPath describes the different document tree relationships that may be queried as a set of axes. Many recent proposals for XML indexing focus on accelerating only a small subset of the expressions possible using these axes. In particular, queries by ordinal position and updates that alter document structure are not well supported. A more general indexing solution is needed that not only offers efficient evaluation of all of the XPath axes, but also allows for efficient document update.
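To make the notion of axes concrete, the following minimal sketch classifies one node relative to a context node using preorder and postorder ranks. This is a well-known textbook encoding chosen for illustration only; it is not the node representation used by MASS, and the names NodeRank and axisOf are hypothetical. Attribute and namespace nodes are ignored.

```cpp
#include <string>

// Hypothetical illustration type: the position of a node in a preorder
// and a postorder traversal of the document tree.
struct NodeRank {
    int pre;
    int post;
};

// Name the XPath axis (self, ancestor, descendant, following, preceding)
// on which `other` lies when viewed from `context`.
// Assumption: both ranks come from the same document tree.
std::string axisOf(const NodeRank& context, const NodeRank& other) {
    if (other.pre == context.pre) return "self";
    // An ancestor starts earlier and finishes later than the context node.
    if (other.pre < context.pre && other.post > context.post) return "ancestor";
    // A descendant starts later and finishes earlier.
    if (other.pre > context.pre && other.post < context.post) return "descendant";
    // A following node starts and finishes later.
    if (other.pre > context.pre && other.post > context.post) return "following";
    // The only remaining case: the node starts and finishes earlier.
    return "preceding";
}
```

For the small tree a(b(c, d), e), the preorder ranks are a=1, b=2, c=3, d=4, e=5 and the postorder ranks are c=1, d=2, b=3, e=4, a=5. Seen from b, node a is an ancestor, c is a descendant, and e is a following node.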
We introduce MASS, a Multiple Axis Storage Structure, to meet the performance challenge posed by the XPath language. MASS is a storage and indexing solution for large XML documents that eliminates the need for external secondary storage. It is designed around the XPath language, providing efficient interfaces for evaluating all XPath axes. The clustered organization of MASS allows several different axes to be evaluated using the same index structure. The clustering, in conjunction with an internal compression mechanism that exploits specific XML characteristics, keeps the size of the structure small, which further aids efficiency. MASS introduces a versatile scheme for representing document node relationships that always allows for efficient updates. Finally, the integration of a ranked B+ tree allows MASS to efficiently evaluate XPath axes in large documents.
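The role a ranked tree plays in queries by ordinal position can be sketched as follows. This is a simplified, in-memory, read-only illustration under our own assumptions, not the MASS implementation: each internal node caches the number of entries beneath it, so the entry at position k is found by descending one path rather than scanning. The names RankedNode, buildRanked, and selectByPosition are hypothetical, and the fanout is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node of a count-augmented ("ranked") tree. Leaves hold
// entries in document order; internal nodes cache their subtree size.
struct RankedNode {
    std::vector<std::string> entries;                   // leaves only
    std::vector<std::unique_ptr<RankedNode>> children;  // internal nodes only
    std::size_t count = 0;                              // entries in this subtree
    bool isLeaf() const { return children.empty(); }
};

// Build the tree bottom-up from entries already in document order.
std::unique_ptr<RankedNode> buildRanked(const std::vector<std::string>& items,
                                        std::size_t fanout = 4) {
    assert(fanout >= 2);
    std::vector<std::unique_ptr<RankedNode>> level;
    for (std::size_t i = 0; i < items.size(); i += fanout) {
        auto leaf = std::make_unique<RankedNode>();
        for (std::size_t j = i; j < items.size() && j < i + fanout; ++j)
            leaf->entries.push_back(items[j]);
        leaf->count = leaf->entries.size();
        level.push_back(std::move(leaf));
    }
    if (level.empty()) level.push_back(std::make_unique<RankedNode>());
    // Group nodes under parents until a single root remains.
    while (level.size() > 1) {
        std::vector<std::unique_ptr<RankedNode>> parents;
        for (std::size_t i = 0; i < level.size(); i += fanout) {
            auto parent = std::make_unique<RankedNode>();
            for (std::size_t j = i; j < level.size() && j < i + fanout; ++j) {
                parent->count += level[j]->count;
                parent->children.push_back(std::move(level[j]));
            }
            parents.push_back(std::move(parent));
        }
        level = std::move(parents);
    }
    return std::move(level.front());
}

// Return the entry at 1-based position k (XPath positions start at 1),
// or nullptr if k is out of range.
const std::string* selectByPosition(const RankedNode& root, std::size_t k) {
    if (k == 0 || k > root.count) return nullptr;
    const RankedNode* node = &root;
    while (!node->isLeaf()) {
        for (const auto& child : node->children) {
            if (k <= child->count) { node = child.get(); break; }
            k -= child->count;  // skip this whole subtree
        }
    }
    return &node->entries[k - 1];
}
```

With ten entries and a fanout of 3, selectByPosition skips whole subtrees by their cached counts and touches one node per level. A real ranked B+ tree would additionally keep these counts up to date during inserts and deletes, which this sketch omits.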
We have implemented MASS in C++ and measured the performance of many different XPath expressions and document updates. Our experimental evaluation illustrates that MASS exhibits excellent performance characteristics for both queries and updates and scales well to large documents, making it a practical solution for XML storage. In conjunction with text indexing, MASS provides a complete solution for XML indexing.