Worcester Polytechnic Institute Electronic Theses and Dissertations Collection

Title page for ETD etd-043012-104639


Document Typethesis
Author NameTrivedi, Shubhendu
URNetd-043012-104639
TitleA Graph Theoretic Clustering Algorithm based on the Regularity Lemma and Strategies to Exploit Clustering for Prediction
DegreeMS
DepartmentComputer Science
Advisors
  • Neil T. Heffernan, Advisor
  • Gabor N. Sarkozy, Advisor
  • Sonia Chernova, Reader
  • Keywords
  • Machine Learning
  • Graph Mining
  • Unsupervised Learning
  • Ensemble Learning
  • Semi-Supervised Learning
  • Regularity Lemma
  • Graph Partitioning
  • Date of Presentation/Defense2012-04-25
    Availability unrestricted

    Abstract

    The fact that clustering is perhaps the most used technique for exploratory data analysis is only a semaphore that underlines its fundamental importance. The general problem statement that broadly describes clustering as the identification and classification of patterns into coherent groups also implicitly indicates it's utility in other tasks such as supervised learning. In the past decade and a half there have been two developments that have altered the landscape of research in clustering: One is improved results by the increased use of graph theoretic techniques such as spectral clustering and the other is the study of clustering with respect to its relevance in semi-supervised learning i.e. using unlabeled data for improving prediction accuracies. In this work an attempt is made to make contributions to both these aspects. Thus our contributions are two-fold: First, we identify some general issues with the spectral clustering framework and while working towards a solution, we introduce a new algorithm which we call "Regularity Clustering" which makes an attempt to harness the power of the Szemeredi Regularity Lemma, a remarkable result from extremal graph theory for the task of clustering. Secondly, we investigate some practical and useful strategies for using clustering unlabeled data in boosting prediction accuracy. For all of these contributions we evaluate our methods against existing ones and also apply these ideas in a number of settings.

    Files
  • trivedi.pdf

  • Browse by Author | Browse by Department | Search all available ETDs

    [WPI] [Library] [Home] [Top]

    Questions? Email etd-questions@wpi.edu
    Maintained by webmaster@wpi.edu