Worcester Polytechnic Institute Electronic Theses and Dissertations Collection

Title page for ETD etd-081606-083026

Document Type: thesis
Author Name: Hayward, John T
Title: Mining Oncology Data: Knowledge Discovery in Clinical Performance of Cancer Patients
Department: Computer Science
  • Professor Carolina Ruiz, Advisor
  • Professor Murali Mani, Reader
  • Professor Michael Gennert, Department Head
Keywords:
  • Clinical Performance
  • Databases
  • Cancer
  • oncology
  • Knowledge Discovery in Databases
  • data mining
Date of Presentation/Defense: 2006-08-24
Availability: unrestricted


    Our goal in this research is twofold: to develop clinical performance databases of cancer patients, and to conduct data mining and machine learning studies on the collected patient records. We use these studies to develop models for predicting the medical outcomes of cancer patients. The clinical database is developed in conjunction with surgeons and oncologists at UMass Memorial Hospital. Aspects of the database design and the representation of patient narrative are discussed here. Current predictive model design in the medical literature is dominated by linear and logistic regression techniques. We seek to show that novel machine learning methods can perform as well as or better than these traditional techniques.

    Our machine learning focus for this thesis is on pancreatic cancer patients. Classification and regression prediction targets include patient survival, wellbeing scores, and disease characteristics. Information research in oncology is often constrained by type variation, missing attributes, high dimensionality, skewed class distribution, and small data sets. We compensate for these difficulties using preprocessing, meta-learning, and other algorithmic methods during data analysis. We present the predictive accuracy and regression error of various machine learning models as results, along with t-tests comparing these to the accuracy of traditional regression methods. In most cases, the novel machine learning prediction methods offer comparable or superior performance. We conclude with an analysis of results and a discussion of future research possibilities.
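
    As a rough illustration of the kind of t-test comparison mentioned above, the following minimal sketch computes a paired t statistic between two sets of matched accuracy scores. It is not taken from the thesis: the pairing of scores (for example, per cross-validation fold), the function name paired_t_statistic, and the accuracy values are all assumptions made for illustration.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired t-test on matched scores."""
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    n = len(scores_a)
    if n < 2:
        raise ValueError("need at least two paired scores")
    # Work on the per-pair differences, as a paired test requires.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical matched accuracies, invented for illustration only;
# these are NOT results reported in the thesis.
ml_accuracy = [0.72, 0.68, 0.75, 0.70, 0.74]
regression_accuracy = [0.69, 0.66, 0.71, 0.70, 0.70]

t_stat, dof = paired_t_statistic(ml_accuracy, regression_accuracy)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof}")
```

    The resulting statistic would then be compared against a critical value of the t distribution with the reported degrees of freedom. With the small data sets the abstract describes, such a test has limited power, so a non-significant difference is consistent with "comparable" rather than proof of equal performance.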

  • jhayward.pdf
