Computer Science Department, PhD Dissertation Defense, Yizhou Yan, "Contextual Outlier Detection from Heterogeneous Data Sources"

Thursday, March 26, 2020
9:00 am to 10:00 am
Floor/Room #: Taylor Conference Room

Yizhou Yan

PhD candidate

WPI - Computer Science

Location: Rubin Campus Center, Taylor Conference Room

Committee members:
Prof. Elke A. Rundensteiner (advisor, WPI-Computer Science)

Prof. Samuel Madden (co-advisor, MIT-CSAIL)

Prof. Mohamed Y. Eltabakh (WPI-Computer Science)

Prof. Xiangnan Kong (WPI-Computer Science)

This dissertation focuses on detecting contextual outliers from heterogeneous data sources. Modern sensor-based applications generate huge amounts of heterogeneous data. Detecting outliers in such data is critical for diagnosing and fixing malfunctioning systems, preventing cyber attacks, and even saving human lives. The outlier detection techniques in the literature typically rely on the probability density at each point to detect outliers. In many cases, however, the outlierness of an object must take into account the context in which the object occurs. Within this scope, my dissertation focuses on four research innovations: techniques and systems for scalable contextual outlier detection from multi-dimensional data points, for detecting contextual outlier patterns in sequence data, and for identifying contextual outliers in image data sets, and lastly an integrative end-to-end outlier detection system capable of automatic outlier detection, outlier summarization, and outlier explanation.
First, I design novel distributed contextual outlier detection strategies that optimize the key factors determining the efficiency of distributed data analytics, namely communication costs and load balancing. Second, I define new context-aware pattern mining semantics and develop efficient mining strategies to support them. I also develop methodologies that continuously extract outlier patterns from sequence streams. Third, I propose a novel Unknown-aware Deep Neural Network (UDN for short) to detect contextual image outliers. Finally, I design the first end-to-end outlier detection service that integrates outlier-related services, including automatic outlier detection, outlier summarization and explanation, and human-guided outlier detection refinement, within one integrated outlier discovery paradigm. Experimental studies, including performance evaluations and user studies conducted on benchmark outlier detection datasets and real-world datasets, confirm both the effectiveness and the efficiency of the proposed approaches and systems.