Data Science Ph.D. Dissertation Proposal | Title: Probabilistic Topic Models for Complex Activity Recognition and Generation | Kavin Chandrasekaran

Friday, December 02, 2022
2:00 pm


Kavin Chandrasekaran
Ph.D. Dissertation Proposal

Friday, December 2, 2022
2:00 p.m. - 4:00 p.m.

Location: UH343

Advisor: Dr. Emmanuel Agu, Professor, WPI
Co-Advisor: Dr. Elke Rundensteiner, Professor, WPI
Committee Member: Dr. Nima Kordzadeh, Assistant Professor, WPI
External Member: Dr. Victor Robles, Applied Scientist in Machine Learning, Signify

Title: Probabilistic Topic Models for Complex Activity Recognition and Generation


Human activity recognition is important in many critical healthcare domains, including patient health monitoring, rehabilitation monitoring, elder care, preventive care, and mental illness monitoring and intervention. Three main types of ambulatory patient activity need to be monitored: simple activities (standing, walking), activity transitions (sit to stand), and complex activities (going to the bathroom, cooking). Since prior work has extensively studied the recognition and monitoring of simple activities, this dissertation will focus on passive activity transition recognition and complex activity recognition using smartphone sensor data.

Monitoring the duration of transitions between physical activities, such as sitting to standing, can provide valuable clues for assessing the mobility, fall risk, and overall health of a patient. A change in a patient's daily activity routines can indicate a change in their health. Passively recognizing the types of complex activities of daily living performed by a patient from smartphone sensor data can provide insight into their physical and mental health. Transition detection is a challenging problem because transitions typically span only a few seconds, which yields relatively few data samples for training machine learning models. The major challenge in complex activity recognition is that the way a complex activity is performed varies considerably. The simple activities that make up a complex activity can occur concurrently, be interleaved, and appear in a different order each time the same complex activity is performed, which presents a challenge to machine learning classifiers. For instance, in cooking, the number and order of constituent simple activities can vary with the dish being prepared. Another challenge in complex activity recognition using machine learning is the scarcity of publicly available labeled datasets.

Thus far, two models have been built, one for activity transition recognition and one for complex activity recognition. For transition recognition, a two-stage approach was employed: transitions are first detected using a bi-directional gated recurrent unit layer with attention and are then classified using a rule-based classifier. To address the variability in complex activities, a topic model was used to capture latent representations of the complex activities, and the resulting topic features were then used for complex activity classification. In rigorous evaluation, both proposed methods outperformed baselines, including state-of-the-art models.
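
As a rough illustration of why a topic model can absorb variation in ordering: topic models such as LDA typically operate on bag-of-words counts, which discard order. The sketch below assumes that simple-activity labels play the role of words; the announcement does not specify the actual input representation, and the vocabulary, the example sequences, and the function name `bag_of_activities` are all hypothetical.

```python
from collections import Counter

# Hypothetical vocabulary of simple activities (illustrative only,
# not taken from the proposal).
VOCAB = ["stand", "walk", "sit", "reach", "stir", "wash_hands"]


def bag_of_activities(sequence, vocab=VOCAB):
    """Count how often each simple activity occurs, ignoring order."""
    counts = Counter(sequence)
    return [counts[activity] for activity in vocab]


# Two hypothetical performances of the same complex activity ("cooking")
# whose constituent simple activities occur in a different order.
cooking_a = ["walk", "stand", "reach", "stir", "stand", "stir"]
cooking_b = ["stand", "stir", "walk", "stir", "reach", "stand"]

vec_a = bag_of_activities(cooking_a)
vec_b = bag_of_activities(cooking_b)

print(vec_a)           # count vector over VOCAB
print(vec_a == vec_b)  # same counts despite different ordering
```

Under this assumption, both performances map to the same count vector, so any features a topic model derives from those counts would not depend on the order in which the simple activities occurred.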

To complete this dissertation, two novel neural network models are being proposed to improve complex activity recognition performance by i) using an enhanced topic modeling approach and ii) augmenting the dataset with synthetic data. The newly proposed complex activity recognition model will utilize a tree-structured neural topic modeling approach. Complex activities are hierarchical in nature, a structure that a tree-structured topic model can represent better. To generate synthetic complex activity data, two Generative Adversarial Network (GAN) based models that utilize topic models are proposed. The first model will generate the simple activities that make up the complex activity and their corresponding durations. The second model will generate actual sensor data. The proposed complex activity recognition and synthetic data generation models will be evaluated and compared against state-of-the-art baseline models.