Walter's research centers on trustworthy AI: removing harmful behaviors and inaccurate correlations from models, making models fairer, improving AI safety, and developing robust measures of confidence and uncertainty to determine when a model's prediction can be trusted. He focuses in particular on efficient, "minimally invasive" techniques that improve the robustness of models without harming their generalizability. His work also spans multi-modal foundation models, generative AI, and applications of machine learning and AI to healthcare. It has been featured in top academic venues such as NeurIPS, ICLR, AAAI, ACL, and CIKM. Walter has served as an organizer for the Time Series for Health workshop, a Senior Area Chair for the Conference on Health, Inference, and Learning (CHIL), and an Area Chair for NeurIPS. Before joining WPI's Computer Science Department, Walter was a postdoctoral assistant in the HealthyML lab at MIT. He received a PhD in Data Science from Worcester Polytechnic Institute in 2023.
Scholarly Work
Select Publications
Learning under Temporal Label Noise.
Sujay Nagaraj, Walter Gerych, Sana Tonekaboni, Anna Goldenberg, Berk Ustun, Thomas Hartvigsen. ICLR, 2025.
Mapping from Meaning: Addressing the Miscalibration of Prompt-Sensitive Language Models.
Kyle Cox, Jiawei Xu, Yikun Han, Rong Xu, Tianhao Li, Chi-Yang Hsu, Tianlong Chen, Walter Gerych, Ying Ding. AAAI, 2025.
The Surprising Effectiveness of Infinite-Width NTKs for Characterizing and Improving Model Training.
Joshua DeOliveira, Walter Gerych, Elke Rundensteiner. AAAI, 2025.
BENDVLM: Test-Time Debiasing of Vision-Language Embeddings.
Walter Gerych, Haoran Zhang, Kimia Hamidieh, Eileen Pan, Maanas Sharma, Thomas Hartvigsen, Marzyeh Ghassemi. NeurIPS, 2024.
Identifying Implicit Social Biases in Vision-Language Models.
Kimia Hamidieh, Haoran Zhang, Walter Gerych, Thomas Hartvigsen, Marzyeh Ghassemi. AIES, 2024.
TAXI: Evaluating Categorical Knowledge Editing for Language Models.
Derek Powell, Walter Gerych, Thomas Hartvigsen. ACL, 2024.
Who Knows the Answer? Finding the Best Model & Prompt Using Confidence-Based Search.
Walter Gerych, Yara Rizk, Vatche Isahagian, Vinod Muthusamy, Evelyn Duesterwald, Praveen Venkateswaran.
AAAI, 2024.
Amalgamating Multi-Task Models with Heterogeneous Architectures.
Jidapa Thadajarassiri, Walter Gerych, Xiangnan Kong, Elke Rundensteiner. AAAI, 2024.
Debiasing Pretrained Generative Models By Uniformly Sampling Semantic Attributes.
Walter Gerych, Kevin Hickey, Luke Buquicchio, Kavin Chandrasekaran, Abdulaziz Alajaji, Emmanuel Agu, Elke
Rundensteiner. NeurIPS, 2023.
Stabilizing Adversarial Training for Generative Networks.
Walter Gerych*, Kevin Hickey* (Joint First Author), Thomas Hartvigsen, Luke Buquicchio, Abdulaziz Alajaji, Kavin
Chandrasekaran, Hamid Mansoor, Emmanuel Agu, Elke Rundensteiner. IEEE Big Data MLDB, 2023.
Population-Level Visual Analytics of Smartphone Sensed Health Using Community Phenotypes.
Hamid Mansoor, Walter Gerych, Abdulaziz Alajaji, Luke Buquicchio, Kavin Chandrasekaran, Emmanuel Agu, Elke
Rundensteiner. IEEE ICHI, 2023.
Adversarial Human Context Recognition: Evasion Attacks and Defenses.
Abdulaziz Alajaji, Walter Gerych, Kavin Chandrasekaran, Luke Buquicchio, Emmanuel Agu, Elke Rundensteiner.
IEEE COMPSAC, 2023.
Knowledge Amalgamation for Multi-Label Classification via Label Dependency Transfer.
Jidapa Thadajarassiri, Thomas Hartvigsen, Walter Gerych, Xiangnan Kong, Elke Rundensteiner. AAAI, 2023.
Domain Adaptation Methods for Lab-to-Field Human Context Recognition.
Abdulaziz Alajaji, Walter Gerych, Luke Buquicchio, Kavin Chandrasekaran, Hamid Mansoor, Emmanuel Agu, Elke
Rundensteiner. Sensors 23(6), 2023.
INPHOVIS: Interactive Visual Analytics For Smartphone-Based Digital Phenotyping.
Hamid Mansoor, Walter Gerych, Abdulaziz Alajaji, Luke Buquicchio, Kavin Chandrasekaran, Emmanuel Agu, Elke
Rundensteiner, Angela Incollingo Rodriguez. Visual Informatics, 2023.
HAR-CTGAN: A Mobile Sensor Data Generation Tool for Human Activity Recognition.
Joshua DeOliveira, Walter Gerych, Aruzhan Koshkarova, Elke Rundensteiner, Emmanuel Agu. IEEE Big Data 4th
Special Session on HealthCare Data, 2022.
Text Generation to Aid Depression Detection: A Comparative Study of Conditional Sequence GANs.
ML Tlachac, Walter Gerych, Kratika Agrawal, Benjamin Litterer, Nicholas Jurovich, Saitheeraj Thatigotla, Jidapa
Thadajarassiri, Elke Rundensteiner. IEEE Big Data 4th Special Session on HealthCare Data, 2022.
Positive Unlabeled Learning with a Sequential Selection Bias.
Walter Gerych, Thomas Hartvigsen, Luke Buquicchio, Emmanuel Agu, Elke Rundensteiner. SDM, 2022.
Robust Recurrent Classifier Chains for Multi-Label Learning with Missing Labels.
Walter Gerych, Thomas Hartvigsen, Luke Buquicchio, Emmanuel Agu, Elke Rundensteiner. CIKM, 2022.
Stop&Hop: Early Classification of Irregular Time Series.
Thomas Hartvigsen, Walter Gerych, Jidapa Thadajarassiri, Xiangnan Kong, Elke Rundensteiner. CIKM, 2022.
Recovering The Propensity Score From Biased Positive Unlabeled Data.
Walter Gerych, Thomas Hartvigsen, Luke Buquicchio, Emmanuel Agu, Elke Rundensteiner. AAAI, 2022. Oral Spotlight.
On Detecting COVID-Risky Behavior from Smartphones.
Thomas Hartvigsen*, Walter Gerych* (Joint First Author), Marzyeh Ghassemi. Workshop on Epidemiology meets
Data Mining and Knowledge Discovery, KDD, 2022.
Triplet-based Domain Adaptation (Triple-DARE) for Lab-to-Field Human Context Recognition.
Abdulaziz Alajaji, Walter Gerych, Luke Buquicchio, Kavin Chandrasekaran, Hamid Mansoor, Emmanuel Agu, Elke
Rundensteiner. IEEE PerCom Industry Track, 2022.
Recurrent Bayesian Classifier Chains for Exact Multi-Label Classification.
Walter Gerych, Tom Hartvigsen, Luke Buquicchio, Emmanuel Agu, Elke Rundensteiner. NeurIPS, 2021.
GAN For Generating User-Specific Human Activity Data From An Incomplete Training Corpus.
Walter Gerych, Harrison Kim, Joshua DeOliveira, MaryClare Martin, Luke Buquicchio, Kavin Chandrasekaran,
Abdulaziz Alajaji, Hamid Mansoor, Emmanuel Agu, Elke Rundensteiner. IEEE Big Data 4th Special Session on
HealthCare Data, 2021.
Variational Open Set Recognition.
Luke Buquicchio, Walter Gerych, Kavin Chandrasekaran, Abdulaziz Alajaji, Hamid Mansoor, Thomas Hartvigsen,
Elke Rundensteiner, Emmanuel Agu. IEEE Big Data, 2021.
Local Geometry Preserving Deep Networks For Featurizing High-Dimensional Datasets.
Walter Gerych, Jessica Bader, Declan Nelson, Thalia Chai-Zhang, Luke Buquicchio, Abdulaziz Alajaji, Kevin Chandrasekaran, Emmanuel Agu, Elke Rundensteiner. IEEE ICMLA, 2021.
Few-Shot Classification for Human Context Recognition Using Smartphone Data Traces.
Luke Buquicchio, Walter Gerych, Abdulaziz Alajaji, Kavin Chandrasekaran, Hamid Mansoor, Emmanuel Agu, Elke
Rundensteiner. IEEE ICMLA, 2021.
Visual Analytics of Smartphone-Sensed Human Behavior and Health.
Hamid Mansoor, Walter Gerych, Abdulaziz Alajaji, Luke Buquicchio, Kavin Chandrasekaran, Emmanuel Agu, Elke
Rundensteiner. IEEE Computer Graphics and Applications, 2021.
Smartphone Health Biomarkers: Positive Unlabeled Learning of In-the-Wild Contexts.
Abdulaziz Alajaji, Walter Gerych, Luke Buquicchio, Kavin Chandrasekaran, Emmanuel Agu, Elke Rundensteiner.
Pervasive Computing, 2021.
Measuring Group Advantage: A Comparative Study of Fair Ranking Metrics.
Caitlin Kuhlman*, Walter Gerych* (Joint First Author), Elke A. Rundensteiner. AIES, 2021.
PLEADES: Population Level Observation of Smartphone Sensed Symptoms for In-the-Wild Data.
Hamid Mansoor, Walter Gerych, Abdulaziz Alajaji, Luke Buquicchio, Kavin Chandrasekaran, Emmanuel Agu, Elke
Rundensteiner. VISIGRAPP, 2021.
Complex Activity Recognition Using Topic Models for Feature Generation from Wearable Sensor Data.
Kavin Chandrasekaran, Walter Gerych, Luke Buquicchio, Abdulaziz Alajaji, Emmanuel Agu, Elke Rundensteiner.
SMARTCOMP, 2021.
BurstPU: Classification of Weakly Labeled Datasets with Sequential Bias.
Walter Gerych, Luke Buquicchio, Kavin Chandrasekaran, Abdulaziz Alajaji, Hamid Mansoor, Aidan Murphy, Emmanuel Agu, Elke Rundensteiner. IEEE Big Data, 2020.
INTOSIS: Interactive Observation of Smartphone Inferred Symptoms for In-The-Wild Data.
Hamid Mansoor, Walter Gerych, Luke Buquicchio, Abdulaziz Alajaji, Kavin Chandrasekaran, Emmanuel Agu, Elke
Rundensteiner. IEEE Big Data, 2020.
DeepContext: Parameterized Compatibility-Based Attention CNN for Human Context Recognition.
Abdulaziz Alajaji, Walter Gerych, Kavin Chandrasekaran, Luke Buquicchio, Emmanuel Agu, Elke Rundensteiner.
ICSC, 2020.
ARGUS: Interactive Visual Analytics Framework for the Discovery of Disruptions in Bio-Behavioral Rhythms.
Hamid Mansoor, Walter Gerych, Luke Buquicchio, Kavin Chandrasekaran, Emmanuel Agu, Elke Rundensteiner.
EuroVis (Short Papers), 2020.
COMEX: Identifying Mislabeled Human Behavioral Context Data Using Visual Analytics.
Hamid Mansoor, Walter Gerych, Luke Buquicchio, Kavin Chandrasekaran, Emmanuel Agu, Elke Rundensteiner.
COMPSAC, 2019.
Classifying Depression in Imbalanced Datasets Using Autoencoder-Based Anomaly Detection.
Walter Gerych, Emmanuel Agu, Elke Rundensteiner. ICSC, 2019.