WPI - Computer Science Department, PhD Defense, Apiwat Ditthapron: "Mobile Paralinguistic Health Assessment from Speech: Energy-Efficient and Privacy-Preserving Neural Network Models"

Wednesday, April 17, 2024
3:00 pm to 4:00 pm

Apiwat Ditthapron

PhD Candidate

WPI – Computer Science Department

 

Committee members:

Advisor: Prof. Emmanuel Agu, WPI – Computer Science Department

Co-advisor: Prof. Adam Lammert, WPI – Biomedical Engineering Department

Prof. Elke Rundensteiner, WPI – Computer Science Department

External member: Dr. Thomas Quatieri, MIT Lincoln Laboratory

 

Date: Wednesday, April 17th, 2024

Time: 3:00 p.m. – 4:00 p.m.

Location: Fuller Labs 141

 

 Abstract: 

Speech is an effective biomarker for evaluating neurological disorders such as Traumatic Brain Injury (TBI, 2% of the population) and depression (8.4% of the population). To alleviate healthcare burdens and reduce rehospitalization, passive speech monitoring through mobile devices offers a promising approach at scale, requiring minimal subject involvement compared to traditional methods that require active engagement and clinic visits. This dissertation addresses three major challenges in employing Deep Neural Networks (DNNs) for continuous paralinguistic health assessment on smartphones: energy efficiency, adverse recording environments, and speaker privacy.

DNNs are a powerful method for speech analysis, but they consume significant energy. To improve the energy efficiency of DNNs, a novel masking kernel is proposed that learns the most energy-efficient length and sampling rate of the speech sample. Addressing the challenge of diverse recording environments, particularly the crowded spaces that patients may visit, requires isolating the target speaker's speech from other people's speech and crosstalk. We propose Target Speaker Isolation, which utilizes an N-vector speaker representation trained to follow an unbounded normal distribution for each speaker cluster. Furthermore, ensuring speaker privacy, including protecting biometric and linguistic content from unauthorized access, is crucial. An adversarial pruning technique is proposed for extracting privacy-preserving speech features on smartphones.

 

 

Audience(s)

DEPARTMENT(S):

Computer Science