Mathematical Sciences Department PhD Dissertation Defense - Xiaohui Chen "Novel Statistical Methods for Aggregating Correlated and Missing Data with Applications to Chronic Disease Research" (UH 420)

Friday, August 4, 2023
12:00 p.m. to 2:00 p.m.
Location: Unity Hall, Room 420

Mathematical Sciences Department

PhD Dissertation Defense

Xiaohui Chen, PhD Candidate

Friday, August 4, 2023

12:00 pm - 2:00 pm

Unity Hall 420

Title: Novel Statistical Methods for Aggregating Correlated and Missing Data with Applications to Chronic Disease Research

Abstract:

Information-aggregation methods are crucial for identifying risk factors for chronic diseases, analyzing treatment effects, and handling missing data problems. However, significant gaps persist in the literature, including the lack of signal-adaptive methods for summary statistics, inadequate study of the correlation-robustness properties of hypothesis-testing methods, and insufficient methods for large missing rates. In this dissertation, we aim to address these gaps and advance relevant statistical methodology.

In the first part, we propose a new signal-adaptive analysis pipeline to address unknown signal patterns using the omnibus thresholding Fisher’s method (oTFisher). The oTFisher remains robustly powerful over various patterns of genetic effects. Its adaptive thresholding can be applied to identify important single nucleotide polymorphisms (SNPs) contributing to the overall significance of the given SNP set. Efficient calculation algorithms are developed to control the type I error rate while accounting for the linkage disequilibrium among SNPs. Extensive simulations show that the oTFisher has robustly high power and provides higher balanced accuracy in screening SNPs than the traditional Bonferroni and FDR procedures. We apply the oTFisher to study the genetic association of genes and haplotype blocks with bone density-related traits using the GWAS summary data of the Genetic Factors for Osteoporosis Consortium. The oTFisher identifies more novel and literature-reported genetic factors than existing p-value combination methods.
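
For readers unfamiliar with thresholding-based p-value combination, the idea can be sketched as follows. This is only an illustration under stated assumptions, not the dissertation's algorithm: it assumes the thresholding Fisher statistic has the form W = sum over p_i <= tau1 of -2 log(p_i / tau2), and it calibrates the statistic by Monte Carlo under independent null p-values, whereas the abstract describes efficient algorithms that account for linkage disequilibrium. The function names `tfisher_stat` and `tfisher_pvalue_mc` are hypothetical.

```python
import numpy as np


def tfisher_stat(pvals, tau1, tau2):
    """Assumed thresholding Fisher statistic.

    Sums -2*log(p / tau2) over the p-values that satisfy p <= tau1.
    With tau1 = tau2 = 1 this reduces to Fisher's combination statistic.
    """
    p = np.asarray(pvals, dtype=float)
    keep = p <= tau1
    return float(np.sum(-2.0 * np.log(p[keep] / tau2)))


def tfisher_pvalue_mc(pvals, tau1, tau2, n_sim=20000, seed=0):
    """Monte Carlo p-value, ASSUMING independent uniform null p-values.

    Correlated inputs (e.g. SNPs in linkage disequilibrium) would need a
    null distribution that reflects the correlation; that is not done here.
    """
    rng = np.random.default_rng(seed)
    n = len(pvals)
    observed = tfisher_stat(pvals, tau1, tau2)
    # 1 - U lies in (0, 1], so the logarithm below is always finite.
    null_p = 1.0 - rng.random((n_sim, n))
    contrib = np.where(null_p <= tau1, -2.0 * np.log(null_p / tau2), 0.0)
    null_stats = contrib.sum(axis=1)
    # Add-one correction keeps the estimated p-value strictly positive.
    return float((1 + np.sum(null_stats >= observed)) / (n_sim + 1))


if __name__ == "__main__":
    example = [1e-6, 0.3, 0.7, 0.9]
    print(tfisher_stat(example, 0.05, 0.05))
    print(tfisher_pvalue_mc(example, 0.05, 0.05))
```

An omnibus version would evaluate such a p-value over a grid of thresholds and take the minimum, which then needs its own null calibration; that step is omitted from this sketch.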

Next, we provide theoretical analyses examining the correlation-robustness properties of hypothesis-testing methods in analyzing correlated data. We focus specifically on two classical tests: the minimum p-value (minP) test and the Simes test. Our investigation examines the tail probabilities of the minP and the Simes tests under the Gaussian mean model with an arbitrary correlation matrix. Our study reveals that both tests are asymptotically robust to any non-perfect correlations. These findings have significant practical implications, particularly for calculating extreme tail probabilities, as in scenarios requiring stringent type I error control in large-scale data analysis. Approximating these tail probabilities by their values under independence could significantly expedite computation for large datasets.
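
As a point of reference, the independence-based calculations mentioned above are simple closed forms. The sketch below shows the standard minP and Simes p-values computed as if the input p-values were independent; the function names are hypothetical, and the code illustrates the textbook definitions rather than the dissertation's theoretical results.

```python
import numpy as np


def minp_pvalue_indep(pvals):
    """minP test p-value, ASSUMING n independent uniform null p-values.

    Equals 1 - (1 - min p)^n, written with log1p/expm1 so that extreme
    tail values are not lost to floating-point cancellation.
    """
    p = np.asarray(pvals, dtype=float)
    n = p.size
    return float(-np.expm1(n * np.log1p(-p.min())))


def simes_pvalue(pvals):
    """Simes test p-value: the minimum over i of n * p_(i) / i,
    where p_(1) <= ... <= p_(n) are the sorted p-values."""
    p = np.sort(np.asarray(pvals, dtype=float))
    n = p.size
    ranks = np.arange(1, n + 1)
    return float(np.min(n * p / ranks))


if __name__ == "__main__":
    example = [0.01, 0.04, 0.5]
    print(minp_pvalue_indep(example))
    print(simes_pvalue(example))
```

Both functions run in at most O(n log n) time and need no correlation matrix, which is where the computational saving for large datasets would come from.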

In the third part of this research, we study missing data problems with high missing rates across different time points in pulmonary arterial hypertension. The COVID-19 pandemic introduced new challenges, such as high missing rates and unverifiable missingness assumptions, that affect the measurement of drug effects. Multiple imputation methods are systematically compared in a simulation study to address the high missing rate issue using remotely collected data (e.g., actigraphy data). Four missingness scenarios are considered in the simulation: missing at random, missingness due to adverse events, missingness due to lack of efficacy, and a mixture of these. We demonstrate that traditional parametric methods in the Bayesian framework have a high relative bias at a 40% missing rate. However, adding remotely available data related to the primary outcome and imputing the missing values according to the best guess of the reasons for missingness can lead to smaller relative biases.

Dissertation Committee:

Dr. Zheyang Wu, WPI (Advisor)

Dr. Qingshuo Song, WPI

Dr. Fangfang Wang, WPI

Dr. Dali Zhou, U.S. Food and Drug Administration

Dr. Jian Zou, WPI

Department(s): Mathematical Sciences