Computer Science Department, PhD Research Qualifier. Qian Wang, "Bridging Technical Analysis and Scientometric Insights in Biomedical Funding with HGNNs Enhanced Calibration"

Monday, February 17, 2025
11:30 am to 12:30 pm

 

Qian Wang

PhD Student

WPI – Computer Science

 

Zoom: https://wpi.zoom.us/j/5222416480

 

Committee Members:

Advisor: Prof. Xiaozhong Liu

Reader: Prof. Xiangnan Kong 

Abstract:

The concentration of research funding in a small segment of investigators and institutions (i.e., “the biomedical elite”) has been well established. Moreover, funding inequality persists despite deliberate efforts by funding agencies to counter it. To gain a deeper understanding of how social capital might be driving this phenomenon, we collected publicly available data on the transition of over 11,000 National Institutes of Health Mentored Career Awardees (K01, K08, K23) from the time of their K award until their first R01-equivalent award (if any). Using data from PubMed and other publicly available sources, we constructed a “heterogeneous scholarly graph” to represent the time-varying relationship between MK awardees, the quality and quantity of their work over their careers, their social capital (ties to influential people and institutions), and their ultimate success in obtaining R01-equivalent funding. We formulated and tested predictors of ‘K to R’ success in a graph neural network (GNN) model. In this paper we describe a novel process, called Heterogeneous Graph Calibration GNN (HGCGNN), for calibrating a GNN model, i.e., aligning model predictions with observed outcomes, in order to improve its predictive accuracy. After assigning a measure of quality to each node (i.e., the scholarly objects of interest, such as published articles, journals, institutional affiliations, and coauthors), we derived a feature subgraph for each node. Next, the quality and subgraphs of all neighboring nodes were concatenated to the target node in order to calibrate the prediction of whether a ‘K to R’ transition exists.

Additionally, in the subgraph construction phase, we considered the influence of highly diverse neighbors on quality, calculating ‘K to R’ prediction accuracy and node subgraph feature uniqueness to enhance subgraph representation. Our model simultaneously maintains accuracy and data efficiency. We conducted empirical experiments to validate the effectiveness of our model, demonstrating that it consistently achieves state-of-the-art calibration results across various graph datasets under different GNN backbones. Thus, the GNN confidence calibration improved the accuracy of our K to R prediction model. This will facilitate research to better understand the role social capital plays in the distribution of NIH funding.
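
For readers unfamiliar with confidence calibration: the abstract does not say which metric is used, but a common way to quantify how well predicted confidences align with observed outcomes is the expected calibration error (ECE). The following is a minimal sketch of that metric only, assuming binary ‘K to R’ outcomes; the function name, the equal-width binning, and the toy inputs are illustrative assumptions and are not taken from HGCGNN.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: predicted probability of the predicted class, in [0, 1]
    correct:     1 if the prediction matched the observed outcome, else 0
    (Hypothetical helper for illustration; not the method from the talk.)
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)

    # Equal-width bins over [0, 1]; interior edges only, right-closed bins.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(confidences, edges[1:-1], right=True)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap  # weight by fraction of samples in bin
    return ece

# Toy usage: four predictions at 0.9 confidence, three of them correct.
ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
```

A lower ECE means the model's stated confidence tracks its observed accuracy more closely, which is the sense in which a calibrated model "aligns predictions with observed outcomes."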

Audience(s)

Department(s):

Computer Science