Data Science | MS Thesis | Sirshendu Ganguly | Worcester Polytechnic Institute

Data Science

MS Thesis

Sirshendu Ganguly

April 24, 2023

9:30am – 10:30am

Beckett Conference Room, Fuller Labs

Zoom Link: https://wpi.zoom.us/j/98061086576

COMMITTEE MEMBERS

Advisor: Fabricio Murai, Assistant Professor, Data Science

Reader: Yanhua Li, Associate Professor, Data Science

TITLE

Leveraging Multi-task Learning Graph Neural Networks for Improving Fraud Detection

ABSTRACT

This thesis explores the challenges of detecting fraudulent activities, such as money laundering in financial ecosystems and forged reviews on e-commerce websites. One of the major differences between fraud detection and other classification problems is the class imbalance ratio. Class imbalance occurs when the examples in a dataset are not evenly distributed across classes; in a fraud detection problem, for example, the ratio of illicit to licit transactions is very small. In this thesis, we explore three graph datasets commonly used for benchmarking fraud detection techniques: the Elliptic dataset, the fraud Amazon dataset, and the fraud Yelp dataset. Our goal is to augment the raw feature set with node embeddings generated by complementary tasks, such as link prediction and link classification, before the final classification task. The limitations of existing tools in accurately estimating fraud, along with the difficulties associated with detecting fraudulent activities in general, are discussed. Specifically, we use interrelated tasks such as link prediction and link classification to generate node embeddings that capture graph topological information. These embeddings are added to the raw features, and the augmented features are then used to train a supervised machine learning algorithm to detect fraudulent nodes.
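The feature-augmentation idea in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: shallow embeddings trained with a dot-product link-prediction objective stand in for the graph neural network, "added to the raw features" is read here as concatenation, and the toy graph, features, labels, and every function name are hypothetical.

```python
import math
import random


def train_link_prediction_embeddings(num_nodes, edges, dim=4, epochs=100, lr=0.1, seed=0):
    """Learn shallow node embeddings by predicting which node pairs are linked.

    Assumption: a dot-product decoder with logistic loss stands in for the
    GNN-based link prediction task named in the abstract.
    """
    rng = random.Random(seed)
    emb = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_nodes)]
    edge_set = {frozenset(e) for e in edges}
    for _ in range(epochs):
        for u, v in edges:
            pairs = [(u, v, 1.0)]  # observed edge: positive example
            non_neighbors = [w for w in range(num_nodes)
                             if w != u and frozenset((u, w)) not in edge_set]
            if non_neighbors:
                # One sampled non-edge serves as the negative example.
                pairs.append((u, rng.choice(non_neighbors), 0.0))
            for a, b, label in pairs:
                score = sum(x * y for x, y in zip(emb[a], emb[b]))
                pred = 1.0 / (1.0 + math.exp(-score))
                grad = pred - label  # gradient of the logistic loss w.r.t. the score
                for k in range(dim):
                    a_k, b_k = emb[a][k], emb[b][k]
                    emb[a][k] = a_k - lr * grad * b_k
                    emb[b][k] = b_k - lr * grad * a_k
    return emb


def augment_features(raw_features, embeddings):
    """Concatenate each node's raw features with its learned embedding."""
    return [list(f) + list(e) for f, e in zip(raw_features, embeddings)]


def imbalance_ratio(labels):
    """Ratio of illicit (1) to licit (0) examples."""
    illicit = sum(labels)
    return illicit / (len(labels) - illicit)


# Hypothetical toy graph: 6 nodes, 2 raw features per node, one illicit node.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5)]
raw_features = [[0.5, 1.2], [0.4, 1.0], [0.6, 1.1], [0.9, 0.2], [1.0, 0.1], [3.0, 0.0]]
labels = [0, 0, 0, 0, 0, 1]

embeddings = train_link_prediction_embeddings(num_nodes=6, edges=edges)
augmented = augment_features(raw_features, embeddings)
ratio = imbalance_ratio(labels)
```

Each row of `augmented` holds the two raw features followed by the four embedding dimensions, and could be passed to any supervised classifier. On real fraud data the imbalance ratio is far smaller than in this toy example, so the classifier would typically also need class weighting or resampling.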

