Research Experiences For Undergraduates (REU)

REU 2013 Projects

Analyzing Databases for Patterns in Customer Usage for GenomeQuest

Sponsor: Richard Resnick, CEO of GenomeQuest
Advisor: Professor Matthew Willyard

Anastasia Bergara    Alyssa Cuyjet    Matt Smith    Brittany Street   

Our project focuses on discovering meaningful patterns in GenomeQuest's customer usage data files using the Knowledge Discovery in Databases (KDD) process. We look for patterns in customer usage, tracking differences across regions, types of clientele, and any other groupings that emerge. This requires extensive processing and transformation of the data before data mining techniques can be applied to distinguish patterns. We are using multiple data mining techniques, including logistic regression, decision trees, decision forests, clustering, association rules, and neural networks. We examine the procedure for each technique and its application to our project, along with preliminary results.

Forecasting Latent Business Conditions Using Macroeconomic Factors and the Kalman Filter

Sponsors: Dr. Henri Fouda and Dr. Ravi Shastri, Wellington Management
Advisor: Professor Marcel Blais

Rachel Gosch    Jerin Kurien    Sonia Mahop    Kimberly McCarty   

Defining and measuring a financial concept such as the current state of the economy is a difficult task. Reliably forecasting business conditions in order to hedge against adverse future market movements poses an even greater challenge, particularly in portfolio management. We consider issues involved in this type of estimation, including the use of numerous interrelated macroeconomic factors and varying information flow frequencies, and develop an underlying structure for business conditions. Our model builds on the Aruoba-Diebold-Scotti Index, which has been shown to be effective in tracking the state of the economy. We extend this framework to forecast major shifts in the economy based on intermediate data. In addition, we hope to make our model applicable in data-sparse environments, notably emerging markets.

Analyzing Exome Sequencing Data to Detect Familial ALS Genes

Sponsor: Dr. John Landers, University of Massachusetts Medical School
Advisor: Professor Zheyang Wu

Eric Oh    Ryan Pyle    Michael Wingate    Katherine Young   

Familial ALS (Amyotrophic Lateral Sclerosis) is an inheritable neurodegenerative disease affecting nerve cells in the brain and spinal cord. Our project develops statistical methodology and analyzes cutting-edge genetic data from exome sequencing and genome-wide association studies of familial ALS. After exploring over fifteen methods for analyzing this type of data, we have selected four that we believe to be the most effective: C-alpha, Replication Based Testing, the Joint-Rank Method, and the Variable Threshold Method. We also aim to develop new methods based on the Rank Truncated Product test. Incorporating prior biological information, we are now applying these methods to our data and hope to find new evidence of familial ALS gene associations.

Last modified: Jul 13, 2013, 14:41 EDT