Data Science M.S. Thesis Presentation by Lily Robin O'Leary Amadeo

July 24, 2015

Lily was recruited by Capital One and now resides in Washington, D.C., where her nameplate reads DATA SCIENTIST.

Lily presented her M.S. thesis on July 28th, 2015, after completing a two-year master's program in just one year. Her presentation was a roaring success. Lily had been awarded the Dean of Arts and Science Data Science fellowship for fall 2014 and a Data Science Teaching Assistantship position for spring 2015. We wish her all the best!

Title: Large Scale Matrix Completion and Recommender Systems

Abstract:
The goal of this thesis is to extend the theory and practice of matrix completion algorithms and to examine how they can be utilized, improved, and scaled up to handle large data sets. Matrix completion involves predicting missing entries in real-world data matrices using the modeling assumption that the fully observed matrix is low-rank. Low-rank matrices appear across a broad selection of domains, and such a modeling assumption is similar in spirit to Principal Component Analysis. Our focus is on large-scale problems, where the matrices have millions of rows and columns. In this thesis we provide new analysis for the convergence rates of matrix completion techniques using convex nuclear norm relaxation. In addition, we validate these results on both synthetic data and data from two real-world domains (recommender systems and Internet tomography). The results we obtain show that, with an empirical, data-inspired understanding of various parameters in the algorithm, this matrix completion problem can be solved more efficiently than some previous theory suggests, and therefore can be extended to much larger problems with greater ease.

Adviser: Randy Paffenroth
Reader: Andy Trapp