Data Science | MS Thesis Defense | Quincy Hershey
Quincy Hershey
MS Thesis Defense
Friday, April 14, 2023
10:00 am – 12:00 pm EDT
Unity Hall 471 Conference Room
https://wpi.zoom.us/my/rcpaffenroth
Committee:
Professor Randy Paffenroth
Professor Seyed Zekavat
Title:
Exploring Neural Network Structure through Iterative Neural Networks: Connections to Dynamical Systems
Abstract:
The study of neural networks often places a heavy emphasis on structural hyperparameters, with the underlying model denoted as a series of nested functions within a rigid architecture. The structure and depth of a given network are typically specified as hyperparameters, including the number and size of layers, in a task-dependent manner. In the process of exploring network architectures, this thesis also examines the roles and relationship of network depth and sparsity in determining performance. The concept of aspect ratio is introduced as a central feature governing the relationship between the depth and width of the combined network weight space. Consideration is given to whether network structure in more traditional architectures serves as a proxy for the roles of depth, aspect ratio, and sparsity. After demonstrating that the traditional feed-forward multi-layer perceptron (MLP) architecture is a special case of a recurrent neural network (RNN), the influence of these factors on network performance is examined from the alternate perspective of an RNN architecture. Recurrent neural networks share clear commonalities with dynamical systems theory, where iterative structures are used in place of nested functions. While capable of replicating the traditional MLP architecture as a specific case, the RNN is a more general form of neural network. Throughout this research, the problem sets employed in benchmarking model performance include typical MNIST-based visual tasks as well as derived datasets representative of anomaly detection, sequence, and memory tasks. Among these comparisons, the relative performance of Long Short-Term Memory (LSTM) and MLP-class models is benchmarked against comparatively less-structured sparse RNN models. Lastly, sparse RNNs are considered as a possible mechanism for gauging problem-set difficulty.