Data Science | MS Thesis Defense | Quincy Hershey

Quincy Hershey

MS Thesis Defense

Friday, April 14, 2023

10:00 am – 12:00 pm EST

Unity Hall 471 Conference Room

https://wpi.zoom.us/my/rcpaffenroth

Committee:

Professor Randy Paffenroth

Professor Seyed Zekavat

Title:

Exploring Neural Network Structure through Iterative Neural Networks: Connections to Dynamical Systems



Abstract:

The study of neural networks often places a heavy emphasis on structural hyperparameters, with the underlying model denoted as a series of nested functions within a rigid architecture. The structure and depth of a given network are typically specified as hyperparameters, including the number and size of layers, in a task-dependent manner. In the process of exploring network architectures, this thesis also examines the roles and relationship of network depth and sparsity in determining performance. The concept of aspect ratio is introduced as a central feature governing the relationship between the depth and width of the combined network weight space. Consideration is given to whether network structure in more traditional architectures serves as a proxy for the roles of depth, aspect ratio, and sparsity. After demonstrating that the traditional feedforward multi-layer perceptron (MLP) architecture is a special case of a recurrent neural network (RNN), this thesis examines the influence of these factors on network performance from the alternate perspective of an RNN architecture. Recurrent neural networks share clear commonalities with dynamical systems theory, where iterative structures are prominently used in place of nested functions. While capable of replicating the traditional MLP architecture as a specific case, the RNN exists as a more generalized form of neural network. Throughout this research, the problem sets employed in benchmarking model performance include typical MNIST-based visual tasks as well as derived datasets representative of anomaly detection, sequence, and memory tasks. Among these comparisons, the relative performance of Long Short-Term Memory (LSTM) and MLP-class models is benchmarked against comparatively less-structured sparse RNN models. Lastly, sparse RNNs are considered as a possible mechanism for gauging problem-set difficulty.
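
For readers unfamiliar with the MLP-to-RNN reduction mentioned in the abstract, the following minimal sketch (an illustration in NumPy, not code from the thesis) shows one sense in which a feedforward pass is an iterated recurrent update: a weight-tied MLP of a given depth is an RNN unrolled for that many steps, receiving its input only as the initial hidden state.

import numpy as np

rng = np.random.default_rng(0)
width, depth = 8, 4

# One shared weight matrix and bias; iterating them `depth` times
# computes a weight-tied MLP, i.e. an RNN with no external input
# after the initial state.
W = rng.standard_normal((width, width)) / np.sqrt(width)
b = np.zeros(width)

def rnn_step(h):
    # Recurrent update h_{t+1} = tanh(W h_t + b); in the MLP view,
    # each "time step" plays the role of a layer.
    return np.tanh(W @ h + b)

h = rng.standard_normal(width)  # network input as the initial state
for _ in range(depth):
    h = rnn_step(h)             # depth iterations = depth layers
print(h)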

Department:

Data Science

Contact:

Kelsey Briggs