Undergraduate Courses

DS 1010. DATA SCIENCE I: INTRODUCTION TO DATA SCIENCE

  • Cat. I This course provides an introduction to the core concepts in Data Science. It covers a broad range of methodologies for working with and making informed decisions based on real-world data. Core topics introduced in this course include basic statistics, data exploration, data cleaning, data visualization, business intelligence, and data analysis. Students will utilize various techniques and tools to explore, understand, and visualize real-world data sets from various domains and learn how to communicate data results to decision makers. Recommended background: None.

DS 2010. DATA SCIENCE II: MODELING AND DATA ANALYSIS

  • Cat. I This course focuses on model- and data-driven approaches in Data Science. It covers methods from applied statistics (regression), optimization, and machine learning to analyze real-world data sets and to make predictions and inferences from them. Topics introduced in this course include basic statistics (regression), analytics (explanatory and predictive), basics of machine learning (classification and clustering), eigenvalues and singular matrices, data exploration, data cleaning, data visualization, and business intelligence. Students will utilize various techniques and tools to explore and understand real-world data sets from various domains. Recommended background: Data science basics equivalent to DS 1010, applied statistics and regression equivalent to MA 2611 and MA 2612, and the ability to write computer programs in a scientific language equivalent to a CS programming course at the CS 1000 or CS 2000 level are assumed.

DS 3001. FOUNDATIONS OF DATA SCIENCE

  • This course provides an introduction to the core ideas in Data Science. It covers a broad range of methodologies for working with and making informed decisions based on real-world data. Core topics introduced in this course include data collection, data management, statistical learning, data mining, data visualization, cloud computing, and business intelligence. Students will acquire experience with big data problems through hands-on projects using real-world data sets. Recommended background for this course includes statistics knowledge equivalent to MA 2611 and MA 2612, linear algebra equivalent to MA 2071, and the ability to program equivalent to (CS 1004 or CS 1101 or CS 1102) and (CS 2102 or CS 2119). This course does not fulfill Mathematics, Basic Science or Engineering Science/Design credits.

DS 3010. DATA SCIENCE III: COMPUTATIONAL DATA INTELLIGENCE

  • Cat. I This course introduces core methods in Data Science. It covers a broad range of methodologies for working with large and/or high-dimensional data sets to make informed decisions based on real-world data. Core topics introduced in this course include the data cycle from collection through use, data management of large-scale data, cloud computing, machine learning, and deep learning. Students will acquire experience with big data problems through hands-on projects using real-world data sets. Recommended background: Data science basics equivalent to DS 1010, data analysis principles and modeling equivalent to DS 2010, knowledge of basic statistics equivalent to (MA 2611 and MA 2612), the ability to program equivalent to (CS 1004 or CS 1101 or CS 1102) and (CS 2102, CS 2103 or CS 2119), and an understanding of databases equivalent to (CS 3431 or MIS 3720) are assumed.

DS 4433. BIG DATA MANAGEMENT AND ANALYTICS

  • Cat. I This course introduces the emerging techniques and infrastructures for big data management and analytics, including parallel and distributed database systems, map-reduce, Spark, and NoSQL infrastructures, data stream processing systems, scalable analytics and mining, and cloud-based computing. Query processing and optimization, access methods, and storage layouts developed on these infrastructures will be covered. Students are expected to engage in hands-on projects using one or more of these technologies. Recommended background: Knowledge of database systems at the level of CS 4432 and programming experience are assumed.

Graduate Courses

DS 502. STATISTICAL METHODS FOR DATA SCIENCE

  • This course surveys the statistical methods most useful in data science applications. Topics covered include predictive modeling methods, including multiple linear regression and time series; data dimension reduction; discrimination and classification methods; clustering methods; and committee methods. Students will implement these methods using statistical software. Prerequisites: Statistics at the level of MA 2611 and MA 2612 and linear algebra at the level of MA 2071.

DS 503. BIG DATA MANAGEMENT

  • Emerging applications in science and engineering disciplines generate and collect data at unprecedented speed, scale, and complexity that need to be managed and analyzed efficiently. This course introduces the emerging techniques and infrastructures developed for big data management, including parallel and distributed database systems, map-reduce infrastructures, scalable platforms for complex data types, stream processing systems, and cloud-based computing. Query processing, optimization, access methods, storage layouts, and energy management techniques developed on these infrastructures will be covered. Students are expected to engage in hands-on projects using one or more of these technologies. Prerequisites: A beginning course in databases at the level of CS 4432 or equivalent knowledge, and programming experience.

DS 504. BIG DATA ANALYTICS

  • Innovation and discoveries are no longer hindered by the ability to collect data, but by the ability to summarize, analyze, and discover knowledge from the collected data in a scalable fashion. This course covers computational techniques and algorithms for analyzing and mining patterns in large-scale datasets. Techniques studied address data analysis issues related to data volume (scalable and distributed analysis), data velocity (high-speed data streams), data variety (complex, heterogeneous, or unstructured data), and data veracity (data uncertainty). Techniques include mining and machine learning techniques for complex data types, and scale-up and scale-out strategies that leverage big data infrastructures. Real-world applications using these techniques, for instance social media analysis and scientific data mining, are selectively discussed. Students are expected to engage in hands-on projects using one or more of these technologies. Prerequisites: A beginning course in databases and a beginning course in data mining, or equivalent knowledge, and programming experience.

DS 517. MATHEMATICAL FOUNDATIONS FOR DATA SCIENCE

  • This class focuses on the essential statistics and linear algebra skills required for Data Science students. The class builds the foundation for students' theoretical and computational abilities to analyze high-dimensional data sets. Topics covered include Bayes’ theorem, the central limit theorem, hypothesis testing, linear equations, linear transformations, matrix algebra, eigenvalues and eigenvectors, and sampling techniques, including the bootstrap and Markov chain Monte Carlo. Students will use these techniques while engaging in hands-on projects with real data. Prerequisites: Some knowledge of integral and differential calculus is recommended.

DS 541. DEEP LEARNING

  • This course will offer a mathematical and practical perspective on artificial neural networks for machine learning. Students will learn about the most prominent network architectures, including multi-layer feedforward neural networks, convolutional neural networks (CNNs), auto-encoders, recurrent neural networks (RNNs), and generative adversarial networks (GANs). This course will also teach students the optimization and regularization techniques used to train these networks, such as back-propagation, stochastic gradient descent, dropout, pooling, and batch normalization. Connections to related machine learning techniques and algorithms, such as probabilistic graphical models, will be explored. In addition to understanding the mathematics behind deep learning, students will also engage in hands-on course projects. Students will have the opportunity to train neural networks for a wide range of applications, such as object detection, facial expression recognition, handwriting analysis, and natural language processing. Prerequisites: Machine Learning (CS 539), and knowledge of Linear Algebra (such as MA 2071) and Algorithms (such as CS 2223).

DS 598. GRADUATE QUALIFYING PROJECT

  • This 3-credit graduate qualifying project, typically done in teams, is to be carried out in cooperation with a sponsor or industrial partner. It must be overseen by a faculty member affiliated with the Data Science Program. This offering integrates the theory and practice of Data Science, and should include the utilization of tools and techniques acquired in the Data Science Program. In addition to a written report, this project must be presented in a formal presentation to faculty of the Data Science Program and sponsors. Professional development skills, such as communication, teamwork, leadership, and collaboration, along with storytelling, will be practiced. Prerequisite: DS 501, completion of at least 24 credits of the DS degree, or consent of the instructor.