Data Science | Worcester Polytechnic Institute

Undergraduate Courses

DS 1010. Data Science I: Introduction to Data Science

Cat I (offered at least 1x per Year).
This course provides an introduction to the core concepts in Data Science. It covers a broad range of methodologies for working with and making informed decisions based on real-world data. Core topics introduced in this course include basic statistics, data exploration, data cleaning, data visualization, business intelligence, and data analysis. Students will utilize various techniques and tools to explore, understand and visualize real-world data sets from various domains and learn how to communicate data results to decision makers.
Recommended Background: None

DS 2010. Data Science II: Statistical Modeling and Analysis

Cat I (offered at least 1x per Year).
This course focuses on model- and data-driven approaches in Data Science. It covers methods from applied statistics, optimization, and machine learning to analyze and make predictions and inferences from real-world data sets. Topics covered in this course include a brief overview of statistics and linear algebra, followed by introductory machine learning methods such as linear and nonlinear regression, classification, decision trees, and dimension reduction techniques. Data exploration, data cleaning, feature engineering, and the bias-variance tradeoff will also be covered. Students will utilize various techniques and tools to explore and understand real-world data sets from various domains.
Recommended Background: Data science basics equivalent to DS 1010, applied statistics and regression equivalent to MA 2611 and MA 2612, and the ability to write computer programs in a scientific language equivalent to a CS programming course at the CS 1000 or CS 2000 level are assumed.

DS 3010. Data Science III: Computational Methods

Cat I (offered at least 1x per Year).
This course covers a broad range of computational methods to make informed decisions on large and/or high-dimensional data sets following the data science pipeline. Core topics include collecting data via APIs, processing and managing large-scale data, cloud computing, and applying machine learning and deep learning toolkits to extract insights. The goal is to aid decision-making in different domains. Students will learn these skills by working on projects using real-world data sets.
Recommended Background: Data science basics equivalent to DS 1010, data analysis principles and modeling equivalent to DS 2010, basic statistics equivalent to MA 2611 and MA 2612, the ability to program equivalent to (CS 1004, CS 1101, or CS 1102) and (CS 2102, CS 2103, or CS 2119), and an understanding of databases equivalent to CS 3431 or MIS 3720 are assumed.

DS 4099. Special Topics in Data Science

Cat III (offered at discretion of dept/prgm).
Instances of this course will explore advanced and emerging topics in Data Science that are not covered by the current regular Data Science offerings. Content and format will vary to suit the interests and needs of the faculty and students. This course may be repeated by students for credit as topics change.

Graduate Courses

DS 5006. Machine Learning for Engineering and Science Applications

Cat I.
This course surveys the application of data science (DS) and machine learning (ML) to problems arising in engineering and the sciences. While DS and ML have profoundly affected domains such as image understanding and natural language processing, ML has seen comparatively less impact in chemistry, physics, chemical engineering, electrical engineering, and many other important application domains. Topics covered will include predictive modeling, feature engineering, and model assessment, with a particular focus on the small-data limit. We will analyze and apply algorithms with wide applicability in engineering and sciences including classic techniques such as multiple linear regression and random forests, and state-of-the-art techniques such as deep neural networks.
Recommended Background: The class is intended to be accessible to a wide audience in disciplines outside of Computer Science and Data Science, though some background in topics such as statistics or linear algebra, and the ability to learn Python programming at a basic level, would be helpful.

DS 501. Introduction to Data Science

Introduction to Data Science provides an overview of Data Science, covering a broad selection of key challenges in and methodologies for working with big data. Topics to be covered include data collection, integration, management, modeling, analysis, visualization, prediction and informed decision making, as well as data security and data privacy. This introductory course is integrative across the core disciplines of Data Science, including databases, data warehousing, statistics, data mining, data visualization, high performance computing, cloud computing, and business intelligence. Professional skills, such as communication, presentation, and storytelling with data, will be fostered. Students will acquire a working knowledge of data science through hands-on projects and case studies in a variety of business, engineering, social sciences, or life sciences domains. Issues of ethics, leadership, and teamwork are highlighted.

DS 5900. Data Science Internship

The internship is an elective-credit option designed to provide an opportunity to put into practice the principles studied in previous Data Science courses. Internships will be tailored to the specific interests of the student. Each internship must be carried out in cooperation with a sponsoring organization, generally from off campus, and must be approved and advised by a core faculty member in the Data Science program. The internship must include proposal, design, and documentation phases. Following the internship, the student will report on his or her internship activities in a mode outlined by the supervising faculty member. Students may count a maximum of 3 internship credits towards the requirements for the M.S. degree in Data Science. We expect a full-time graduate student to take on only part-time internship work (20 hours or fewer per week) during the regular academic semester, while a full-time internship of 40 hours per week is appropriate during the summer semester as long as the student does not take a full class load at the same time. Internship credit cannot be used towards a certificate degree in Data Science. The internship may not be completed at the student's current place of employment.

DS 595. Special Topics in Data Science

Special Topics in Data Science is a course offering that covers a topic of current interest in detail. It serves as a flexible vehicle for one-time offerings of such topics, as well as for offering new topics before they are made into permanent courses.

DS 596. Independent Study

Independent Study, as the name suggests, is a course that allows a student to study a chosen topic in Data Science under the guidance of a faculty member affiliated with the Data Science program. The student must produce a written report to satisfy the course requirement.

DS 597. Directed Research

Directed Research study, conducted under the guidance of a faculty member affiliated with the Data Science Program, investigates the challenges and techniques central to data science, and aims to develop novel approaches and techniques for solving these challenges. The student who chooses this course must produce a written report to fulfill the course requirement.

DS 598. Graduate Qualifying Project

This 3-credit graduate qualifying project, done in teams, can be taken a second time for credit with the instructor's permission, up to a total of 6 credits. The project is to be carried out in cooperation with a sponsor or industrial partner. It must be overseen by a faculty member affiliated with the Data Science Program. This offering integrates the theory and practice of Data Science, and includes the use of tools and techniques acquired in the Data Science Program. In addition to a written report, the project must be delivered as a formal presentation to faculty of the Data Science Program and sponsors. Professional development skills, such as communication, teamwork, leadership, and collaboration, along with storytelling, will be practiced.

DS 599. Master's Thesis in Data Science

The Master's Thesis in Data Science consists of a research and development project worth a minimum of 9 graduate credit hours and is advised by a faculty member affiliated with the Data Science Program. A thesis proposal must be approved by the DS Program Review Board and the student's advisor before the student can register for more than three thesis credits. The student must satisfactorily complete a written thesis document and present the results to the DS faculty in a public presentation.

DS 699. Dissertation Research

Intended for doctoral students admitted to candidacy wishing to obtain research credit toward their dissertations.