CSC 380: Principles of Data Science (Spring 2023)

This course introduces students to the principles of data science that computer scientists need to make effective decisions in their professional careers. A number of computer science sub-disciplines now rely on data collection and analysis. For example, computer systems are now complicated enough that comparing the execution performance of two programs is a statistical estimation problem rather than a deterministic computation. This course teaches students the basic principles of how to properly collect and process data in order to draw appropriate conclusions from it. The course has three main components: data analysis, machine learning, and a project in which students apply the concepts discussed in class to a substantial open-ended problem.

Logistics info

Time and venue: TuTh 2-3:15pm, M. Pacheco ILC 130

Syllabus

Piazza link (access code: wildcats)

Gradescope (entry code: BBRJBW; NB: Please make sure your Gradescope email address is the same as the one you have on D2L.)

D2L course webpage: lecture video recordings will be available under “UA Tools” -> “Zoom” (NB: Zoom links are for recordings only, not for live-streaming lectures.)

We will use Piazza for important announcements and Q&A. Some general rules:

Course staff

Instructors: Chicheng Zhang and Kyoungseok Jang; Emails: {chichengz, ksajks} at arizona.edu

Teaching assistants: Saiful Islam Salim, Yinan Li, and Sayyed Faraz Mohseni; Emails: {saifulislam, yinanli, mohseni} at arizona.edu

Office Hours:

Chicheng Zhang: Tuesdays 3:30-4:30pm, Gould-Simpson 720 (before Feb 28)

Kyoungseok Jang: Tuesdays 3:30-4:30pm, Gould-Simpson 732 (after Feb 28)

Saiful Islam Salim: Wednesdays 10-11am, Gould-Simpson 856

Tugay Bilgis: Thursdays 10-11am, Gould-Simpson 942

Yinan Li: Mondays 12:45-1:45pm, Gould-Simpson 856

Sayyed Faraz Mohseni: Fridays 12-1pm, Gould-Simpson 837

Textbook

There is no single designated textbook for this course. Much of the course material and many of the assigned readings will be based on the following books:

WJ: Watkins, J., “An Introduction to the Science of Statistics: From Theory to Implementation”

MK: Murphy, K., “Machine Learning: A Probabilistic Perspective.” MIT Press, 2012 (accessible online via UA library)

WL: Wasserman, L., “All of Statistics: A Concise Course in Statistical Inference.” Springer, 2004 (accessible online via UA library)

Other useful resources

Machine learning courses at UA

CSC 535 Probabilistic Graphical Models by Kobus Barnard

ISTA 457/INFO 557 Neural Networks by Steven Bethard

CSC 665 Online Learning and Multi-armed Bandits by Kwang-Sung Jun

INFO 521 Introduction to Machine Learning by Clayton Morrison

CSC 665 Advanced Topics in Probabilistic Graphical Models by Jason Pacheco

CSC 580 Principles of Machine Learning by Carlos Scheidegger

MIS 601 Statistical Foundations of Machine Learning by Junming Yin

MATH 574M Statistical Machine Learning by Helen Zhang

CSC 696H Topics in Reinforcement Learning Theory by Chicheng Zhang