PUBH7475

PUBH 7475 - Statistical Learning and Data Mining (3 Cr.)

School of Public Health - Adm (11162) TPUB - School of Public Health

Course description

This course is closely related to machine learning and data science, with an emphasis on the statistical perspective. It introduces statistical and computational techniques for supervised and unsupervised learning. Topics include basic concepts (such as training versus test error, cross-validation, and the bias-variance trade-off), penalized/regularized regression, linear discriminant analysis, tree-structured classifiers, neural networks, support vector machines, classifier ensembles (such as bagging and boosting), and unsupervised learning (dimension reduction, cluster analysis, and network analysis). These techniques are applicable in many fields, such as business and bioinformatics/computational biology.

Prerequisites: Statistics at the level of PUBH 7405–7406 or equivalent (e.g., Stat 5303) at minimum, preferably higher, or permission of the instructor; familiarity with programming in R (or Python, if you are willing to learn it on your own).

Minimum credits

3

Maximum credits

3

Is this course repeatable?

No

Grading basis

OPT - Student Option

Lecture

Fulfills the writing intensive requirement?

No

Typically offered term(s)

Every Spring