Privacy in Statistics and Machine Learning
In this course we will learn how common anonymisation techniques are open to re-identification attacks. We will also learn how to describe, quantify, and protect against private data leakage using the framework of differential privacy, a mathematical conception of privacy preservation. Differential privacy can help to share statistical analyses and models trained on sensitive data without compromising privacy and without great loss of information.
Statisticians and data scientists often analyse data that contain sensitive information. Sharing analysis results, aggregate information, or models derived from such sensitive data may compromise individuals' privacy, and the risk accumulates with every release. Nowadays, users can interact with machine learning tools and statistics dashboards that are constantly updated and automatically process vast amounts of data, such as ubiquitous sensor data, detailed health records, private email and chat correspondence, and photo and video data shared on social media. This elevates the risk and underscores the ethical responsibility to safeguard privacy through privacy-preserving statistical data analysis.
MSc Programme in Statistics
- definition and interpretation of differential privacy
- privacy attacks on “de-identified” data and statistical data releases, such as re-identification, reconstruction, or membership attacks
- basic differentially-private algorithms (for example, Laplace mechanism)
- identifying and demonstrating risks to privacy in data science settings
- determining privacy guarantees after composition or post-processing
- presenting technical content in writing
- implement and present instructive examples of attacks on statistical data privacy
- implement privacy-preserving algorithms and experimentally validate their performance and utility
- effectively communicate and discuss technicalities of differential privacy and practical implications for data science applications
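As an illustration of the kind of algorithm listed above, the following is a minimal sketch of the Laplace mechanism for a counting query. It is not course material: the function name `laplace_mechanism`, the toy `ages` data, and the chosen `epsilon` are assumptions made for this example. Laplace noise is sampled as the difference of two exponential draws so that only the Python standard library is needed.

```python
import random


def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Return true_value plus Laplace noise with scale sensitivity / epsilon.

    Hypothetical helper for illustration only.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    # The difference of two independent Exponential(rate = 1 / scale) draws
    # follows a Laplace distribution with location 0 and the given scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_value + noise


rng = random.Random(0)

# Made-up toy data: ages of six individuals.
ages = [34, 45, 23, 67, 51, 38]

# Counting query: adding or removing one person changes the count by at
# most 1, so the sensitivity is 1.
true_count = sum(1 for age in ages if age > 40)

epsilon = 0.5
noisy_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=epsilon, rng=rng)

# Post-processing: rounding the released value does not weaken the guarantee.
rounded_count = round(noisy_count)

# Basic sequential composition: k releases at epsilon each are covered by a
# total privacy budget of at most k * epsilon.
k = 3
total_epsilon = k * epsilon
```

A smaller `epsilon` gives stronger privacy but larger noise, which is the privacy and utility trade-off that experimental validation would examine.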
Lectures, in-class exercises, exercise classes and TA sessions for work on assignments and projects with written hand-ins, code notebook and report hand-ins, and student presentations.
The course literature will be announced on the Absalon course page.
Related textbooks include, for example,
The Algorithmic Foundations of Differential Privacy by C Dwork & A Roth
The Complexity of Differential Privacy by S Vadhan
(PDFs freely available on the authors' websites).
Students should have a solid grounding in probability and
statistics, linear algebra, vector calculus, and algorithms.
Students should be comfortable reading and writing mathematical
proofs involving algorithms and probability. Examples of relevant courses:
StatMet/MStat/MatStat, LinAlgMat/LinAlgDat, Sand, ModComp, or
Academic qualifications equivalent to a BSc degree in a quantitative field are recommended (for example, BSc in Mathematics/Actuarial Mathematics/Mathematics-Economics, BSc in Machine Learning and Data Science, BSc in Computer Science with suitable specialisation, or similar).
Students should be able to independently program basic data simulations and analyses in Python, Julia, or R.
- 7.5 ECTS
- Type of assessment
- Type of assessment details
- The exam is composed of the following elements, to be completed during the course:
(1) a written summary of one assigned lecture,
(2) between 1 and 3 exercise assignments,
(3) a course project submitted as reproducible code notebook that combines the project report and its code implementations, and
(4) a presentation (of a solution to an assignment question or about the project).
All components need to be approved to pass the course. If a component is not passed, the student must take the re-exam.
Some exam elements are to be completed individually, others in groups of up to three students.
- All aids allowed
- Marking scale
- passed/not passed
- Censorship form
- No external censorship
One internal examiner
Criteria for exam assessment
Each of the exam elements must be approved separately to pass the course.
For an element to be approved, the student must demonstrate in a satisfactory way that they have mastered the learning outcome of the course corresponding to that element.
Single subject courses (day)
- Project work
- Course number
- 7.5 ECTS
- Programme level
- Full Degree Master
- Block 2
The number of seats may be reduced in the late registration period.
- Study Board of Mathematics and Computer Science
- Department of Mathematical Sciences
- Faculty of Science
- Sebastian Weichwald