Machine Learning A (MLA)
Course content
The course introduces basic theory and algorithms of machine learning. The course covers the following tentative list of topics:
- Supervised learning setting
- Classification
- Regression
- Unsupervised learning setting
- Concentration of measure inequalities
- Markov's
- Chebyshev's
- Hoeffding's
- Analysis of generalisation in classification
- Validation and cross-validation
- Generalisation bound for a single hypothesis
- Generalisation bound for a finite hypothesis class
- Occam's razor - generalisation bound for a countably infinite hypothesis class
- Algorithms
- K-Nearest Neighbors
- Perceptron
- Logistic Regression
- Linear Regression
- Feature transformations and classification/regression in transformed feature spaces
- Various forms of regularisation
- Regularisation terms
- Dimensionality reduction
- Random Forests and Decision Trees
- Neural Networks and introduction to Deep Learning
- Principal Component Analysis (PCA)
- Clustering
- Assumptions behind the algorithms taught in the course, their implications, and common pitfalls
- Overfitting
- Internal overfitting within algorithms due to overly complex hypothesis spaces
- External overfitting outside algorithms due to application of an excessive number of algorithms to a dataset
- The i.i.d. assumption
- The i.i.d. assumption is behind everything taught in the course
- Consequences of violation of the i.i.d. assumption
- Special case: sampling bias
- Failure of generalisation guarantees
- Implications of the i.i.d. assumption
- Biases in the training data propagate into predictions
- Correlation ≠ Causality
- The course only studies statistical correlations / dependencies in the data. Causal inference is not covered in the course.
- Overfitting
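To give a feel for the generalisation bounds listed above, here is a minimal sketch of the standard Hoeffding-plus-union-bound result for a finite hypothesis class. It assumes a loss bounded in [0, 1] and an i.i.d. sample of size n; the function name `generalisation_gap` and its parameters are hypothetical and are not taken from the course material. Under these assumptions, with probability at least 1 - delta, the expected loss of every hypothesis in a class of size M exceeds its empirical loss by at most sqrt(ln(M / delta) / (2n)).

```python
import math


def generalisation_gap(n: int, delta: float, num_hypotheses: int = 1) -> float:
    """Upper bound on (expected loss - empirical loss), holding with
    probability at least 1 - delta for all hypotheses simultaneously.

    Assumes a loss in [0, 1], n i.i.d. samples, and a finite hypothesis
    class of size num_hypotheses (1 gives the single-hypothesis bound).
    """
    if n <= 0 or num_hypotheses < 1 or not (0.0 < delta < 1.0):
        raise ValueError("need n > 0, num_hypotheses >= 1 and 0 < delta < 1")
    # One-sided Hoeffding inequality combined with a union bound over the class.
    return math.sqrt(math.log(num_hypotheses / delta) / (2 * n))


if __name__ == "__main__":
    # The bound grows only logarithmically with the size of the class.
    for m in (1, 100, 10_000):
        print(m, generalisation_gap(n=1000, delta=0.05, num_hypotheses=m))
```

The bound shrinks at rate 1/sqrt(n) and depends only logarithmically on the number of hypotheses, which is why selecting among many hypotheses on the same data is possible but not free.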
WARNING: The course assumes solid math and programming skills. Please check the "Recommended Academic Qualifications" box below and the self-assessment assignment at https://sites.google.com/diku.edu/machine-learning-courses/ml-a. We advise against taking the course if you do not meet the academic qualifications.
Physical & Online: This is a physical course, but we support remote participation.
BSc Programme in Cognitive Data Science
BSc Programme in Computer Science (data science track)
BSc Programme in Machine Learning and Data Science
MSc Programme in Computer Science
MSc Programme in Computer Science (part time)
MSc Programme in Computer Science (with minor subject)
MSc Programme in Actuarial Mathematics
MSc Programme in Bioinformatics
MSc Programme in Mathematics-Economics
MSc Programme in Statistics
At course completion, the successful student will have:
Knowledge of
- the basic principles of machine learning;
- basic probability theory for modelling and analysing data;
- the theoretical concepts underlying classification, regression, and clustering;
- the mathematical foundations of selected machine learning algorithms;
- basic assumptions behind the algorithms studied in the course, their implications and common pitfalls.
Skills in
- proving generalisation bounds based on validation errors;
- proving generalisation bounds for countable hypothesis classes;
- applying linear and non-linear techniques for classification and regression;
- performing elementary dimensionality reduction;
- elementary data clustering;
- implementing selected machine learning algorithms;
- visualising and evaluating results obtained with machine learning techniques;
- using software libraries for solving machine learning problems;
- identifying and handling common pitfalls in machine learning.
Competences in
- recognising and describing possible applications of machine learning;
- formalising and rigorously analysing machine learning problems;
- comparing, appraising and selecting machine learning methods for specific tasks;
- solving real-world data mining and pattern recognition problems by using machine learning techniques.
Weekly lectures, weekly home assignments, exercise sessions
Will be published on Absalon.
1. Knowledge of Linear Algebra corresponding to the course Lineær algebra i datalogi (LinAlgDat).
2. Knowledge of Calculus corresponding to Introduktion til matematik i naturvidenskab (MatintroNat) or Matematisk analyse og sandsynlighedsteori i datalogi (MASD).
3. Knowledge of Probability Theory corresponding to Sandsynlighedsregning og statistik (SS), Grundlæggende statistik og sandsynlighedsregning (GSS) or Matematisk analyse og sandsynlighedsteori i datalogi (MASD) and Modelling and Analysis of Data (MAD).
4. Knowledge of Discrete Mathematics corresponding to Diskret matematik og formelle sprog (DMFS) or Diskret Matematik og algoritmer (DMA).
5. Knowledge of programming corresponding to Programmering og problemløsning (PoP) and experience with programming in Python.
You can test your skills by solving the self-assessment assignment at https://sites.google.com/diku.edu/machine-learning-courses/ml-a.
The course is identical to approximately 50% of NDAB20000U Introduktion til Machine Learning (IntroML). It is not allowed to pass both this course and Introduktion til Machine Learning (IntroML).
The course is similar to NDAB21005U Machine Learning A (MLA), and it is therefore not allowed to pass both courses.
- ECTS
- 7,5 ECTS
- Type of assessment
- Written assignment, 7 days
- Type of assessment details
- The exam is a 7-day written take-home assignment (must be solved individually).
The exam will be handed out on Friday in block week 7 and must be handed in the following Friday.
*Please note that the planned exam workload is 25 hours. We provide extra days to allow students to combine the exam with other potential duties, such as other exams or work commitments.
- Aid
- All aids allowed
- Marking scale
- 7-point grading scale
- Censorship form
- External censorship
Criteria for exam assessment
See Learning Outcome.
Single subject courses (day)
- Category
- Hours
- Lectures
- 34
- Preparation
- 8
- Theory exercises
- 57
- Practical exercises
- 57
- Exam Preparation
- 25
- Exam
- 25
- Total
- 206
Course information
- Language
- English
- Course number
- NDAK22000U
- ECTS
- 7,5 ECTS
- Programme level
- Full Degree Master
- Duration
- 1 block
- Placement
- Block 1
- Schedule group
- B
- Capacity
- No limit
The number of seats may be reduced in the late registration period
- Study board
- Study Board of Mathematics and Computer Science
Contracting department
- Department of Computer Science
Contracting faculty
- Faculty of Science
Course Coordinator
- Sadegh Talebi
Teacher
Yevgeny Seldin, Christian Igel & Sadegh Talebi