Introduction to Data Science (IDS)
Course content
The amount and complexity of available data are steadily
increasing. To make use of this wealth of information, computing
systems are needed that turn data into knowledge. Machine learning
is about developing software that automatically analyses data to
make predictions, categorisations, and recommendations. Machine
learning algorithms are already an integral part of today's
computing systems, for example in search engines, recommender
systems, and biometric applications.
Machine learning provides a set of tools that is widely
applicable for data analysis within a diverse set of problem
domains, such as data mining, search engines, digital image and
signal analysis, natural language modelling, bioinformatics,
physics, economics, and biology.
The purpose of the course is to introduce non-Computer
Science students to probabilistic data modelling and
the most common techniques from statistical machine learning and
data mining. The students will obtain a working knowledge of basic
data modelling and data analysis using fundamental machine learning
techniques.
This course is relevant for students from, among others, the
studies of Cognition and IT, Bioinformatics, Physics, Biology,
Chemistry, Economics, and Psychology.
The tentative list of topics is:
- Foundations of statistical learning and probability theory.
- Classification methods, such as linear models and K-nearest neighbours.
- Regression methods, such as linear regression.
- Bayesian statistics.
- Clustering.
- Dimensionality reduction and visualisation techniques, such as principal component analysis (PCA).
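To give a flavour of the classification methods on the topic list, the following is a minimal sketch of K-nearest-neighbour classification in plain Python. It is not taken from the course material: the function name `knn_predict`, the choice of Euclidean distance, and the toy data are assumptions made for illustration only.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [math.dist(point, query) for point in train_points]
    # Indices of the k training points closest to the query.
    nearest = sorted(range(len(train_points)), key=lambda i: distances[i])[:k]
    # Majority vote over the labels of those neighbours.
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data (made up for illustration): two well-separated groups in 2D.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (4.0, 4.0)))
```

In practice a toolbox implementation would normally be preferred; the sketch only shows the idea that a new point is assigned the most common label among its nearest neighbours.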
MSc Programme in Bioinformatics
MSc Programme in IT and Cognition
MSc Programme in Molecular Biomedicine
MSc Programme in Environmental Science
MSc Programme in Agriculture
MSc Climate Change
At course completion, the successful student will have:
Knowledge of
- the general principles of data analysis;
- elementary probability theory for modelling and analysing data;
- elementary Bayesian statistics;
- the basic concepts underlying classification, regression, and clustering;
- common pitfalls in machine learning.
Skills in
- applying linear and non-linear techniques for classification and regression;
- elementary data clustering;
- visualising and evaluating results obtained with machine learning techniques;
- identifying and handling common pitfalls in machine learning;
- using machine learning and data mining toolboxes.
Competences in
- recognising and describing possible applications of machine learning and data analysis in their field of science;
- comparing, appraising and selecting machine learning methods for specific tasks;
- solving real-world data mining and pattern recognition problems by using machine learning techniques.
Lecture and exercise classes
See Absalon when the course is set up. Students who have not studied mathematics since high school may benefit from a calculus refresher such as "Calculus for Dummies".
Basic knowledge of calculus and programming is required. Python
is used as the programming language. Students who have experience
with programming should be able to learn the necessary Python
elements with little difficulty.
Academic qualifications equivalent to a BSc degree are
recommended.
The courses NDAK16003U Introduction to Data Science (IDS) and NDAB15001U Modelling and Analysis of Data (MAD) have a very substantial overlap both in topics and level, and it is therefore not recommended that students pass both these courses.
PhD students can register for MSc courses by following the same procedure as credit students.
- ECTS: 7.5 ECTS
- Type of assessment: Continuous assessment
- Type of assessment details: Assessment of 4-5 assignments, weighted equally. Assignments are individual. Passed assignments cannot be transferred to another block.
- Aid: All aids allowed. The use of Large Language Models (LLM)/Large Multimodal Models (LMM), such as ChatGPT and GPT-4, is permitted for the ordinary exam.
- Marking scale: 7-point grading scale
- Censorship form: No external censorship. Several internal examiners.
- Re-exam: A 20-minute oral exam, without preparation, in the course curriculum. No aids allowed.
Criteria for exam assessment
See Learning Outcome.
Single subject courses (day)
- Category: Hours
- Lectures: 28
- Preparation: 30
- Theory exercises: 74
- Practical exercises: 74
- Total: 206
Course information
- Language: English
- Course number: NDAK16003U
- ECTS: 7.5 ECTS
- Programme level: Full Degree Master
- Duration: 1 block
- Placement: Block 3
- Schedule group: A
- Capacity: No limit. The number of seats may be reduced in the late registration period.
- Study board: Study Board of Mathematics and Computer Science
- Contracting department: Department of Computer Science
- Contracting faculty: Faculty of Science
- Course coordinator: Daniel Hershcovich
- Teachers: Daniel Hershcovich, Stella Frank