Introduction to Data Science (IDS)

Course content

The amount and complexity of available data are steadily increasing. To make use of this wealth of information, computing systems are needed that turn data into knowledge. Machine learning is about developing software that automatically analyses data to make predictions, categorisations, and recommendations. Machine learning algorithms are already an integral part of today's computing systems, for example in search engines, recommender systems, and biometric applications. Machine learning provides a widely applicable set of tools for data analysis in diverse problem domains such as data mining, digital image and signal analysis, natural language modelling, bioinformatics, physics, economics, and biology.

The purpose of the course is to introduce non-Computer Science students to probabilistic data modelling and the most common techniques from statistical machine learning and data mining. The students will obtain a working knowledge of basic data modelling and data analysis using fundamental machine learning techniques.

This course is relevant for students from, among others, the programmes in IT and Cognition, Bioinformatics, Physics, Biology, Chemistry, Economics, and Psychology.

The course covers the following tentative topic list:

  • Foundations of statistical learning and probability theory.
  • Classification methods, such as linear models and k-nearest neighbours.
  • Regression methods, such as linear regression.
  • Bayesian statistics.
  • Clustering.
  • Dimensionality reduction and visualisation techniques such as principal component analysis (PCA).
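
To give a flavour of one listed topic, the following is a minimal sketch of k-nearest-neighbour classification in plain Python. It is not taken from the course material: the function name knn_predict, the toy data, and the choice of Euclidean distance with k = 3 are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest points (sorted by distance only).
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the labels of those neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration: two well-separated groups in 2D.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
train_labels = ["a", "a", "a", "b", "b", "b"]

prediction_near_a = knn_predict(train_points, train_labels, (1.1, 1.0))
prediction_near_b = knn_predict(train_points, train_labels, (4.0, 4.0))
print(prediction_near_a, prediction_near_b)
```

In practice a machine learning toolbox would normally be used instead of a hand-written function; the sketch only shows that the underlying idea (measure distances, pick the closest points, take a vote) needs little more than basic programming.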

Education

MSc Programme in Bioinformatics

MSc Programme in IT and Cognition

MSc Programme in Molecular Biomedicine

MSc Programme in Environmental Science

MSc Programme in Agriculture

MSc Climate Change

Learning outcome

At course completion, the successful student will have:

Knowledge of

  • the general principles of data analysis;
  • elementary probability theory for modelling and analysing data;
  • elementary Bayesian statistics;
  • the basic concepts underlying classification, regression, and clustering;
  • common pitfalls in machine learning.

Skills in

  • applying linear and non-linear techniques for classification and regression;
  • elementary data clustering;
  • visualising and evaluating results obtained with machine learning techniques;
  • identifying and handling common pitfalls in machine learning;
  • using machine learning and data mining toolboxes.

Competences in

  • recognising and describing possible applications of machine learning and data analysis in their field of science;
  • comparing, appraising and selecting machine learning methods for specific tasks;
  • solving real-world data mining and pattern recognition problems by using machine learning techniques.

Lecture and exercise classes

See Absalon when the course is set up. A calculus brush-up, for example with "Calculus for Dummies", may help students who have not studied mathematics since high school.

Basic calculus and programming knowledge is required. We use Python as the programming language. Students who have experience with programming should be able to learn the necessary Python elements with little difficulty.

Academic qualifications equivalent to a BSc degree are recommended.

Written
Individual
Continuous feedback during the course of the semester
ECTS
7.5 ECTS
Type of assessment
Continuous assessment
Type of assessment details
Assessment of 4-5 assignments weighted equally. Passed assignments cannot be transferred to another block. Assignments are individual.
Aid
All aids allowed

The use of AI assistance powered by Large Language Models (LLM)/Large Multimodal Models (LMM) – such as ChatGPT and GPT-4 – is permitted for the written assignments, under conditions that will be specified during the course.

Marking scale
7-point grading scale
Censorship form
No external censorship
Several internal examiners.
Re-exam

A 20-minute oral exam, without preparation, on the course curriculum.

No aids allowed.

Criteria for exam assessment

See Learning Outcome.

Single subject courses (day)

  • Category: Hours
  • Lectures: 28
  • Preparation: 30
  • Theory exercises: 74
  • Practical exercises: 74
  • Total: 206

Course information

Language
English
Course number
NDAK16003U
ECTS
7.5 ECTS
Programme level
Full Degree Master
Duration

1 block

Placement
Block 3
Schedule group
A
Capacity
No limitation – unless you register in the late-registration period (BSc and MSc) or as a credit or single subject student.
Study board
Study Board of Mathematics and Computer Science
Contracting department
  • Department of Computer Science
Contracting faculty
  • Faculty of Science
Course Coordinator
  • Daniel Hershcovich
Saved on the 15-02-2024
