Programming Massively Parallel Hardware (PMPH)
Course content
In simple words, the aim of the course is to teach students how
to write programs that run fast on highly parallel hardware, such
as general-purpose graphics processing units (GPGPUs), which are
now mainstream. Such architectures are, however, capricious:
unlocking their power requires an understanding of their design
principles as well as specialised knowledge of code transformations,
for example those aimed at optimising locality of reference and the
degree of parallelism. The course is therefore organised
into three tracks: hardware, software, and lab.
The Software Track teaches how to think parallel. We introduce the
map-reduce functional programming model, in which programs are built
naturally, like puzzles, from a nested composition of
implicitly parallel array operators that are rooted in the
mathematical structure of list homomorphisms. We reason about the
asymptotic (work and depth) properties of such programs and discuss
the flattening transformation, which converts (all)
arbitrarily nested parallelism to a more restricted form that can
be mapped directly to the hardware. We then turn our attention
to legacy sequential code written in programming languages such as
C. In this context we study dependence analysis as a tool for
reasoning about loop-based optimisations (e.g., is it safe to
execute a given loop in parallel, or to interchange two loops?). As
time permits, we may cover more advanced topics, for example
topics related to dynamic analysis for optimising locality of
reference.
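The kind of question that dependence analysis answers can be illustrated with a minimal sketch in C. It is not taken from the course material; all function names and array sizes (`scale`, `running_sum`, `fill_ij`, `fill_ji`, `ROWS`, `COLS`) are hypothetical and chosen only for illustration. The first loop has no loop-carried dependence and could run in parallel, the second carries a dependence from each iteration to the next, and the loop nest shows a case where interchanging the two loops is legal.

```c
#include <assert.h>
#include <string.h>

#define ROWS 4 /* hypothetical sizes, for illustration only */
#define COLS 5

/* Loop A: iteration i writes only b[i] and reads only a[i]. No iteration
   touches data written by another iteration, so there is no loop-carried
   dependence and the iterations could be executed in parallel. */
void scale(const int *a, int *b, int n) {
    for (int i = 0; i < n; i++)
        b[i] = 2 * a[i];
}

/* Loop B: iteration i reads a[i-1], which iteration i-1 has just written.
   This is a loop-carried true (read-after-write) dependence of distance 1,
   so the loop is not safe to run in parallel as written. */
void running_sum(int *a, int n) {
    for (int i = 1; i < n; i++)
        a[i] += a[i - 1];
}

/* Loop nest C, original order (i outer, j inner). Element x[i][j] depends
   on x[i-1][j]: distance vector (1, 0) in (i, j) order. */
void fill_ij(int x[ROWS][COLS]) {
    for (int i = 1; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            x[i][j] = x[i - 1][j] + j;
}

/* Loop nest C after interchange (j outer, i inner). The distance vector
   becomes (0, 1), which is still lexicographically positive, so the
   interchange preserves the dependence. The outer j loop now carries no
   dependence and is the one that could be parallelised. */
void fill_ji(int x[ROWS][COLS]) {
    for (int j = 0; j < COLS; j++)
        for (int i = 1; i < ROWS; i++)
            x[i][j] = x[i - 1][j] + j;
}

/* Returns 1 if both loop orders compute the same array from the same input. */
int interchange_preserves_result(void) {
    int p[ROWS][COLS] = {{0}}, q[ROWS][COLS] = {{0}};
    for (int j = 0; j < COLS; j++)
        p[0][j] = q[0][j] = j + 1;
    fill_ij(p);
    fill_ji(q);
    return memcmp(p, q, sizeof p) == 0;
}

/* Small self-check that can be called from a main function. */
void check_examples(void) {
    int a[4] = {1, 2, 3, 4};
    running_sum(a, 4);
    assert(a[3] == 10);
    assert(interchange_preserves_result());
}
```

Calling `check_examples()` from a `main` function runs the checks. Passing this empirical comparison does not prove that a transformation is legal; the argument about the sign of the distance vector is what establishes it.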
The Hardware Track studies the design space of the critical components of parallel hardware: processor, memory hierarchy and interconnect networks. We will find out that modern hardware design is governed by old ideas, which are merely adjusted or combined in different ways.
The Lab Track applies the theory learned in the other tracks. We will review the fundamental ideas that govern GPGPU design and the potential performance bottlenecks. We will quickly learn several parallel-programming models, and we will get our hands dirty by putting into practice the optimisations learned in the software track. We will use Futhark (developed in-house) to write nested-parallel programs, to demonstrate flattening, and as a baseline. We will use OpenMP and CUDA to write "parallel-assembly" code for multi-core and GPGPU execution, respectively.
MSc Programme in Bioinformatics
MSc Programme in Computer Science
Learning Outcome
Knowledge of
- The types and semantics of data-parallel operators.
- Analyses for identifying and optimising parallelism and locality of reference, e.g., flattening, dependence analysis.
- The main hardware-design techniques for supporting parallelism at processor, memory hierarchy and interconnect levels.
Skills in
- Implementing parallel programs in high-level (Futhark) and lower-level programming models (OpenMP, CUDA).
- Applying (by hand) the flattening transformation on specific instances of data-parallel programs.
- Applying (by hand) various "imperative" code transformations (such as loop interchange, loop distribution, block and register tiling) for optimising the degree of parallelism and locality of reference.
- Testing, measuring the impact of applied optimisations and characterising the performance of parallel programs.
Competences in
- Reasoning about the work-depth asymptotic behaviour of specific instances of data-parallel programs.
- Reasoning based on dependence analysis about the (in)correctness of specific instances of loop parallelisation and related optimisations.
- Identifying an effective parallelisation solution for a given application.
Lecture, labs, in-class exercises, individual weekly assignments, group project.
The topics taught in the hardware track are selected from the
book "Parallel Computer Organization and Design" by
Michel Dubois, Murali Annavaram and Per
Stenström, Cambridge University Press, latest
edition.
Buying the hardware book is highly recommended.
Lecture notes covering the software-track material will be provided on Absalon. Other related material, such as scientific articles and tutorials (e.g., for Futhark and CUDA), will be linked from the course pages.
The course syllabus assumes knowledge of hardware architecture,
programming languages, compilers, data structures and algorithms,
and linear algebra and, most importantly, programming competence in
C/C++; basic knowledge of F# or Haskell is an advantage. For
example, at DIKU, these can be acquired through the corresponding
BSc courses (or through self-study).
Academic qualifications equivalent to a BSc degree are
recommended.
DISCLAIMER: The course ambitiously aims to cover a lot of
theoretical and practical ground in a relatively short amount of
time. The course is designed around the assumption that students
will attend the vast majority of lectures and labs and are not shy
about asking questions during them, i.e., "help will be provided at
DIKU to those who ask for it".
If the time schedule of this course conflicts with your work
schedule or with another course, we strongly recommend that you do
NOT take this course.
PhD students can register for an MSc course by following the same procedure as credit students.
- ECTS: 7.5 ECTS
- Type of assessment: Continuous assessment
- Type of assessment details: 3 to 4 individual assignments (40%), and a group project (report) with an individual presentation and a short oral examination (60%). The oral examination continues directly from the individual presentation and consists of questions related to the report and/or the course material (10 min presentation + up to 20 min oral examination). No aids are allowed for the oral examination.
- Aid: All aids allowed
- Marking scale: 7-point grading scale
- Censorship form: No external censorship. Several internal examiners.
- Re-exam: Resubmission of the assignments (35%) and of the project extended with additional tasks (40%), and a 30-minute oral examination (25%) without preparation. No aids are allowed for the oral examination. Assignments and reports that have already been passed will be considered.
Criteria for exam assessment
See Learning Outcome
Single subject courses (day)
- Category: Hours
- Lectures: 28
- Preparation: 15
- Exercises: 67
- Laboratory: 28
- Project work: 67
- Exam: 1
- Total: 206
Course information
- Language: English
- Course number: NDAK14008U
- ECTS: 7.5 ECTS
- Programme level: Full Degree Master
- Duration: 1 block
- Placement: Block 1
- Schedule group: C
- Capacity: No limitation, unless you register in the late-registration period (BSc and MSc) or as a credit or single subject student.
- Study board: Study Board of Mathematics and Computer Science
- Contracting department: Department of Computer Science
- Contracting faculty: Faculty of Science
- Course coordinator: Cosmin Eugen Oancea
- Teachers: Cosmin E. Oancea; possibly also Troels Henriksen