This course provides an introduction to CUDA and to programming parallel hardware architectures such as today's GPUs. We will show how to program with CUDA and which problems can be solved efficiently on modern GPUs. The algorithms discussed are not necessarily related to computer graphics. The course is accompanied by practical exercises, and students must complete a small project to pass.
The format of the course will change midway through the term: two-hour lectures and one-hour tutorials will be replaced by practical work on larger projects. The course focuses entirely on parallel programming on modern GPUs. CUDA will be used for all practical assignments, which will include common parallel primitives such as parallel prefix sum, parallel reduction, and parallel sorting algorithms (e.g. radix sort). In addition to the training material available from NVIDIA and other sources, we will also draw on recent scientific papers for up-to-date results and programming methods.
Register for the course via Microsoft Teams.
- Programming experience with C++
|Date|Topic|Assignment|
|---|---|---|
|2020-11-03|Introduction, Evolution of GPU Programming| |
|2020-11-10|The CUDA Programming Model| |
|2020-11-12|Assignment 1, Q&A| |
|2020-11-17|The CUDA API| |
|2020-11-24|Parallel Programming Patterns|A 1|
|2020-11-26|Development Environment, Q&A| |
|2020-12-01|Memory Hierarchy|A 2|
|2020-12-03|Assignment 2, Toolchain, Q&A| |
|2020-12-08|Performance Optimization Case Study|A 3|
|2020-12-10|Assignment 3, Debugging, Q&A| |
|2020-12-15|Hardware Scheduling|A 4|
|2020-12-17|Assignment 4, Q&A| |
|2021-01-12|Advanced Programming Techniques| |
|2021-01-19|Related Programming Models| |
|2021-01-26|AnyDSL Compiler Framework| |
|2021-02-04|Exam Preparation, Q&A| |
The assignments will be posted in Microsoft Teams.
Projects are expected to compile and run out of the box on the machines in the CIP pool (the students' lab), so that the tutors can be sure the code runs on machines that both they and the students have access to.
The written exam will take place on 2021-02-16 in GHH.
10% Performance Competition
The Performance Competition is a final showcase of how the GPU knowledge acquired throughout the course can be used to accelerate the implementation of the last assignment. The top-performing submissions will be awarded bonus points.
The course does not follow a particular book, but suggested readings include:
- David Kirk and Wen-Mei Hwu, Programming Massively Parallel Processors: A Hands-on Approach, 3rd Edition, Morgan Kaufmann, 2016
- Jason Sanders and Edward Kandrot, CUDA by Example: An Introduction to General-Purpose GPU Programming, 1st Edition, Addison-Wesley, 2011
- Duane Storti and Mete Yurtoglu, CUDA for Engineers: An Introduction to High-Performance Parallel Computing, 1st Edition, Addison-Wesley, 2016
- John Cheng, Max Grossman, and Ty McKercher, Professional CUDA C Programming, 1st Edition, Wrox, 2014
- Nicholas Wilt, The CUDA Handbook, 1st Edition, Addison-Wesley, 2013
- Shane Cook, CUDA Programming, 1st Edition, Morgan Kaufmann, 2012
- CUDA C Programming Guide, available online