Target-Specific Refinement of Multigrid Codes

This paper applies partial evaluation to stage a stencil code Domain-Specific Language (DSL) onto a functional and imperative programming language. Platform-specific primitives such as scheduling or vectorization, and algorithmic variants such as boundary handling, are factored out into a library that makes up the elements of that DSL. We show how partial evaluation can eliminate all overhead of this separation of concerns and create code that resembles hand-crafted versions for a particular target platform. We evaluate our technique by implementing a DSL for the V-cycle multigrid iteration. From the same high-level DSL program, our approach generates code for AMD and NVIDIA GPUs (via SPIR and NVVM) as well as for CPUs using AVX/AVX2. First results show a speedup of up to 3X on the CPU from vectorizing multigrid components and a speedup of up to 2X on the GPU from merging the computation of multigrid components.
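
To illustrate the idea of specializing a generic stencil away, the following is a minimal Python sketch and not the paper's DSL or compiler. It assumes a 1D stencil with an odd number of weights; the names `clamp`, `mirror`, `apply_generic`, and `specialize` are hypothetical and chosen only for this example. The generic version interprets the weights and calls the boundary handler on every access, while the specialized version emits a residual function in which the weight loop is unrolled and the boundary handler is called only in the border region.

```python
def clamp(i, n):
    """Boundary handler (assumed variant): clamp an index to the valid range."""
    return min(max(i, 0), n - 1)


def mirror(i, n):
    """Boundary handler (assumed variant): reflect an index back into the array."""
    if i < 0:
        return -i - 1
    if i >= n:
        return 2 * n - i - 1
    return i


def apply_generic(weights, boundary, data):
    """Generic stencil: loops over the weights and calls `boundary` on every access."""
    n = len(data)
    r = len(weights) // 2
    out = [0.0] * n
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(weights):
            acc += w * data[boundary(i + k - r, n)]
        out[i] = acc
    return out


def specialize(weights, boundary):
    """Emit and compile a residual function for fixed weights and boundary handler.

    The weights are known at specialization time, so the inner loop is
    unrolled and zero weights are dropped. Interior points index `data`
    directly; only the border points still call `boundary`.
    """
    r = len(weights) // 2
    live = [(k - r, w) for k, w in enumerate(weights) if w != 0]
    interior = " + ".join(f"{w!r} * data[i{off:+d}]" for off, w in live) or "0.0"
    border = " + ".join(
        f"{w!r} * data[boundary(i{off:+d}, n)]" for off, w in live
    ) or "0.0"
    src = (
        "def residual(data):\n"
        "    n = len(data)\n"
        "    out = [0.0] * n\n"
        f"    for i in range({r}, n - {r}):\n"
        f"        out[i] = {interior}\n"
        f"    for i in [*range(0, min({r}, n)), *range(max({r}, n - {r}), n)]:\n"
        f"        out[i] = {border}\n"
        "    return out\n"
    )
    namespace = {"boundary": boundary}
    exec(src, namespace)  # compile the generated source into a callable
    fn = namespace["residual"]
    fn.source = src  # keep the generated code around for inspection
    return fn


if __name__ == "__main__":
    smooth = specialize([0.25, 0.5, 0.25], clamp)
    print(smooth.source)
    print(smooth([1.0, 2.0, 4.0, 8.0]))
```

Printing `smooth.source` shows that the interior loop of the residual function contains neither the loop over weights nor a call to the boundary handler. A real partial evaluator performs this kind of specialization on the program itself instead of on generated source strings.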

BibTeX
@inproceedings{membarth2014refinement,
  author       = {Membarth, Richard and Slusallek, Philipp and Köster, Marcel and Leißa, Roland and Hack, Sebastian},
  address      = {New Orleans, LA, USA},
  booktitle    = {Proceedings of the 4th International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing (WOLFHPC)},
  title        = {{Target-Specific Refinement of Multigrid Codes}},
  pages        = {52--57},
  year         = 2014,
  month        = nov,
  date         = {2014-11-17},
  doi          = {10.1109/WOLFHPC.2014.5},
  organization = {IEEE}
}