Dr.-Ing. Richard Membarth Senior Researcher
Now Professor for System on a Chip and AI for Edge Computing at the Technische Hochschule Ingolstadt.
Research Interests
GPU computing, parallel computing, domain-specific languages, compilers
Teaching
Projects

Professional Scientific Activities
-
High-Performance Graphics 2020
Papers Chair
High-Performance Graphics is the leading international forum for performance-oriented graphics systems research including innovative algorithms, efficient implementations, and hardware architecture.
-
High-Performance Graphics 2019
General Chair
High-Performance Graphics is the leading international forum for performance-oriented graphics systems research including innovative algorithms, efficient implementations, and hardware architecture.
-
High-Performance Graphics 2018, 2017
Publicity Chair
High-Performance Graphics is the leading international forum for performance-oriented graphics systems research including innovative algorithms, efficient implementations, and hardware architecture.
-
Performance Portability in Extreme Scale Computing: Metrics, Challenges, Solutions
Dagstuhl Seminar 17431
Performance Portability is a critical new challenge in extreme-scale computing. In essence, performance-portable applications can be efficiently executed on a wide variety of HPC architectures without significant manual modifications.
-
European Network on High Performance and Embedded Architecture and Compilation
HiPEAC Member
HiPEAC is a European network of almost 2,000 world-class computing systems researchers, industry representatives and students.
Program Committee Member

International Conference on Cluster Computing
CLUSTER
The IEEE Cluster Conference serves as a major international forum for presenting and sharing recent accomplishments and technological developments in the field of cluster computing as well as the use of cluster systems for scientific and commercial applications.
International Conference on Parallel Processing
ICPP
The organizers of ICPP 2020 aim to offer an open, inclusive environment for the exchange of ideas and to foster the creation of connections and research partnerships amongst participants from all backgrounds. Improper conduct and/or harassment will not be tolerated.
International Supercomputing Conference
ISC
The ISC Exhibition, consisting of over 160 exhibitors, caters to the hardware and software demands of global research centers and businesses in the fields of HPC, artificial intelligence, machine learning and data analytics.
International Workshop on OpenCL
IWOCL
The International Workshop on OpenCL (IWOCL) is an annual meeting of OpenCL users, researchers, developers and suppliers to share OpenCL best practise, and to promote the evolution and advancement of the OpenCL standard. The meeting is open to anyone who is interested in contributing to, and participating in, the OpenCL community.
International Workshop on Heterogeneous and Unconventional Cluster Architectures and Applications
HUCAA
The workshop on Heterogeneous and Unconventional Cluster Architectures and Applications aims to gather recent work on heterogeneous and unconventional cluster architectures and applications, which might have an impact on future mainstream cluster architectures.
High-Performance Graphics
HPG
High-Performance Graphics is the leading international forum for performance-oriented graphics systems research including innovative algorithms, efficient implementations, and hardware architecture.
Workshop on General Purpose Processing Using GPUs
GPGPU
The goal of this workshop is to provide a forum to discuss new and emerging general-purpose programming environments and platforms, as well as evaluate applications that have been able to harness the horsepower provided by these platforms.
Workshop on Architectures and Systems for Real-Time Mobile Vision Applications
ASR-MOV
The ASR-MOV workshop brings together system researchers to discuss how the requirements of real-time mobile vision applications impact on tools, architectures and systems.
Workshop on Heterogeneous Architectures and Design Methods for Embedded Image Systems
HIS
Heterogeneous Architectures and Design Methods for Embedded Image Systems (HIS 2015), co-located with the Conference on Design, Automation and Test in Europe.
Publications
AnySeq/GPU: A Novel Approach for Faster Sequence Alignment on GPUs
@inproceedings{mueller2022anyseqgpu, author = {Müller, André and Schmidt, Bertil and Membarth, Richard and Leißa, Roland and Hack, Sebastian}, address = {Virtual Event}, booktitle = {Proceedings of the 36th ACM International Conference on Supercomputing (ICS)}, title = {{AnySeq/GPU}: A Novel Approach for Faster Sequence Alignment on {GPUs}}, pages = {20:1--20:11}, year = 2022, month = jun, date = {2022-06-27/2022-06-30}, doi = {10.1145/3524059.3532376}, organization = {ACM} }
XEngine: Optimal Tensor Rematerialization for Neural Networks in Heterogeneous Environments
@article{schuler2022xengine, author = {Schuler, Manuela and Membarth, Richard and Slusallek, Philipp}, title = {{XEngine}: Optimal Tensor Rematerialization for Neural Networks in Heterogeneous Environments}, year = {2022}, issue_date = {March 2023}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, volume = {20}, number = {1}, issn = {1544-3566}, url = {https://doi.org/10.1145/3568956}, doi = {10.1145/3568956}, journal = {ACM Transactions on Architecture and Code Optimization (TACO)}, month = {dec}, articleno = {17}, numpages = {25}, keywords = {Rematerialization, heterogeneous computing, memory management, neural networks, integer linear programming} }
FLOWER: A Comprehensive Dataflow Compiler for High-Level Synthesis
@inproceedings{amiri2021flower,
  author = {Amiri, Puya and Pérard-Gayot, Arsène and Membarth, Richard and Slusallek, Philipp and Leißa, Roland and Hack, Sebastian},
  address = {Auckland, New Zealand},
  booktitle = {Proceedings of the 2021 International Conference on Field-Programmable Technology (FPT)},
  title = {{FLOWER}: A Comprehensive Dataflow Compiler for High-Level Synthesis},
  pages = {1--9},
  %year = 2021,
  %month = dec,
  date = {2021-12-06/2021-12-10},
  doi = {10.1109/ICFPT52863.2021.9609930},
  organization = {IEEE}
}
tinyMD: Mapping Molecular Dynamics Simulations to Heterogeneous Hardware using Partial Evaluation
@article{ravedutti2021tinymd, author = {Ravedutti Lucio Machado, Rafael and Schmitt, Jonas and Eibl, Sebastian and Eitzinger, Jan and Leißa, Roland and Hack, Sebastian and Pérard-Gayot, Arsène and Membarth, Richard and Köstler, Harald}, title = {{tinyMD}: Mapping Molecular Dynamics Simulations to Heterogeneous Hardware using Partial Evaluation}, journal = {Journal of Computational Science (JOCS)}, pages = {1--11}, volume = {54}, number = {101425}, year = 2021, month = jul, date = {2021-07-10}, doi = {10.1016/j.jocs.2021.101425}, publisher = {Elsevier} }
AnySeq: A High Performance Sequence Alignment Library based on Partial Evaluation
@inproceedings{mueller2020anyseq, author = {Müller, André and Schmidt, Bertil and Hildebrandt, Andreas and Membarth, Richard and Leißa, Roland and Kruse, Matthis and Hack, Sebastian}, address = {New Orleans, LA, USA}, booktitle = {Proceedings of the 34th IEEE International Parallel \& Distributed Processing Symposium (IPDPS)}, title = {{AnySeq}: A High Performance Sequence Alignment Library based on Partial Evaluation}, pages = {1030--1040}, year = 2020, month = may, date = {2020-05-18/2020-05-22}, doi = {10.1109/IPDPS47924.2020.00109}, organization = {IEEE} }
AnyHLS: High-Level Synthesis with Partial Evaluation
@article{oezkan2020anyhls, author = {Özkan, M. Akif and Pérard-Gayot, Arsène and Membarth, Richard and Slusallek, Philipp and Leißa, Roland and Hack, Sebastian and Teich, Jürgen and Hannig, Frank}, title = {{AnyHLS}: High-Level Synthesis with Partial Evaluation}, journal = {IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD) (Proceedings of CODES+ISSS 2020)}, pages = {3202--3214}, volume = {39}, number = {11}, year = 2020, month = sep, date = {2020-09-20/2020-09-25}, doi = {10.1109/TCAD.2020.3012172}, publisher = {IEEE} }
Efficient Mapping of Streaming Applications for Image Processing on Graphics Cards
@article{membarth2019efficientmapping,
  author = {Membarth, Richard and Dutta, Hritam and Hannig, Frank and Teich, Jürgen},
  title = {Efficient Mapping of Streaming Applications for Image Processing on Graphics Cards},
  journal = {Transactions on High-Performance Embedded Architectures and Compilers (Transactions on HiPEAC)},
  pages = {1--20},
  volume = {V},
  %year = 2019,
  %month = feb,
  date = {2019-02},
  doi = {10.1007/978-3-662-58834-5_1},
  publisher = {Springer}
}
Rodent: Generating Renderers without Writing a Generator
@article{perard2019rodent, author = {Pérard-Gayot, Arsène and Membarth, Richard and Leißa, Roland and Hack, Sebastian and Slusallek, Philipp}, title = {Rodent: Generating Renderers without Writing a Generator}, journal = {ACM Transactions on Graphics (TOG) (Proceedings of SIGGRAPH 2019)}, pages = {40:1--40:12}, volume = {38}, number = {4}, year = 2019, month = jul, date = {2019-07-28/2019-08-01}, doi = {10.1145/3306346.3322955}, publisher = {ACM} }
Parallel Multi-Hypothesis Algorithm for Criticality Estimation in Traffic and Collision Avoidance
@inproceedings{sanchezmorales2019parallelmultihypothesis, author = {{Sánchez Morales}, Eduardo and Membarth, Richard and Gaull, Andreas and Slusallek, Philipp and Dirndorfer, Tobias and Kammenhuber, Alexander and Lauer, Christoph and Botsch, Michael}, address = {Paris, France}, booktitle = {Proceedings of the 30th IEEE Intelligent Vehicles Symposium (IV)}, title = {Parallel Multi-Hypothesis Algorithm for Criticality Estimation in Traffic and Collision Avoidance}, pages = {2164--2171}, year = 2019, month = jun, date = {2019-06-09/2019-06-12}, doi = {10.1109/IVS.2019.8814015}, organization = {IEEE} }
AnyDSL: A Partial Evaluation Framework for Programming High-Performance Libraries
Proceedings of the ACM on Programming Languages (PACMPL), 2(OOPSLA): 119:1-119:30, 2018
@article{leissa2018anydsl,
  author = {Leißa, Roland and Boesche, Klaas and Hack, Sebastian and Pérard-Gayot, Arsène and Membarth, Richard and Slusallek, Philipp and Müller, André and Schmidt, Bertil},
  title = {{AnyDSL}: A Partial Evaluation Framework for Programming High-Performance Libraries},
  journal = {Proceedings of the ACM on Programming Languages (PACMPL)},
  pages = {119:1--119:30},
  volume = {2},
  number = {OOPSLA},
  %year = 2018,
  %month = nov,
  date = {2018-11-04/2018-11-09},
  note = {{HiPEAC 2018 Paper Award}},
  doi = {10.1145/3276489},
  publisher = {ACM}
}
A Journey into DSL Design using Generative Programming: FPGA Mapping of Image Border Handling through Refinement
@inproceedings{oezkan2018fpgaborderhandling, author = {Özkan, Mehmet Akif and Pérard-Gayot, Arsène and Membarth, Richard and Slusallek, Philipp and Teich, Jürgen and Hannig, Frank}, address = {Dublin, Ireland}, booktitle = {Proceedings of the Fifth International Workshop on FPGAs for Software Programmers (FSP)}, title = {{A Journey into DSL Design using Generative Programming: FPGA Mapping of Image Border Handling through Refinement}}, pages = {1--9}, year = 2018, month = aug, date = {2018-08-31}, organization = {VDE} }
A Data Layout Transformation for Vectorizing Compilers
@inproceedings{perard2018splitalloca, author = {Pérard-Gayot, Arsène and Membarth, Richard and Slusallek, Philipp and Moll, Simon and Leißa, Roland and Hack, Sebastian}, address = {Vösendorf / Vienna, Austria}, booktitle = {Proceedings of the 2018 Workshop on Programming Models for SIMD/Vector Processing (WPMVP)}, title = {{A Data Layout Transformation for Vectorizing Compilers}}, pages = {7:1--7:8}, year = 2018, month = feb, date = {2018-02-24}, doi = {10.1145/3178433.3178440}, organization = {ACM} }
Unified Code Generation for the Parallel Computation of Pairwise Interactions using Partial Evaluation
@inproceedings{schmitt2018unifiedmd, author = {Schmitt, Jonas and Köstler, Harald and Eitzinger, Jan and Membarth, Richard}, address = {Geneva, Switzerland}, booktitle = {Proceedings of the 17th International Symposium on Parallel and Distributed Computing (ISPDC)}, title = {{Unified Code Generation for the Parallel Computation of Pairwise Interactions using Partial Evaluation}}, pages = {17--24}, year = 2018, month = jun, date = {2018-06-25/2018-06-28}, doi = {10.1109/ISPDC2018.2018.00012}, organization = {IEEE} }
RaTrace: Simple and Efficient Abstractions for BVH Ray Traversal Algorithms
@inproceedings{perard2017ratrace, author = {Pérard-Gayot, Arsène and Weier, Martin and Membarth, Richard and Slusallek, Philipp and Leißa, Roland and Hack, Sebastian}, address = {Vancouver, BC, Canada}, booktitle = {Proceedings of the 16th International Conference on Generative Programming: Concepts \& Experiences (GPCE)}, title = {{RaTrace: Simple and Efficient Abstractions for BVH Ray Traversal Algorithms}}, pages = {157--168}, year = 2017, month = oct, date = {2017-10-23/2017-10-24}, doi = {10.1145/3136040.3136044}, organization = {ACM} }
The Next Generation of In-home Streaming: Light Fields, 5K, 10 GbE, and Foveated Compression
Proceedings of the 10th International Symposium on Multimedia Applications and Processing (MMAP), pp. 663-667, Prague, Czech Republic, September 3-6, 2017
@inproceedings{pohl2017nextgeneration, author = {Pohl, Daniel and Jungmann, Daniel and Taudul, Bartosz and Membarth, Richard and Hariharan, Harini and Herfet, Thorsten and Grau, Oliver}, address = {Prague, Czech Republic}, booktitle = {Proceedings of the 10th International Symposium on Multimedia Applications and Processing (MMAP)}, title = {{The Next Generation of In-home Streaming: Light Fields, 5K, 10 GbE, and Foveated Compression}}, pages = {663--667}, year = 2017, month = sep, date = {2017-09-03/2017-09-06}, note = {{Best Paper Award}}, doi = {10.15439/2017F16}, organization = {IEEE} }
Generating FPGA-based Image Processing Accelerators with Hipacc
Proceedings of the International Conference On Computer Aided Design (ICCAD), pp. 1026-1033, Irvine, CA, USA, November 13-16, 2017
@inproceedings{reiche2017hipaccfpga, author = {Reiche, Oliver and Özkan, Mehmet Akif and Membarth, Richard and Teich, Jürgen and Hannig, Frank}, address = {Irvine, CA, USA}, booktitle = {Proceedings of the International Conference On Computer Aided Design (ICCAD)}, title = {{Generating FPGA-based Image Processing Accelerators with Hipacc}}, pages = {1026--1033}, year = 2017, month = nov, date = {2017-11-13/2017-11-16}, note = {{Invited Paper}}, doi = {10.1109/ICCAD.2017.8203894}, organization = {IEEE} }
Hipacc: A Domain-Specific Language and Compiler for Image Processing
@article{membarth2016hipacc, author = {Membarth, Richard and Reiche, Oliver and Hannig, Frank and Teich, Jürgen and Körner, Mario and Eckert, Wieland}, title = {{Hipacc: A Domain-Specific Language and Compiler for Image Processing}}, journal = {Transactions on Parallel and Distributed Systems (TPDS)}, pages = {210--224}, volume = {27}, number = {1}, year = 2016, month = jan, date = {2016-01-01}, doi = {10.1109/TPDS.2015.2394802}, publisher = {IEEE}, }
Shallow Embedding of DSLs via Online Partial Evaluation
Proceedings of the 14th International Conference on Generative Programming: Concepts & Experiences (GPCE), pp. 11-20, Pittsburgh, PA, USA, October 26-27, 2015
@inproceedings{leissa2015shallow, author = {Leißa, Roland and Boesche, Klaas and Hack, Sebastian and Membarth, Richard and Slusallek, Philipp}, address = {Pittsburgh, PA, USA}, booktitle = {Proceedings of the 14th International Conference on Generative Programming: Concepts \& Experiences (GPCE)}, title = {{Shallow Embedding of DSLs via Online Partial Evaluation}}, pages = {11--20}, year = 2015, month = oct, note = {{Best Paper Award}}, date = {2015-10-26/2015-10-27}, doi = {10.1145/2814204.2814208}, organization = {ACM} }
Advanced In-home Streaming to Mobile Devices and Wearables
@article{pohl2015inhomestreaming, author = {Pohl, Daniel and Taudul, Bartosz and Membarth, Richard and Nickels, Stefan and Grau, Oliver}, title = {{Advanced In-home Streaming to Mobile Devices and Wearables}}, journal = {International Journal of Computer Science \& Applications (IJCSA)}, pages = {20--36}, volume = {12}, number = {2}, year = 2015, month = aug, date = {2015-08}, issn = {0972-9038}, publisher = {Technomathematics Research Foundation}, }
Specialization through Dynamic Staging
@inproceedings{danilewski2014specialization, author = {Danilewski, Piotr and Köster, Marcel and Leißa, Roland and Membarth, Richard and Slusallek, Philipp}, address = {Västerås, Sweden}, booktitle = {Proceedings of the 13th International Conference on Generative Programming: Concepts \& Experiences (GPCE)}, title = {{Specialization through Dynamic Staging}}, pages = {103--112}, year = 2014, month = sep, date = {2014-09-15/2014-09-16}, }
Platform-Specific Optimization and Mapping of Stencil Codes through Refinement
@inproceedings{koester2014platformhistencils, author = {Köster, Marcel and Leißa, Roland and Hack, Sebastian and Membarth, Richard and Slusallek, Philipp}, address = {Vienna, Austria}, booktitle = {Proceedings of the 1st International Workshop on High-Performance Stencil Computations (HiStencils)}, title = {{Platform-Specific Optimization and Mapping of Stencil Codes through Refinement}}, pages = {1--6}, date = {2014-01-21}, }
Code Refinement of Stencil Codes
@article{koester2014platformppl, author = {Köster, Marcel and Leißa, Roland and Hack, Sebastian and Membarth, Richard and Slusallek, Philipp}, title = {{Code Refinement of Stencil Codes}}, journal = {Parallel Processing Letters (PPL)}, pages = {1--16}, volume = {24}, number = {3}, year = 2014, month = sep, date = {2014-09}, doi = {10.1142/S0129626414410035}, publisher = {World Scientific} }
Code Generation for Embedded Heterogeneous Architectures on Android
@inproceedings{membarth2014android, author = {Membarth, Richard and Reiche, Oliver and Hannig, Frank and Teich, Jürgen}, address = {Dresden, Germany}, booktitle = {Proceedings of the Conference on Design, Automation and Test in Europe (DATE)}, title = {{Code Generation for Embedded Heterogeneous Architectures on Android}}, pages = {86:1--86:6}, year = 2014, month = mar, date = {2014-03-24/2014-03-28}, doi = {10.7873/DATE.2014.099}, organization = {IEEE} }
High-Performance Domain-Specific Languages for GPU Computing
Towards a Performance-portable Description of Geometric Multigrid Algorithms using a Domain-specific Language
@article{membarth2014towards, author = {Membarth, Richard and Reiche, Oliver and Schmitt, Christian and Hannig, Frank and Teich, Jürgen and Stürmer, Markus and Köstler, Harald}, title = {{Towards a Performance-portable Description of Geometric Multigrid Algorithms using a Domain-specific Language}}, journal = {Journal of Parallel and Distributed Computing (JPDC)}, pages = {3191--3201}, volume = {74}, number = {12}, year = 2014, month = dec, date = {2014-12}, doi = {10.1016/j.jpdc.2014.08.008}, publisher = {Elsevier} }
Target-Specific Refinement of Multigrid Codes
@inproceedings{membarth2014refinement, author = {Membarth, Richard and Slusallek, Philipp and Köster, Marcel and Leißa, Roland and Hack, Sebastian}, address = {New Orleans, LA, USA}, booktitle = {Proceedings of the 4th International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing (WOLFHPC)}, title = {{Target-Specific Refinement of Multigrid Codes}}, pages = {52--57}, year = 2014, month = nov, date = {2014-11-17}, doi = {10.1109/WOLFHPC.2014.5}, organization = {IEEE} }
Code Generation from a Domain-specific Language for C-based HLS of Hardware Accelerators
@inproceedings{reiche2014hls, author = {Reiche, Oliver and Schmid, Moritz and Hannig, Frank and Membarth, Richard and Teich, Jürgen}, booktitle = {Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)}, venue = {New Delhi, India}, title = {{Code Generation from a Domain-specific Language for C-based HLS of Hardware Accelerators}}, pages = {17:1--17:10}, articleno = {17}, numpages = {10}, year = 2014, month = oct, date = {2014-10-12/2014-10-17}, doi = {10.1145/2656075.2656081}, organization = {ACM} }