RaTrace: Simple and Efficient Abstractions for BVH Ray Traversal Algorithms
Proceedings of the 16th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE), pages 1-12.
Keywords: Computer Graphics, Ray Tracing, Functional Programming, Domain-Specific Languages
Abstract: In order to achieve the highest possible performance, the ray traversal and intersection routines at the core of every high-performance ray tracer are usually hand-coded, heavily optimized, and implemented separately for each hardware platform—even though they share most of their algorithmic core. The results are implementations that heavily mix algorithmic aspects with hardware and implementation details, making the code non-portable and difficult to change and maintain. In this paper, we present a new approach that offers the ability to define in a functional language a set of conceptual, high-level language abstractions that are optimized away by a special compiler in order to maximize performance. Using this abstraction mechanism we separate a generic ray traversal and intersection algorithm from its low-level aspects that are specific to the target hardware. We demonstrate that our code is not only significantly more flexible, simpler to write, and more concise but also that the compiled results perform as well as state-of-the-art implementations on any of the tested CPU and GPU platforms.
Foveated Real-Time Ray Tracing for Head-Mounted Displays
Computer Graphics Forum
Abstract: Head-mounted displays with dense pixel arrays used for virtual reality applications require high frame rates and low latency rendering. This forms a challenging use case for any rendering approach. In addition to its ability to generate realistic images, ray tracing offers a number of distinct advantages, but has been held back mainly by its performance. In this paper, we present an approach that significantly improves image generation performance of ray tracing. This is done by combining foveated rendering based on eye tracking with reprojection rendering using previous frames in order to drastically reduce the number of new image samples per frame. To reproject samples a coarse geometry is reconstructed from a G-Buffer. Possible errors introduced by this reprojection as well as parts that are critical to the perception are scheduled for resampling. Additionally, a coarse color buffer is used to provide an initial image, refined smoothly by more samples where needed. Evaluations and user tests show that our method achieves real-time frame rates, while visual differences compared to fully rendered images are hardly perceivable. As a result, we can ray trace non-trivial static scenes for the Oculus DK2 HMD at 1182 × 1464 per eye within the VSync limits without perceived visual differences.
Shallow Embedding of DSLs via Online Partial Evaluation
Proceedings of the 14th International Conference on Generative Programming: Concepts & Experiences (GPCE), pages 11-20.
Keywords: shallow embedding; partial evaluation; domain-specific language
Abstract: This paper investigates shallow embedding of DSLs by means of online partial evaluation. To this end, we present a novel online partial evaluator for continuation-passing style languages. We argue that it has, in contrast to prior work, a predictable termination policy that works well in practice. We present our approach formally using a continuation-passing variant of PCF and prove its termination properties. We evaluate our technique experimentally in the field of visual and high-performance computing and show that our evaluator produces highly specialized and efficient code for CPUs as well as GPUs that matches the performance of hand-tuned expert code.
Annotation: Best Paper Award
Ettention: building blocks for iterative reconstruction algorithms
Proceedings of Microscopy & Microanalysis 2015
Abstract: We present a novel software package for tomographic reconstruction in electron microscopy, named Ettention. The software consists of a set of modular building blocks for iterative reconstruction algorithms. Ettention simultaneously features (1) a modular, object-oriented software design, (2) optimized access to high-performance computing (HPC) platforms such as graphics processing units (GPUs) or many-core architectures like Xeon Phi, and (3) accessibility to microscopy end-users via integration in the IMOD package and user interface. We provide developers with a clean application programming interface (API) that allows for extending the software easily and thus makes it an ideal platform for algorithmic research while hiding most of the technical details of high-performance computing. Several case studies are provided to demonstrate the feasibility of the concept.
Reconstruction Strategies for Combined Tilt- and Focal Series Scanning Transmission Electron Microscopy
Proceedings of Microscopy & Microanalysis 2015
Abstract: The STEM transform was thus formulated as a mathematical model applicable to STEM imaging with a convergent electron beam. It was shown that it is (1) a linear convolution, (2) a generalization of the Ray transform that contains the latter as the special case where the beam convergence semi-angle α→0, and (3) self-adjoint, a result that facilitated a new iterative reconstruction algorithm for TFS based on a matched backprojection, which drastically improved the convergence rate, resulting in 60 times fewer iterations compared to previous methods. It also resolved theoretical concerns about the convergence of the method, which was not guaranteed in the case of an unmatched projection/backprojection pair. This brings the combined tilt- and focal series one more step towards broad applicability by allowing the reconstruction of high-resolution tomograms in feasible computation time.
PSRT: Progressive Stochastic Reconstruction Technique for Cryo Electron Tomography
Proceedings of Microscopy & Microanalysis 2015
On a novel approach to 3D reconstruction in Cryo Electron Tomography: Progressive Stochastic Reconstruction Technique (PSRT)
14th French Microscopy Congress, Nice, France, 30 June - 3 July 2015
Abstract: Cryo Electron Tomography (cryoET) plays an essential role in Structural Biology, as it is the only technique that allows us to study the structure of macromolecular complexes in their close-to-native environment in situ. The reconstruction process faces many challenges as the input projections suffer from a very low signal-to-noise ratio and a limited tilt angle. Moreover, the scanned specimen is larger than the detector, which introduces the interior problem into the reconstruction process. High-resolution protocols such as Subtomogram Averaging (SA) can alleviate some of these limitations; however, in order to be fully automatic they require reconstructions of high quality. Current state-of-the-art methods, such as Weighted Back Projection (WBP) or Simultaneous Iterative Reconstruction Technique (SIRT), deliver reconstructions that often require manual intervention during SA. We present a novel iterative approach to the tomographic reconstruction problem called Progressive Stochastic Reconstruction Technique (PSRT). The method is based on Monte Carlo random walks guided by a sampling strategy similar to the Metropolis-Hastings strategy. PSRT is designed to suit the specific conditions in cryoET: it delivers high-contrast reconstructions without any loss of high-resolution structural information and it implements a memory-efficient solution to the interior problem. Finally, it can be easily incorporated into a typical SA pipeline, where it significantly improves template-based localization and provides an elegant solution to the region-of-interest reconstruction.
Plugin free remote visualization in the browser
Volume 9397 of Proc. SPIE
Keywords: Visualization, Video, Computing systems, Internet, Mobile devices, Personal digital assistants, Tablets
Abstract: Today, users access information and rich media from anywhere using the web browser on their desktop computers, tablets or smartphones. But the web evolves beyond media delivery. Interactive graphics applications like visualization or gaming become feasible as browsers advance in the functionality they provide. However, to deliver large-scale visualization to thin clients like mobile devices, a dedicated server component is necessary. Ideally, the client runs directly within the browser the user is accustomed to, requiring no installation of a plugin or native application. In this paper, we present the state of the art of technologies which enable plugin-free remote rendering in the browser. Further, we describe a remote visualization system unifying these technologies. The system transfers rendering results to the client as images or as a video stream. We utilize the upcoming Web Real-Time Communication (WebRTC) standard of the World Wide Web Consortium (W3C), and the Native Client (NaCl) technology built into Chrome, to deliver video with low latency.
Target-Specific Refinement of Multigrid Codes
Proceedings of the 4th International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing (WOLFHPC), pages 52-57.
Keywords: multigrid codes; partial evaluation; domain-specific language
Abstract: This paper applies partial evaluation to stage a stencil code Domain-Specific Language (DSL) onto a functional and imperative programming language. Platform-specific primitives such as scheduling or vectorization, and algorithmic variants such as boundary handling are factored out into a library that makes up the elements of that DSL. We show how partial evaluation can eliminate all overhead of this separation of concerns and create code that resembles hand-crafted versions for a particular target platform. We evaluate our technique by implementing a DSL for the V-cycle multigrid iteration. Our approach generates code for AMD and NVIDIA GPUs (via SPIR and NVVM) as well as for CPUs using AVX/AVX2 from the same high-level DSL program. First results show that we achieve a speedup of up to 3× on the CPU by vectorizing multigrid components and a speedup of up to 2× on the GPU by merging the computation of multigrid components.
Specialization through Dynamic Staging
Proceedings of the 13th International Conference on Generative Programming: Concepts & Experiences (GPCE), pages 103-112.
Keywords: dynamic staging; partial evaluation; code specialization
Abstract: Partial evaluation allows for specialization of program fragments. This can be realized by staging, where one fragment is executed earlier than its surrounding code. However, taking advantage of these capabilities is often a cumbersome endeavor. In this paper, we present a new metaprogramming concept using staging parameters that are first-class citizen entities and define the order of execution of the program. Staging parameters can be used to define MetaML-like quotations, but can also allow stages to be created and resolved dynamically. The programmer can write generic, polyvariant code which can be reused in the context of different stages. We demonstrate how our approach can be used to define and apply domain-specific optimizations. Our implementation of the proposed metaprogramming concept generates code which is on a par with templated C++ code in terms of execution time.
Combined tilt- and focal series scanning transmission electron microscopy: TFS 3D STEM
Proceedings of 18th International Microscopy Congress
TFS: Combined Tilt- and Focal Series for Scanning Transmission Electron Microscopy
Proceedings of Microscopy & Microanalysis 2014