High-speed volume ray casting with CUDA
Volume ray casting has experienced renewed interest in the last decade, largely due to graphics hardware, which has enabled real-time implementations competitive in speed with slicing.
However, these implementations [3, 2, 4] need specialized shader languages and are forced to use graphics APIs. This makes advanced methods difficult to implement and hinders performance, because it bends the programming and execution model to a task it was not designed for.
In late 2006 a new generation of GPUs was introduced together with CUDA, a C-language API. CUDA exposes the hardware not as a streaming graphics processing pipeline but as a general-purpose, highly parallel co-processor. We aim to evaluate this increased flexibility against any performance loss or missing low-level access to the hardware. For our study we have chosen basic ray casting with regular sampling, front-to-back traversal, pre-integration and early ray termination.
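To make the chosen algorithm concrete, the following is a minimal host-side C++ sketch of front-to-back compositing with early ray termination for a single ray. It is an illustration under stated assumptions, not the implementation evaluated in this study: the names `Rgba` and `composite_front_to_back` and the threshold parameter are hypothetical, and the samples are assumed to be already classified into opacity-weighted colours (for example by a pre-integration table lookup, which is not shown).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sample type: colour is assumed to be opacity-weighted
// (premultiplied by alpha), as produced by an earlier classification step.
struct Rgba {
    float r, g, b, a;
};

// Composite regularly spaced samples along one ray, ordered front to back.
// Traversal stops as soon as the accumulated opacity reaches
// opacity_threshold (early ray termination). If samples_used is non-null,
// it receives the number of samples that were actually composited.
Rgba composite_front_to_back(const std::vector<Rgba>& samples,
                             float opacity_threshold,
                             std::size_t* samples_used)
{
    Rgba acc{0.0f, 0.0f, 0.0f, 0.0f};
    std::size_t used = 0;

    for (const Rgba& s : samples) {
        // Remaining transparency in front of this sample.
        const float w = 1.0f - acc.a;

        acc.r += w * s.r;
        acc.g += w * s.g;
        acc.b += w * s.b;
        acc.a += w * s.a;
        ++used;

        // Early ray termination: later samples can no longer contribute
        // visibly once the ray is (nearly) opaque.
        if (acc.a >= opacity_threshold) {
            break;
        }
    }

    if (samples_used != nullptr) {
        *samples_used = used;
    }
    return acc;
}
```

In a GPU version one would expect each thread to run such a loop for its own ray, with the sample values fetched from the volume rather than passed in as a vector. The threshold trades a small error in the final colour for fewer samples per ray.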