Graphics Processing Units (GPUs) recently became general enough to enable implementation of a variety of light transport algorithms. However, the efficiency of these GPU implementations has received relatively little attention in the research literature and no systematic study on the topic exists to date. The goal of our work is to fill this gap. Our main contribution is a comprehensive and in-depth investigation of the efficiency of the GPU implementation of a number of classic as well as more recent progressive light transport simulation algorithms. We present several improvements over the state-of-the-art. In particular, our Light Vertex Cache, a new approach to mapping connections of sub-path vertices in Bidirectional Path Tracing on the GPU, outperforms the existing implementations by 30-60%. We also describe a first GPU implementation of the recently introduced Vertex Connection and Merging algorithm [Georgiev et al. 2012], showing that even relatively complex light transport algorithms can be efficiently mapped on the GPU. With the implementation of many of the state-of-the-art algorithms within a single system at our disposal, we present a unique direct comparison and analysis of their relative performance.