Code Generation for Embedded Heterogeneous Architectures on Android

The success of Android is based on its unified Java programming model, which allows developers to write platform-independent programs for a variety of target platforms. However, this portability comes at the cost of performance. As a consequence, Google introduced APIs that allow developers to write native applications and to exploit multiple cores as well as embedded GPUs for compute-intensive parts. This paper proposes code generation techniques that target the Renderscript and Filterscript APIs. Renderscript harnesses multi-core CPUs and unified-shader GPUs, while the more restricted Filterscript also supports GPUs with earlier shader models. Our techniques focus on image processing applications and make it possible to target these APIs and OpenCL from a common description. We further avoid memory transfers by sharing the same memory region among different processing elements on HSA platforms. As a reference, we use an embedded platform hosting a multi-core ARM CPU and an ARM Mali GPU. We show that our generated source code is faster than native implementations in OpenCV as well as the pre-implemented script intrinsics provided by Google for acceleration on the embedded GPU.

  author       = {Membarth, Richard and Reiche, Oliver and Hannig, Frank and Teich, Jürgen},
  address      = {Dresden, Germany},
  booktitle    = {Proceedings of the Conference on Design, Automation and Test in Europe (DATE)},
  title        = {{Code Generation for Embedded Heterogeneous Architectures on Android}},
  pages        = {86:1--86:6},
  year         = 2014,
  month        = mar,
  date         = {2014-03-24/2014-03-28},
  doi          = {10.7873/DATE.2014.099},
  organization = {IEEE}