AnySeq: A High Performance Sequence Alignment Library based on Partial Evaluation

Sequence alignments are fundamental to bioinformatics, which has resulted in a variety of optimized implementations. Unfortunately, the vast majority of them are hand-tuned and specific to certain architectures and execution models. This makes them not only challenging to understand and extend, but also difficult to port to other platforms. We present AnySeq, a novel library for computing different types of pairwise alignments of DNA sequences. Our approach combines high performance with an intuitively understandable implementation, which is achieved through the concept of partial evaluation. Using the AnyDSL compiler framework, AnySeq enables the compilation of algorithmic variants that are highly optimized for specific usage scenarios and hardware targets from a single, uniform codebase. The resulting domain-specific library thus allows alignment parameters (such as alignment type, scoring scheme, and traceback vs. plain score) to be varied by simple function composition rather than by metaprogramming techniques, which are often hard to understand. Our implementation supports multithreading and SIMD vectorization on CPUs, CUDA-enabled GPUs, and FPGAs. AnySeq is at most 7% slower and in many cases faster (up to 12%) than state-of-the-art manually optimized alignment libraries on CPUs (SeqAn) and on GPUs (NVBio).
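
To illustrate what varying alignment parameters by function composition can look like, the following is a minimal Python sketch. It is not AnySeq's actual API, and it is not written in AnyDSL. The names `simple_scoring` and `make_aligner` are hypothetical, the gap penalty is assumed to be linear, and only the plain score is computed (no traceback). Python closures do not perform partial evaluation; the sketch only shows how a scoring scheme and an alignment type might be composed into a specialized aligner.

```python
from typing import Callable

# Hypothetical type alias: a substitution function scoring one pair of characters.
Score = Callable[[str, str], int]


def simple_scoring(match: int, mismatch: int) -> Score:
    """Return a substitution function for a fixed match/mismatch scheme."""
    return lambda a, b: match if a == b else mismatch


def make_aligner(subst: Score, gap: int, local: bool) -> Callable[[str, str], int]:
    """Compose a score-only aligner from a scoring scheme and an alignment type.

    Assumption: linear gap penalty `gap` (a negative number).
    local=False gives a global (Needleman-Wunsch style) score,
    local=True gives a local (Smith-Waterman style) score.
    """
    def align(q: str, s: str) -> int:
        # First DP row: global alignment pays gap penalties along the border.
        prev = [0 if local else j * gap for j in range(len(s) + 1)]
        best = 0
        for i, a in enumerate(q, start=1):
            curr = [0 if local else i * gap]
            for j, b in enumerate(s, start=1):
                cell = max(
                    prev[j - 1] + subst(a, b),  # align a with b
                    prev[j] + gap,              # gap in s
                    curr[j - 1] + gap,          # gap in q
                )
                if local:
                    cell = max(cell, 0)         # local alignments may restart
                    best = max(best, cell)
                curr.append(cell)
            prev = curr
        return best if local else prev[-1]

    return align


# Two variants built from the same code by choosing different parameters.
global_align = make_aligner(simple_scoring(2, -1), gap=-1, local=False)
local_align = make_aligner(simple_scoring(2, -1), gap=-1, local=True)
```

In this sketch the choice between global and local alignment is an ordinary argument that is still checked at run time. The abstract describes a compiler-based approach in which such choices are resolved during compilation to produce specialized variants.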

  author          = {Müller, André and Schmidt, Bertil and Hildebrandt, Andreas and Membarth, Richard and Leißa, Roland and Kruse, Matthis and Hack, Sebastian},
  address         = {New Orleans, LA, USA},
  booktitle       = {Proceedings of the 34th IEEE International Parallel \& Distributed Processing Symposium (IPDPS)},
  title           = {{AnySeq}: A High Performance Sequence Alignment Library based on Partial Evaluation},
  pages           = {1030--1040},
  year            = 2020,
  month           = may,
  date            = {2020-05-18/2020-05-22},
  doi             = {10.1109/IPDPS47924.2020.00109},
  organization    = {IEEE}