Automatic Mapping and Optimization to Kokkos with Polyhedral Compilation

In the post-Moore’s Law era, the quest for exascale computing has resulted in diverse hardware architecture trends, including novel custom and/or specialized processors to accelerate the systems, asynchronous or self-timed computing cores, and near-memory computing architectures. To contend with such heterogeneous and complex hardware targets, there have been advanced software solutions in the form of new programming models and runtimes. However, using these advanced programming models poses productivity and performance portability challenges. This work takes a significant step towards addressing the performance, productivity, and performance portability challenges faced by the high-performance computing and exascale community. We present an automatic mapping and optimization framework that takes sequential code and automatically generates high-performance parallel code in Kokkos, a performance portable parallel programming model targeted for exascale computing. We demonstrate the productivity and performance benefits of optimized mapping to Kokkos using kernels from a critical application project on climate modeling, the Energy Exascale Earth System Model (E3SM) project. This work thus shows that automatic generation of Kokkos code enhances the productivity of application developers and enables them to fully utilize the benefits of a programming model such as Kokkos.