Publication Source: 2019 International Conference on High Performance Computing & Simulation (HPCS), Dublin, Ireland
Compiler optimizations based on the polyhedral
model are able to automatically parallelize and optimize loop-based
code. We acknowledge that while polyhedral techniques
can represent a broad set of program transformations, important
classes of programs could be parallelized just as well using less
general but more tractable techniques. We apply this general idea
to the polyhedral scheduling phase, which is one of the typical
performance bottlenecks of polyhedral compilation.
We focus on a class of programs in which enough parallelism is
already exposed in the source program, and which includes Deep
Learning layers and combinations thereof, as well as multilinear
algebra kernels. We call these programs "tensor codes", and
accordingly call the tractable polyhedral scheduling techniques
presented here "tensor schedulers".
The general idea is that we can significantly speed up
polyhedral scheduling by restricting the set of transformations
considered. As an extra benefit, the small search space
allows us to introduce non-linear cost models, filling a gap
in polyhedral cost models.