Trading off Latency for Memory Bandwidth in Spatial Architectures using the Polyhedral Model
Publication Source: 2020 International Conference on High Performance Computing & Simulation (HPCS 2020)
We present a fast algorithm for transforming loop nests represented in the polyhedral model, which have been placed onto a spatial grid of processors. The goal of the transformation is to avoid broadcasts, which can put extra pressure on remote memory bandwidth, and turn them mainly into neighbor-to-neighbor reuse. The transformation preserves existing parallelism and locality properties of the original program as much as possible, and in particular does not reduce its amount of parallelism.