High Performance Compilers



Systems and Methods for Footprint Based Scheduling



Publication Source: Patent US10095494B2

A system can generate and impose constraints on a compiler/scheduler so as to specifically minimize the footprints of one or more program variables. The constraints can be based on scopes of the variables and/or on dependence distances between statements specifying operations that use the one or more program variables.
Google Scholar    Article

Systems and Methods for Efficient Determination of Task Dependences After Loop Tiling



Publication Source: Patent US9613163B2

A compilation system can compile a program to be executed using an event driven tasks (EDT) system that requires knowledge of dependencies between program statement instances, and generate the required dependencies efficiently when a tiling transformation is applied. To this end, the system may use pre-tiling dependencies and can derive post-tiling dependencies via an analysis of the tiling to be applied.
Google Scholar    Article

Methods and Apparatus for Data Transfer Optimization



Publication Source: Patent US9858053B2

Methods, apparatus and computer software product for optimization of data transfer between two memories includes determining access to master data stored in one memory and/or to local data stored in another memory such that either or both of the size of total data transferred and the number of data transfers required to transfer the total data can be minimized. The master and/or local accesses are based on, at least in part, respective structures of the master and local data.
Google Scholar    Article

Methods and Apparatus for Automatic Communication Optimizations in a Compiler Based on a Polyhedral Representation



Publication Source: Patent US9830133B1

Methods, apparatus and computer software product for source code optimization are provided. In an exemplary embodiment, a first custom computing apparatus is used to optimize the execution of source code on a second computing apparatus. In this embodiment, the first custom computing apparatus contains a memory, a storage medium and at least one processor with at least one multi-stage execution unit. The second computing apparatus contains at least one local memory unit that allows for data reuse opportunities. The first custom computing apparatus optimizes the code for reduced communication execution on the second computing apparatus. This Abstract is provided for the sole purpose of complying with the Abstract requirement rules. This Abstract is submitted with the explicit understanding that it will not be used to interpret or to limit the scope or the meaning of the claims.
Google Scholar    Article

Polyhedral Optimization of TensorFlow Computation Graphs



Publication Source: The 6th Workshop on Extreme-scale Programming Tools (ESPT-2017) at The International Conference for High Performance Computing, Networking, Storage and Analysis (SC17)

We present R-Stream·TF, a polyhedral optimization tool for neural network computations. R-Stream·TF transforms computations performed in a neural network graph into C programs suited to the polyhedral representation and uses R-Stream, a polyhedral compiler, to parallelize and optimize the computations performed in the graph. R-Stream·TF can exploit the optimizations available with R-Stream to generate a highly optimized version of the computation graph, specifically mapped to the targeted architecture. During our experiments, R-Stream·TF was able to automatically reach performance levels close to the hand-optimized implementations, demonstrating its utility in porting neural network computations to parallel architectures.

Google Scholar    Article

1 2 3 9