# Publications

### Computing Bottleneck Structures at Scale for High-Precision Network Performance Analysis

The Theory of Bottleneck Structures is a recently developed framework for studying the performance of data networks. It describes how local perturbations in one part of the network propagate and interact with others. This framework is a powerful analytical tool that allows network operators to make accurate predictions about network

### Systems and methods for selective expansive recursive tensor analysis

A system for performing tensor decomposition in a selective expansive and/or recursive manner, a tensor is decomposed into a specified number of components, and one or more tensor components are selected for further decomposition. For each selected component, the significant elements thereof are identified, and using the indices of the

### Systems and methods for scalable hierarchical polyhedral compilation

A system for compiling programs for execution thereof using a hierarchical processing system having two or more levels of memory hierarchy can perform memory-level-specific optimizations, without exceeding a specified maximum compilation time. To this end, the compiler system employs a polyhedral model and limits the dimensions of a polyhedral program

### Multiscale Data Analysis Using Binning, Tensor Decompositions, and Backtracking

Large data sets can contain patterns at multiple scales (spatial, temporal, etc.). In practice, it is useful for data exploration techniques to detect patterns at each relevant scale. In this paper, we develop an approach to detect activities at multiple scales using tensor decomposition, an unsupervised high-dimensional data analysis technique

### Automatic Mapping and Optimization to Kokkos with Polyhedral Compilation

In the post-Moore’s Law era, the quest for exascale computing has resulted in diverse hardware architecture trends, including novel custom and/or specialized processors to accelerate the systems, asynchronous or self-timed computing cores, and near-memory computing architectures. To contend with such heterogeneous and complex hardware targets, there have been advanced software

### Large–scale Sparse Tensor Decomposition Using a Damped Gauss–Newton Method

CANDECOMP/PARAFAC (CP) tensor decomposition is a popular unsupervised machine learning method with numerous applications. This process involves modeling a high–dimensional, multi–modal array (a tensor) as the sum of several low–dimensional components. In order to decompose a tensor, one must solve an optimization problem, whose objective is often given by the

### Approximate Inverse Chain Preconditioner: Iteration Count Case Study for Spectral Support Solvers

As the growing availability of computational power slows, there has been an increasing reliance on algorithmic advances. However, faster algorithms alone will not necessarily bridge the gap in allowing computational scientists to study problems at the edge of scientific discovery in the next several decades. Often, it is necessary to

### HACCLE: An Ecosystem for Building Secure Multi-Party Computations

Cryptographic techniques have the potential to enable distrusting parties to collaborate in fundamentally new ways, but their practical implementation poses numerous challenges. An important class of such cryptographic techniques is known as secure multi-party computation (MPC). In an effort to provide an ecosystem for building secure MPC applications using higher

### Systems and methods for stencil amplification

In a sequence of major computational steps or in an iterative computation, a stencil amplifier can increase the number of data elements accessed from one or more data structures in a single major step or iteration, thereby decreasing the total number of computations and/or communication operations in the overall sequence