September 24, 2019
We present a fast, large-scale algorithm for the
simulation of electromagnetic waves (Maxwell’s equations) in
three-dimensional inhomogeneous media. The algorithm has a
complexity of O(N log(N)) and runs in parallel. Numerical
simulations show the rapid treatment of problems with tens of
millions of unknowns on a small shared-memory cluster.
September 24, 2019
The Combinatorial Multigrid (CMG) technique is a
practical and adaptable solver and combinatorial preconditioner
for solving certain classes of large, sparse systems of linear equations.
CMG is similar to Algebraic Multigrid (AMG) but replaces
large groupings of fine-level variables with a single coarse-level
one, resulting in simple and fast interpolation schemes. These
schemes further provide control over the refinement strategies at
different levels of the solver hierarchy depending on the condition
number of the system being solved. While many pre-existing
solvers may be able to solve large, sparse systems with relatively
low complexity, inversion may require O(n²) space; whereas, if
we know that a linear operator has ñ = O(n) nonzero elements,
we desire to use O(n) space in order to reduce communication
as much as possible.
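The coarsening idea above can be made concrete with a small sketch. In aggregation-based multigrid, grouping fine-level variables into a single coarse-level one corresponds to a piecewise-constant interpolation matrix P with one nonzero per row, and the coarse operator is the Galerkin triple product Aᶜ = PᵀAP. The hand-picked aggregation below is a toy stand-in; CMG's actual graph-theoretic grouping heuristic is more involved.

```python
# Hedged sketch: aggregation-based coarsening in the spirit of CMG.
# The fixed pairing of neighbors below is a toy stand-in for CMG's
# combinatorial clustering heuristic, which is not reproduced here.
import numpy as np
import scipy.sparse as sp

def coarsen(A, agg):
    """Galerkin coarse operator A_c = P^T A P for the piecewise-constant
    interpolation P, where agg[i] is the aggregate (coarse variable)
    assigned to fine variable i."""
    n = A.shape[0]
    nc = int(agg.max()) + 1
    # P has a single 1 per row: the simple, fast interpolation scheme
    # that aggregation buys us.  Stored in CSR, so space stays O(n).
    P = sp.csr_matrix((np.ones(n), (np.arange(n), agg)), shape=(n, nc))
    return (P.T @ A @ P).tocsr()

# Graph Laplacian L = D - W of a path on 6 vertices (a toy SDD system).
n = 6
W = sp.diags([np.ones(n - 1), np.ones(n - 1)], [-1, 1], format="csr")
L = (sp.diags(np.asarray(W.sum(axis=1)).ravel()) - W).tocsr()

agg = np.array([0, 0, 1, 1, 2, 2])   # group neighboring pairs of fine variables
Lc = coarsen(L, agg)
print(Lc.toarray())
```

Note that the coarse operator is again a graph Laplacian (of a path on 3 vertices), which is what lets the same machinery be applied recursively down the solver hierarchy.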
Being able to invert sparse linear systems of equations, asymptotically
as fast as the values can be read from memory, has been
identified by the Defense Advanced Research Projects Agency
(DARPA) and the Department of Energy (DOE) as increasingly
necessary for scalable solvers and energy-efficient algorithms
in scientific computing. Further, as industry and government
agencies move towards exascale, fast solvers and communication
avoidance will become even more necessary. In this paper, we present
an optimized implementation of the Combinatorial Multigrid in
C using PETSc, and we analyze the solution of various systems
using the CMG approach as a preconditioner on much larger
problems than have been presented thus far. We compare the
number of iterations, setup times, and solution times against
other popular preconditioners for such systems, including
Incomplete Cholesky and a multigrid approach in PETSc, on
common problems, further exhibiting the superior performance of CMG.
Index Terms—combinatorial algorithms, spectral support
solver, linear systems, fast solvers, preconditioners, multigrid,
graph Laplacian, benchmarking, iterative solvers
July 15, 2019
Compiler optimizations based on the polyhedral
model are able to automatically parallelize and optimize loop-based
code. We acknowledge that while polyhedral techniques
can represent a broad set of program transformations, important
classes of programs could be parallelized just as well using less
general but more tractable techniques. We apply this general idea
to the polyhedral scheduling phase, which is one of the typical
performance bottlenecks of polyhedral compilation.
We focus on a class of programs in which enough parallelism is
already exposed in the source program, and which includes Deep
Learning layers and combinations thereof, as well as multilinear
algebra kernels. We call these programs “tensor codes”, and
consequently call “tensor schedulers” the tractable polyhedral
scheduling techniques presented here.
The general idea is that we can significantly speed up
polyhedral scheduling by restricting the set of transformations
considered. As an extra benefit, having a small search space
allows us to introduce non-linear cost models, which fills a gap
in polyhedral cost models.
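The search-space restriction can be illustrated with a deliberately small sketch: for a loop nest whose iterations are already independent (as in the tensor codes above), scheduling can degenerate into ranking a handful of loop orders with a cost model, and because the candidate set is tiny the model is free to be non-linear. The locality model below is a hypothetical toy, not the tensor scheduler described here.

```python
# Hedged sketch: once transformations are restricted to permutations of an
# already-parallel loop nest, "scheduling" reduces to ranking |3!| = 6
# candidates with a cost model -- here a toy non-linear locality model that
# prefers unit-stride innermost accesses.  This illustrates the small-search-
# space idea only; it is not the paper's tensor scheduler.
from itertools import permutations

# Access strides for C[i][j] += A[i][k] * B[k][j] on row-major N x N arrays:
# the stride of (C, A, B) with respect to each loop index.
N = 512
strides = {
    "i": [N, N, 0],
    "j": [1, 0, 1],
    "k": [0, 1, N],
}

def cost(order):
    # Non-linear toy cost: quadratic penalty on non-unit strides in the
    # innermost loop, something a purely linear (LP-based) polyhedral
    # objective could not express directly.
    inner = order[-1]
    return sum(s * s for s in strides[inner])

best = min(permutations("ijk"), key=cost)
print("".join(best))   # an order with j (unit stride everywhere) innermost
```

Enumerating six candidates and scoring each is trivially cheap next to a general polyhedral scheduling ILP, which is the point of the restriction.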
May 23, 2019
Positive linear programming (PLP), also known as packing and covering linear programs, is an important class of problems frequently found in fields such as network science, operations research, or economics. In this work we demonstrate that all PLP problems can be represented using a network structure, revealing new key insights that lead to new polynomial-time algorithms.
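For concreteness, a packing LP maximizes cᵀx subject to Ax ≤ b, x ≥ 0 with all of A, b, c nonnegative (covering LPs are their duals). A minimal sketch of the problem class, solved here with off-the-shelf scipy rather than the network-based algorithms of this work:

```python
# Hedged sketch of a packing LP: maximize c.x subject to Ax <= b, x >= 0,
# with A, b, c all nonnegative.  scipy's linprog minimizes, so we negate c.
# This only illustrates the PLP problem class, not the polynomial-time
# network-structure algorithms developed in the work itself.
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 2.0])             # nonnegative objective (items' values)
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])           # nonnegative constraint matrix
b = np.array([3.0, 2.0])             # nonnegative capacities

res = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2, method="highs")
print(res.x, -res.fun)               # optimal packing and its value
```

Here the optimum packs x = (1, 2) for objective value 5: the second resource caps x₂ at 2, and the remaining slack of the first resource goes to x₁.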
April 8, 2019
Full packet capture (FPC) consists of capturing all packets and storing them in permanent storage to enable offline forensic analysis. FPC, however, suffers from a scalability issue: at today's typical traffic rates of 10 Gbps or above, it either becomes intractable or requires highly expensive processing and storage hardware, which rapidly decreases the economic viability of the technology.
The first good news is that, for many practical cases, full packet capture is not necessary. This rationale stems from the well-known heavy-tailed nature of network traffic: from an analysis standpoint, most of the interesting features found in network traffic (a network attack being a prime example, though not the only one) are found in a very small fraction of it. Further, in some cases full packet capture is not only unnecessary but can represent a liability, as sensitive information is kept in non-ephemeral storage. The second good news is that all the heavy lifting Zeek already does in processing network traffic can be leveraged to overcome both the intractability and the liability problems. Indeed, Zeek can be brought into the loop to perform selective packet capture (SPC), a process by which the Zeek workers themselves decide, in a selective and fine-grained manner, which traffic must be stored to disk.
In this talk Reservoir Labs will present a workflow for performing selective packet capture with the Zeek sensor at very high traffic rates. The workflow allows Zeek scripts to trigger packet captures directly, based on real-time analysis of the traffic itself. We will describe the key data structures needed to perform this task efficiently and introduce several Zeek scripts and use cases illustrating how SPC can capture just the packets necessary for meaningful forensic analysis while minimizing exposure to the liability risk.
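The SPC decision loop described above can be sketched in a few lines of plain Python: the analysis side flags an interesting flow, and the capture side writes only packets belonging to flagged flows. In the actual workflow this logic lives in Zeek scripts inside the Zeek workers; every name here (Packet, flag_flow, should_capture) is a hypothetical illustration, not Zeek's API.

```python
# Hedged sketch of selective packet capture (SPC) in plain Python.
# In the real workflow, Zeek scripts inside the workers make this decision
# in real time; the names and structures below are illustrative only.
from collections import namedtuple

Packet = namedtuple("Packet", ["src", "dst", "payload"])

flagged_flows = set()        # flows the analysis has selected for capture

def flag_flow(src, dst):
    """Analysis verdict: this flow is interesting (e.g., a suspected attack)."""
    flagged_flows.add((src, dst))

def should_capture(pkt):
    """Capture predicate: store only packets belonging to flagged flows."""
    return (pkt.src, pkt.dst) in flagged_flows

traffic = [
    Packet("10.0.0.1", "10.0.0.2", b"normal"),
    Packet("10.0.0.9", "10.0.0.2", b"suspicious"),
    Packet("10.0.0.1", "10.0.0.2", b"normal"),
]
flag_flow("10.0.0.9", "10.0.0.2")    # analysis flags one flow as interesting
stored = [p for p in traffic if should_capture(p)]
print(len(stored))                   # only the flagged flow reaches storage
```

Because only flagged flows reach non-ephemeral storage, the sketch also reflects the liability argument: sensitive but uninteresting traffic is simply never written to disk.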
Contact us to receive a copy of this presentation or for a demonstration.