Publications/Data Analytics

Systems and methods for selective expansive recursive tensor analysis

In a system for performing tensor decomposition in a selective expansive and/or recursive manner, a tensor is decomposed into a specified number of components, and one or more tensor components are selected for further decomposition. For each selected component, the significant elements thereof are identified, and using the indices of the

Large-scale Sparse Tensor Decomposition Using a Damped Gauss–Newton Method

CANDECOMP/PARAFAC (CP) tensor decomposition is a popular unsupervised machine learning method with numerous applications. This process involves modeling a high-dimensional, multi-modal array (a tensor) as the sum of several low-dimensional components. In order to decompose a tensor, one must solve an optimization problem, whose objective is often given by the
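To make the "sum of several low-dimensional components" idea concrete, here is a minimal pure-Python sketch of a CP model: each component is a weighted outer product of one vector per mode, and the model tensor is the sum of those components. The function names (`cp_reconstruct`, `squared_error`), the dictionary-based tensor layout, and the toy data are all illustrative assumptions and are not taken from the paper. The squared reconstruction error shown is one common choice of objective; the teaser above is cut off before it names the objective the paper uses.

```python
from itertools import product


def cp_reconstruct(weights, factors):
    """Rebuild a dense tensor from CP factors (hypothetical helper).

    weights: list of R component weights.
    factors: one matrix per mode; factors[m][i][r] is the entry for
             index i of mode m in component r.
    Returns a dict mapping each index tuple to the model value.
    """
    shape = [len(mode_matrix) for mode_matrix in factors]
    tensor = {}
    for index in product(*(range(n) for n in shape)):
        value = 0.0
        for r, weight in enumerate(weights):
            term = weight
            for mode, i in enumerate(index):
                term *= factors[mode][i][r]  # one factor per mode
            value += term  # sum over the R components
        tensor[index] = value
    return tensor


def squared_error(tensor, weights, factors):
    """Sum of squared differences between the data and the CP model."""
    model = cp_reconstruct(weights, factors)
    return sum((tensor.get(idx, 0.0) - v) ** 2 for idx, v in model.items())


# Toy rank-2 model of a 2 x 2 x 2 tensor (assumed data, for illustration).
identity = [[1.0, 0.0], [0.0, 1.0]]
example_weights = [2.0, 3.0]
example_factors = [identity, identity, identity]
example_model = cp_reconstruct(example_weights, example_factors)

# A sparse "observed" tensor that the model matches exactly.
observed = {(0, 0, 0): 2.0, (1, 1, 1): 3.0}
example_error = squared_error(observed, example_weights, example_factors)
```

In this toy case the first component contributes only to entry `(0, 0, 0)` and the second only to `(1, 1, 1)`, so the model reproduces `observed` exactly and the error is zero. A real decomposition would search for the weights and factors that minimize such an objective rather than being handed them.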

Combining Tensor Decompositions and Graph Analytics to Provide Cyber Situational Awareness at HPC Scale

This paper describes MADHAT (Multidimensional Anomaly Detection fusing HPC, Analytics, and Tensors), an integrated workflow that demonstrates the applicability of HPC resources to the problem of maintaining cyber situational awareness. MADHAT combines two high-performance packages: ENSIGN for large-scale sparse tensor decompositions and HAGGLE for graph analytics. Tensor decompositions isolate coherent

Fast and Scalable Distributed Tensor Decompositions

Tensor decomposition is a prominent technique for analyzing multi-attribute data and is increasingly used for data analysis in different application areas. Tensor decomposition methods are computationally intensive and often involve irregular memory accesses over large-scale sparse data. Hence, it becomes critical to optimize the execution of such data-intensive

Enhancing Network Visibility and Security through Tensor Analysis

The increasing size, variety, rate of growth and change, and complexity of network data have warranted advanced network analysis and services. Tools that provide automated analysis through traditional or advanced signature-based systems or machine learning classifiers suffer from practical difficulties. These tools fail to provide comprehensive and contextual insights into

Computationally Efficient CP Tensor Decomposition Update Framework for Emerging Component Discovery in Streaming Data

We present streaming CP update, an algorithmic framework for updating CP tensor decompositions that possesses the capability of identifying emerging components and can produce decompositions of large, sparse tensors streaming along multiple modes at a low computational cost. We discuss a large-scale implementation of the proposed scheme integrated within the

All-at-once Decomposition of Coupled Billion-scale Tensors in Apache Spark

As the scale of unlabeled data rises, it becomes increasingly valuable to perform scalable, unsupervised data analysis. Tensor decompositions, which have been empirically successful at finding meaningful cross-dimensional patterns in multidimensional data, are a natural candidate to test for scalability and meaningful pattern discovery in these massive real-world datasets. Furthermore,

High Speed Elephant Flow Detection Under Partial Information

In this paper we introduce a new framework to detect elephant flows at very high speed rates and under uncertainty. The framework provides exact mathematical formulas to compute the detection likelihood and introduces a new flow reconstruction lemma under partial information. These theoretical results lead to the design of BubbleCache,
