Systems and methods for fast detection of elephant flows in network traffic

In a system for efficiently detecting large ("elephant") flows in a network, the rate at which received packets are sampled is adjusted according to a top flow detection likelihood computed for a cache of flows identified in the arriving network traffic. After packets sampled from the network are observed, Dirichlet-Categorical inference is employed to calculate a posterior distribution that captures the uncertainty about the size of each flow, yielding the top flow detection likelihood. The posterior distribution is used to find the most likely subset of elephant flows. The technique rapidly converges to the optimal sampling rate, with convergence of order O(1/n), where n is the number of packet samples received, and the only hyperparameter required is the targeted detection likelihood.
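
The inference step described above can be illustrated with a minimal sketch in Python. It is not the claimed implementation: the function names, the symmetric prior pseudo-count, the fixed number of elephant flows k, the Monte Carlo estimate of the detection likelihood, and the multiplicative rate-adjustment rule are all assumptions introduced for illustration. Given per-flow counts of sampled packets, the sketch forms the Dirichlet posterior, takes the k flows with the largest posterior parameters as the candidate elephant set, and estimates how likely that set is to be the true top k.

```python
import random


def posterior_alphas(counts, prior=1.0):
    """Dirichlet posterior parameters: prior pseudo-count plus sampled packets per flow."""
    return {flow: prior + n for flow, n in counts.items()}


def top_flow_likelihood(counts, k, prior=1.0, draws=2000, seed=0):
    """Monte Carlo estimate of the probability that the candidate top-k set is correct.

    Assumption: the candidate set is the k flows with the largest posterior parameters.
    """
    rng = random.Random(seed)
    alphas = posterior_alphas(counts, prior)
    flows = list(alphas)
    candidate = set(sorted(flows, key=lambda f: alphas[f], reverse=True)[:k])
    hits = 0
    for _ in range(draws):
        # Independent Gamma(alpha, 1) draws are a Dirichlet sample up to normalization,
        # and normalization does not change the ranking of the flows.
        shares = {f: rng.gammavariate(alphas[f], 1.0) for f in flows}
        top = set(sorted(flows, key=lambda f: shares[f], reverse=True)[:k])
        if top == candidate:
            hits += 1
    return candidate, hits / draws


def adjust_sampling_rate(rate, likelihood, target, factor=2.0):
    """Hypothetical rule: sample more when below the target likelihood, less when above."""
    if likelihood < target:
        return min(1.0, rate * factor)
    return rate / factor


if __name__ == "__main__":
    sampled = {"flow_a": 500, "flow_b": 450, "flow_c": 5, "flow_d": 3, "flow_e": 2}
    elephants, likelihood = top_flow_likelihood(sampled, k=2)
    print(sorted(elephants), likelihood)
    print(adjust_sampling_rate(0.01, likelihood, target=0.95))
```

In this sketch the targeted detection likelihood is the only quantity the caller tunes, mirroring the single hyperparameter mentioned above; the prior, the number of draws, and the adjustment factor are illustrative defaults rather than values taken from the source.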