Achieving Sub-second Pairwise Query over Evolving Graphs

Abstract

Many real-time OLAP systems have been proposed to query evolving data with sub-second latency. Although this feature is highly attractive, it is very hard to be achieved on analytic graph queries that can only be answered after accessing every connected vertex. Fortunately, researchers recently observed that answering pairwise queries is enough for many real-world scenarios. These pairwise queries avoid the exhaustive nature and hence may only need to access a small portion of the graph. Obviously, the crux of achieving low latency is to what extent the system can eliminate unnecessary computations. This pruning process, according to our investigation, is usually achieved by estimating certain upper bounds of the query result in existing systems.

However, our evaluation results demonstrate that these existing upper-bound-only pruning techniques can only prune about half of the vertex activations, which is still far away from achieving the sub-second latency goal on large graphs. In contrast, we found that it is possible to substantially accelerate the processing if we are able to not only estimate the upper bounds, but also foresee a tighter lower bound for certain pairs of vertices in the graph. Our experiments show that only less than 1% of the vertices are activated via using this novel lower bound based pruning technique. Based on this observation, we build SGraph, a system that is able to answer dynamic pairwise queries over evolving graphs with sub-second latency. It can ingest millions of updates per second and simultaneously answer pairwise queries with a latency that is several orders of magnitude smaller than state-of-the-art systems.

Publication
28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems