VertexSurge: Variable Length Graph Pattern Match on Billion-edge Graphs

Abstract

Variable-Length Graph Pattern Matching (VLGPM) is a critical functionality in graph databases, pivotal for identifying patterns where the number of connecting edges between two matched vertices is variable. This function plays a vital role in analyzing complex and dynamic networks such as social networks or bank transfers networks, where relationships can vary extensively in both length and structure. However, despite its importance, current graph databases, optimized primarily for single-hop subgraph matching, struggle with VLGPM over large graphs.

To bridge this gap between essential user requirements and the lack of efficient support in existing systems, we introduce VertexSurge. Central to VertexSurge is an innovative variable-length expand (VExpand) operator, which incorporates several microarchitecture-friendly optimizations to efficiently compute the reachability matrix between two sets of vertices. These optimizations enable VertexSurge to handle the surge of vertex count due to variable length with high performance. Additionally, VertexSurge combines VExpand with effective multi-set intersection for pattern matching, ruled-based planning, and disk offloading for large datasets, to implement a full-fledged VLGPM engine. Our evaluations with real-world graph datasets and representative patterns demonstrate that VertexSurge significantly outperforms existing systems in VLGPM, validating its efficacy in handling large-scale graph pattern matching challenges.

Publication
The 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems