KnightKing: A Fast Distributed Graph Random Walk Engine

Abstract

Random walk on graphs has recently gained immense popularity as a tool for graph data analytics and machine learning. Currently, random walk algorithms are developed as individual implementations and suffer significant performance and scalability problems, especially with the dynamic nature of sophisticated walk strategies.

We present KnightKing, the first general-purpose, distributed graph random walk engine. To address the unique interaction between a static graph and many dynamic walkers, it adopts an intuitive walker-centric computation model. The corresponding programming model allows users to easily specify existing or new random walk algorithms, facilitated by a new unified edge transition probability definition that applies across popular known algorithms. With KnightKing, these diverse algorithms benefit from its common distributed random walk execution engine, centered around an innovative rejection-based sampling mechanism that dramatically reduces the cost of higher-order random walk algorithms. Our evaluation confirms that KnightKing brings up to 4 orders of magnitude improvement in executing algorithms that currently can only be afforded with approximation solutions on large graphs.

Publication
27th ACM Symposium on Operating Systems Principles