SeqDLM: a Sequencer-based Distributed Lock Manager for Efficient Shared File Access in a Parallel File System

Abstract

Distributed locks are used to guarantee the distributed client-cache coherence in parallel file systems. However, they lead to poor performance in the case of parallel writes under high-contention workloads. We analyze the distributed lock manager and find out that lock conflict resolution is the root cause of the poor performance, which involves frequent lock revocations and slow data flushing from client caches to data servers. We design a distributed lock manager named SeqDLM by exploiting the sequencer mechanism. SeqDLM mitigates the lock conflict resolution overhead using early grant and early revocation while keeping the same semantics as traditional distributed locks. To evaluate SeqDLM, we have implemented a parallel file system called ccPFS using both SeqDLM and traditional distributed locks. Evaluations on 96 nodes show SeqDLM outperforms the traditional distributed locks by up to 10.3× for high-contention parallel writes on a shared file with multiple stripes.

Publication
The International Conference for High Performance Computing, Networking, Storage, and Analysis