Thinking More about RDMA Memory Semantics

Abstract

RDMA (Remote Direct Memory Access) provides memory semantics for accessing remote memory directly, bypassing remote CPUs. It offers low latency and high throughput, which benefit many data center applications. Although much effort has been made in the literature, this paper looks for further opportunities to boost the performance of memory-semantic operations in RDMA networks. As with optimizations for local memory operations, we find that performance in RDMA networks improves once we take into account the vector IO mechanism, the performance asymmetry between sequential and random access, IO consolidation, NUMA effects, and the atomic operations (such as compare-and-swap) provided by the underlying hardware. We conduct a comprehensive empirical study of how these factors influence memory-semantic operations in RDMA networks and provide guidelines for improving applications. Experimental results show that four typical applications, namely a disaggregated hashtable, distributed shuffle, distributed join, and a distributed log, improve by 2.7×, 5.8×, 5.3×, and 9.1×, respectively, after applying memory-semantics-related optimizations.

Publication
23rd IEEE International Conference on Cluster Computing