Existing parallel out-of-core graph processing systems focus on improving disk I/O locality, which leads to restrictions on their programming models. Although these constraints improve locality, they also limit expressiveness, so only sub-optimal algorithms are supported. These sub-optimal algorithms typically incur sequential but much larger amounts of disk I/O. In this paper, we explore a fundamentally different tradeoff: a smaller total amount of I/O rather than better locality. We show that out-of-core graph processing systems uniquely provide the opportunity to lift the restrictions of the programming model in a feasible manner. To demonstrate these ideas, we build Clip, which enables more efficient algorithms that require much less total disk I/O. Our experiments show that the algorithms that can only be implemented in Clip are much faster than the original disk-locality-optimized algorithms. We further extend our technique’s scope of application by providing a semi-external mode. Our analysis and evaluation demonstrate that semi-external mode is not only feasible in many cases but can also deliver a significant speedup for important graph applications. Moreover, we improve the performance of the originally supported applications with additional optimizations, and we evaluate our system on NVMe SSDs.