From Prefix Cache to Fusion RAG Cache: Accelerating LLM Inference in Retrieval-Augmented Generation

Abstract

Publication
SIGMOD 2026 (ACM Special Interest Group on Management of Data)