MADSys

Yuening Zhu

Ph.D.

MADSys

Publications
  • From Prefix Cache to Fusion RAG Cache: Accelerating LLM Inference in Retrieval-Augmented

    Jiahao Wang, Weiyu Xie, Mingxing Zhang, Boxin Zhang, Jianwei Dong, Yuening Zhu, Chen Lin, Jingqi Tang, Yaochen Han, Zhiyuan Ai, Xianglin Chen, Yongwei Wu, Congfeng Jiang

    The 2026 ACM Special Interest Group on Management of Data (SIGMOD 26)

  • KTransformers: Unleashing the Full Potential of CPU/GPU Hybrid Inference for MoE Models

    Hongtao Chen, Weiyu Xie, Boxin Zhang, Jingqi Tang, Jiahao Wang, Jianwei Dong, Shaoyuan Chen, Ziwei Yuan, Chen Lin, Chengyu Qiu, Yuening Zhu, Qingliang Ou, Jiaqi Liao, Xianglin Chen, Zhiyuan Ai, Yongwei Wu, Mingxing Zhang

    The 31st Symposium on Operating Systems Principles (SOSP 2025)
