📝 Publications

LLM Serving

EuroSys 2026

KunServe: Efficient Parameter-centric Memory Management for LLM Serving
Rongxin Cheng, Yuxin Lai, Xingda Wei, Rong Chen, Haibo Chen

  • KunServe proposes the first parameter-centric approach to handling memory throttling: it selectively drops replicated parameters to instantly free memory for requests.