📝 Publications
LLM Serving
EuroSys 2026

KunServe: Efficient Parameter-centric Memory Management for LLM Serving
Rongxin Cheng, Yuxin Lai, Xingda Wei, Rong Chen, Haibo Chen
- KunServe proposes the first parameter-centric approach to handling GPU memory throttling in LLM serving: it selectively drops replicated parameters to instantly free memory for requests.