👋 Call me Joshua.


🎓 I’m an incoming CS Ph.D. student at CSAIL, MIT EECS. I graduated from the University of Chicago with a Bachelor of Science in Mathematics (with Specialization in Economics) and Computer Science.
Research
👨‍💻 My research interests lie broadly in computer systems and machine learning. I build more efficient and reliable systems, both tailored to and powered by state-of-the-art machine learning algorithms, to improve performance, resource efficiency, affordability, and/or scalability. My recent focus is on ML/LLM inference and AI infrastructure.
✏️ During my undergrad years, I began research in math. Advised by Prof. Gregory Lawler and Jinwoo Sung, I worked on probability theory, particularly the loop-erased random walk (LERW), loop soups, the Gaussian free field (GFF), and Schramm–Loewner evolution (SLE). Later, I was fortunate to work with Prof. Junchen Jiang and Prof. Kexin Pei at UChicago, Prof. Ravi Netravali at Princeton, and Dr. Ganesh Ananthanarayanan at Microsoft Research on MLSys, with a focus on systems for efficient ML/LLM inference.
Recent Projects
I’m currently working on the following open-source projects as a member of LMCache Lab:
🚀LMCache: The first open-source Knowledge Delivery Network (KDN) that speeds up LLM applications by up to 8x, at 8x lower cost
- LMCache Project is open-source! Check it out!
- Website: LMCache Website
- Technical reports: CacheGen (SIGCOMM’24) and CacheBlend (EuroSys’25 Best Paper).
🚀vLLM Production Stack: Scale from a single vLLM instance to a distributed vLLM deployment without changing any application code
- vLLM Production Stack Project is open-source! Check it out!
Past Projects
🚀Resource Allocation for Multi-Tenant Retrieval-Augmented Generation (RAG) Systems
Check it out here!
🚀KV Cache Compression and Streaming for Multimodal Large Language Models (MLLMs)
🚀Knowledge Streaming from LLMs to Environments
Check it out here!
Selected Publications
*: Equal Contribution.
Yuhan Liu, Yuyang Huang, Jiayi Yao, Zhuohan Gu, Kuntai Du, Hanchen Li, Yihua Cheng, Junchen Jiang, Shan Lu, Madan Musuvathi, Esha Choukse.
DroidSpeak: KV Cache Sharing for Efficient Multi-LLM Serving
Under review at a major systems conference [Paper]

Zhuohan Gu*, Jiayi Yao*, Kuntai Du, Junchen Jiang.
LLMSteer: Improving Long-Context LLM Inference by Steering Attention on Reused Contexts
NeurIPS 2024 Workshop on Machine Learning for Systems [Paper / Poster]

Siddhant Ray, Rui Pan, Zhuohan Gu, Kuntai Du, Ganesh Ananthanarayanan, Ravi Netravali, Junchen Jiang.
RAGServe: Fast Quality-Aware RAG Systems with Configuration Adaptation
Under review at a major systems conference [Paper]

Siddhant Ray, Zhuohan Gu, Xi Jiang, Junchen Jiang, Nick Feamster.
Transformer-based Predictions for Sudden Network Changes
NSDI 2024 Poster Session [Poster]
All Publications
- Yuhan Liu, Yuyang Huang, Jiayi Yao, Zhuohan Gu, Kuntai Du, Hanchen Li, Yihua Cheng, Junchen Jiang, Shan Lu, Madan Musuvathi, Esha Choukse.
DroidSpeak: KV Cache Sharing for Efficient Multi-LLM Serving
Under review at a major systems conference [Paper]
- Zhuohan Gu*, Jiayi Yao*, Kuntai Du, Junchen Jiang.
LLMSteer: Improving Long-Context LLM Inference by Steering Attention on Reused Contexts
NeurIPS 2024 Workshop on Machine Learning for Systems [Paper / Poster]
- Siddhant Ray, Rui Pan, Zhuohan Gu, Kuntai Du, Ganesh Ananthanarayanan, Ravi Netravali, Junchen Jiang.
RAGServe: Fast Quality-Aware RAG Systems with Configuration Adaptation
Under review at a major systems conference [Paper]
- Siddhant Ray, Zhuohan Gu, Xi Jiang, Junchen Jiang, Nick Feamster.
Transformer-based Predictions for Sudden Network Changes
NSDI 2024 Poster Session [Poster]
- Zhuohan Gu, Dadu Chen.
An Introduction to Loewner Energy
UChicago Math REU 2024 [Paper]
- Zhuohan Gu.
A Study in Markov Chains, Loop-Erased Random Walk, and Loop Soups
UChicago Math REU 2023 [Paper]
Selected Awards
- Quad Undergraduate Research Conference Grant (supports faculty-mentored undergraduate participation in presenting papers at academic conferences), Chicago, IL, 10/2024
- Jeff Metcalf Fellowship Grant (supports students’ career goals and offsets living expenses during internships), Chicago, IL, 05/2024
- Honor Roll (maintained an overall academic average of 93% or above), Washington, D.C., US, 2018-20
- The Bijali Dutta Ghosh Book Award (awarded for commitment to and ability in the Natural Sciences), Washington, D.C., US, 2020
- Goldberg Science Award (for achievement in the sciences outside of school), Washington, D.C., US, 2020
- First Place Award, American Mathematics Competition (AMC) 12, Washington, D.C., US, 2019
Education
- B.S., University of Chicago, Mathematics and Computer Science, 2022-2025
More About Me
I grew up in Guangzhou (Canton) and Hong Kong before moving to Washington, D.C. for high school.
I speak English, Cantonese, and Mandarin fluently, and a little bit of Hakka and Spanish.
I love piano and classical music. A lot. For piano, I mainly play Beethoven and Chopin, and sometimes Saint-Saëns and Mozart.
I love sports. Also a lot. I played varsity soccer and basketball in high school, and I follow all kinds of sports, from soccer and basketball to tennis, golf, etc. I am a fan of Borussia Dortmund💛🖤.
I also love movies, astronomy, food, etc.
To be continued…