👋 Call me Joshua.

The Campus South Athletic Field, Chicago, IL
Quod Restaurant & Bar, 92-94 High St, Oxford, Oxfordshire, OX1 4BJ

🎓 I’m an incoming CS Ph.D. student at CSAIL, MIT EECS. I graduated from the University of Chicago with a Bachelor of Science in Mathematics (with Specialization in Economics) and Computer Science.

Research

👨‍💻 My research interests lie broadly in computer systems and machine learning. I build more efficient and reliable systems, both tailored to and powered by state-of-the-art machine learning algorithms, to improve performance, resource efficiency, affordability, and/or scalability. My recent focus is on ML/LLM inference and AI infrastructure.

✏️ During my undergrad years, I began research in math. Advised by Prof. Gregory Lawler and Jinwoo Sung, I worked on probability theory, particularly the loop-erased random walk (LERW), loop soups, the Gaussian free field (GFF), and Schramm–Loewner evolution (SLE). Later, I was fortunate to work with Prof. Junchen Jiang and Prof. Kexin Pei at UChicago, Prof. Ravi Netravali at Princeton, and Dr. Ganesh Ananthanarayanan at Microsoft Research on MLSys, with a focus on systems for efficient ML/LLM inference.

Recent Projects

I’m currently working on the following open-source projects as a member of LMCache Lab:

🚀LMCache: The first open-source Knowledge Delivery Network (KDN), making LLM applications up to 8x faster at 8x lower cost

🚀vLLM Production Stack: Scale from a single vLLM instance to a distributed vLLM deployment without changing any application code

Past Projects

🚀Resource Allocation for Multi-Tenant Retrieval-Augmented Generation (RAG) Systems
Check it out here!

🚀KV Cache Compression and Streaming for Multimodal Large Language Models (MLLMs)

🚀Knowledge Streaming from LLMs to Environments
Check it out here!

Selected Publications

*: Equal Contribution.

  • Yuhan Liu, Yuyang Huang, Jiayi Yao, Zhuohan Gu, Kuntai Du, Hanchen Li, Yihua Cheng, Junchen Jiang, Shan Lu, Madan Musuvathi, Esha Choukse.
    DroidSpeak: KV Cache Sharing for Efficient Multi-LLM Serving
    Under review at a major systems conference [Paper]

  • Zhuohan Gu*, Jiayi Yao*, Kuntai Du, Junchen Jiang.
    LLMSteer: Improving Long-Context LLM Inference by Steering Attention on Reused Contexts
    NeurIPS 2024 workshop on Machine Learning for Systems [Paper / Poster]

  • Siddhant Ray, Rui Pan, Zhuohan Gu, Kuntai Du, Ganesh Ananthanarayanan, Ravi Netravali, Junchen Jiang.
    RAGServe: Fast Quality-Aware RAG Systems with Configuration Adaptation
    Under review at a major systems conference [Paper]

  • Siddhant Ray, Zhuohan Gu, Xi Jiang, Junchen Jiang, Nick Feamster.
    Transformer-based Predictions for Sudden Network Changes
    NSDI 2024 Poster Session [Poster]

All Publications

  • Yuhan Liu, Yuyang Huang, Jiayi Yao, Zhuohan Gu, Kuntai Du, Hanchen Li, Yihua Cheng, Junchen Jiang, Shan Lu, Madan Musuvathi, Esha Choukse.
    DroidSpeak: KV Cache Sharing for Efficient Multi-LLM Serving
    Under review at a major systems conference [Paper]
  • Zhuohan Gu*, Jiayi Yao*, Kuntai Du, Junchen Jiang.
    LLMSteer: Improving Long-Context LLM Inference by Steering Attention on Reused Contexts
    NeurIPS 2024 workshop on Machine Learning for Systems [Paper / Poster]
  • Siddhant Ray, Rui Pan, Zhuohan Gu, Kuntai Du, Ganesh Ananthanarayanan, Ravi Netravali, Junchen Jiang.
    RAGServe: Fast Quality-Aware RAG Systems with Configuration Adaptation
    Under review at a major systems conference [Paper]
  • Siddhant Ray, Zhuohan Gu, Xi Jiang, Junchen Jiang, Nick Feamster.
    Transformer-based Predictions for Sudden Network Changes
    NSDI 2024 Poster Session [Poster]
  • Zhuohan Gu, Dadu Chen.
    An Introduction to Loewner Energy
    UChicago Math REU 2024 [Paper]
  • Zhuohan Gu.
    A Study in Markov Chains, Loop-Erased Random Walk, and Loop Soups
    UChicago Math REU 2023 [Paper]

Selected Awards

  • Quad Undergraduate Research Conference Grant (supports faculty-mentored undergraduate participation in presenting papers at academic conferences), Chicago, IL, 10/2024
  • Jeff Metcalf Fellowship Grant (supports students’ career goals and offsets living expenses during internships), Chicago, IL, 05/2024
  • Honor Roll (maintained an overall academic average of 93% or above), Washington, D.C., US, 2018-20
  • The Bijali Dutta Ghosh Book Award (awarded for commitment to and ability in the Natural Sciences), Washington, D.C., US, 2020
  • Goldberg Science Award (for achievement in the sciences outside of school), Washington, D.C., US, 2020
  • First Place Award, American Mathematics Competition (AMC) 12, Washington, D.C., US, 2019

Education

  • B.S., University of Chicago, Mathematics and Computer Science, 2022-2025

More About Me

I grew up in Guangzhou (Canton) and Hong Kong before moving to Washington, D.C. for high school.
I speak English, Cantonese, and Mandarin fluently, and a little bit of Hakka and Spanish.
I love piano and classical music. A lot. For piano, I mainly play Beethoven and Chopin, and sometimes Saint-Saëns and Mozart.
I love sports. Also a lot. I played varsity soccer and basketball in high school, and I follow all kinds of sports, from soccer and basketball to tennis, golf, etc. I am a fan of Borussia Dortmund💛🖤.
I also love movies, astronomy, food, etc.

To be continued…