Shivaram Venkataraman

Associate Professor, Computer Science, University of Wisconsin-Madison

Office: 7576 MH. Email: shivaram at cs.wisc.edu

I am an Associate Professor in the Computer Science Department at the University of Wisconsin-Madison. My research interests are in designing systems and algorithms for large-scale data analysis and machine learning. Before coming to Madison, I was a post-doctoral researcher in the Systems Research Group at Microsoft Research in Redmond. Previously, I completed my PhD at UC Berkeley, where I was advised by Ion Stoica and Mike Franklin. I also have a Master's from the University of Illinois at Urbana-Champaign, where I worked in the Systems Research Group with Prof. Roy Campbell.

News: I will be joining the Systems Group at ETH Zurich in Fall 2026. I am looking for PhD students and postdocs to join my group. Please get in touch if you are interested!

Group

Rutwik Jain (co-advised with Matt Sinclair)
Brandon Tran (co-advised with Matt Sinclair)
Minghao Yan
Johannes Freischuetz
Tzu-Tao Chang
Fanchao Chen
Tareq Mahmood
Seth Ockerman

Alumni

PhD

Song Bian → NVIDIA Research Labs
Konstantinos Kanellis → AWS Learned Systems Group
Jason Mohoney → Post-doc at MIT
Saurabh Agarwal → Post-doc at UT-Austin

Post-doctoral Researchers

Pengfei Zheng (co-advised with Aditya Akella) → Huawei Technologies

MS

Devesh Sarda → Databricks
Aditi Singh → Nutanix
Mohil Patel → Oracle
Rachit Tibrewal
Olesia Elfimova → Dropbox
Adarsh Kumar → Amazon Alexa AI
Arjun Balasubramanian → Amazon AWS

BS

Wei Hao → Columbia
Yiheng Xu → Maryland
Yuhan Liu → UChicago
Ziyi Zhang → UChicago
Rui Pan → Princeton
Lynn Liu → UC Berkeley
Prasoon Sinha → UT Austin
Anze Xie → UCSD
Anders Carlsson → Amazon
Keting Chen → Cornell
Anyong Mao → USC

Current Research Areas

Improving LLM Inference: Reducing the cost of running large language models through inference-efficient model architectures, memory management, and speculative decoding.
GPU Variability & Power Management: Understanding how variability across GPUs affects cluster performance, and addressing it through variability-aware scheduling and high-fidelity GPU energy modeling.
Vector Search: New indexing and query methods to make vector similarity search faster and more scalable, including adaptive indexes and vector database deployment on HPC platforms.
Integrating ML into Systems: Using machine learning to improve core system components, including memory tiering and tuning unstable and noisy cloud applications.

Teaching

CS 537 Intro to OS: F24 S23 S20 S19

CS 744 Big Data Systems: S25 S24 F22 F21 F20 F19 F18

CS 839 Advanced Machine Learning Systems: S22

Selected Recent Publications

Fanchao Chen, Ziheng Jiang, Ziyun Wei, Zheng Zhong, Du Li, Chi Zhang, Haibin Lin, Shivaram Venkataraman. Towards Full Pipeline FP8 Reinforcement Learning for LLMs - COLM 2026

Minghao Yan, Zhuang Wang, Zhen Jia, Shivaram Venkataraman, Yida Wang. PLoRA: Efficient Concurrent LoRA Training for Large Language Models - ICML 2026

Rutwik Jain, Yiwei Jiang, Matt Sinclair, Shivaram Venkataraman. Minos: Systematically Classifying Performance and Power Characteristics of GPU Workloads on HPC Clusters - ACM SIGMETRICS 2026

Brandon Tran, Matthias Maiterth, Woong Shin, Matt Sinclair, Shivaram Venkataraman. Wattchmen: Watching the Wattchers – High Fidelity, Flexible GPU Energy Modeling - ICS 2026

Konstantinos Kanellis, Sujay Yadalam, Hayden Coffey, Shivaram Venkataraman, Michael Swift. From Good to Great: Parameter Tuning in Memory Tiering Systems - IEEE Transactions on Computers, Vol. 75, No. 4, April 2026

Song Bian, Tao Yu, Shivaram Venkataraman, Youngsuk Park. Scaling Laws Meet Model Architecture: Toward Inference-Efficient LLMs - ICLR 2026

Saurabh Agarwal, Bodun Hu, Anyong Mao, Aditya Akella, Shivaram Venkataraman. SYMPHONY: Enabling Compute-Memory Disaggregation in LLM Serving Systems - NSDI 2026

Jason Mohoney, Devesh Sarda, Mengze Tang, Shihabur Rahman Chowdhury, Anil Pacaci, Ihab F. Ilyas, Theodoros Rekatsinas, Shivaram Venkataraman. Quake: Adaptive Indexing for Vector Search - OSDI 2025

Song Bian, Minghao Yan, Shivaram Venkataraman. Scaling Inference-Efficient Language Models - ICML 2025

Tzu-Tao Chang, Shivaram Venkataraman. LV-XAttn: Distributed Cross-Attention for Long Visual Inputs in Multimodal Large Language Models - ICML 2025

Minghao Yan, Saurabh Agarwal, Shivaram Venkataraman. Decoding Speculative Decoding - NAACL 2025
*SAC Award for Generation*

Tzu-Tao Chang, Shivaram Venkataraman. Eva: Cost-Efficient Cloud-Based Cluster Scheduling - EuroSys 2025

Johannes Freischuetz, Konstantinos Kanellis, Brian Kroth, Shivaram Venkataraman. TUNA: Tuning Unstable and Noisy Cloud Applications - EuroSys 2025

Seth Ockerman, Amal Gueroudji, Tanwi Mallick, Yixuan He, Line Pouchard, Rob Ross, Shivaram Venkataraman. PGT-I: Scaling Spatiotemporal GNNs with Memory-Efficient Distributed Training - Supercomputing 2025

Please see Google Scholar for a complete list.