Sujan Kumar Gonugondla

Machine Learning Researcher · Meta FAIR-MSL

A little about me

I am currently a Machine Learning Researcher at FAIR within Meta's Superintelligence Lab (MSL), where I work on pretraining research. Prior to this, I worked on the Llama 4 Reasoning model. Before joining Meta, I led efficient inference efforts at Amazon for Amazon Q Developer.

I hold a Ph.D. from the University of Illinois at Urbana-Champaign, where my research focused on enabling efficient machine learning at the edge.

Selected Publications

2025
AdvancedIF: Rubric-Based Benchmarking and Reinforcement Learning for Advancing LLM Instruction Following

Yun He, Wenzhe Li, Hejia Zhang, Songlin Li, Karishma Mandyam, Sopan Khosla, et al.

ACL 2026 arXiv Meta AI
2024
The N-Grammys: Accelerating Autoregressive Inference with Learning-Free Batched Speculation

Lawrence Stewart, Matthew Trager, Sujan Kumar Gonugondla, Stefano Soatto

ENLSP @ NeurIPS 2024 arXiv
2024
Approximately Aligned Decoding

Daniel Melcer, Sujan Kumar Gonugondla, Pramuditha Perera, Haifeng Qian, Wen-Hao Chiang, Yanjun Wang, Nihal Jain, Pranav Garg, Xiaofei Ma, Anoop Deoras

ICML 2025 arXiv
2024
BASS: Batched Attention-optimized Speculative Sampling

Haifeng Qian*, Sujan Kumar Gonugondla*, Sungsoo Ha, Mingyue Shang, Sanjay Krishna Gouda, Ramesh Nallapati, Sudipta Sengupta, Xiaofei Ma, Anoop Deoras

* Equal contribution

ACL 2024 arXiv
2024
Token Alignment via Character Matching for Subword Completion

Ben Athiwaratkun, Shiqi Wang, Mingyue Shang, Yuchen Tian, Zijian Wang, Sujan Kumar Gonugondla, Sanjay Krishna Gouda, Robert Kwiatkowski, Ramesh Nallapati, Bing Xiang

ACL 2024 ACL
2024
Bifurcated Attention for Single-Context Large-Batch Sampling

Ben Athiwaratkun*, Sujan Kumar Gonugondla*, Sanjay Krishna Gouda, Haifeng Qian, Hantian Ding, Qing Sun, Jun Wang, Liangfu Chen, Jiacheng Guo, Parminder Bhatia, et al.

* Equal contribution

ICML 2024 arXiv
2023
Greener yet Powerful: Taming Large Code Generation Models with Quantization

Xiaokai Wei*, Sujan Kumar Gonugondla*, Wasi Ahmad, Shiqi Wang, Baishakhi Ray, Haifeng Qian, Xiaopeng Li, Varun Kumar, Zijian Wang, Yuchen Tian, et al.

* Equal contribution

EMNLP 2023 arXiv
2022
Multi-lingual Evaluation of Code Generation Models

Ben Athiwaratkun, Sanjay Krishna Gouda, Zijian Wang, Xiaopeng Li, Yuchen Tian, Ming Tan, Wasi Uddin Ahmad, Shiqi Wang, Qing Sun, Mingyue Shang, Sujan Kumar Gonugondla, et al.

ICLR 2023
2022
IMPQ: Reduced Complexity Neural Networks via Granular Precision Assignment

Sujan Kumar Gonugondla, Naresh R. Shanbhag

IEEE 2022 IEEE
2022
Fundamental Limits on Energy-Delay-Accuracy of In-Memory Architectures in Inference Applications

Sujan K. Gonugondla, Charbel Sakr, Hassan Dbouk, Naresh R. Shanbhag

IEEE 2022 arXiv

For the complete list, visit Google Scholar.

In the News

2020
Celebrating Our Graduates: Sujan Gonugondla

Coordinated Science Laboratory · Allie Arp

2018
To Speed Up AI, Mix Memory and Processing

IEEE Spectrum · Katherine Bourzac

2018
7 Ideas for AI Silicon from ISSCC

EE Times · Rick Merritt

2016
Gonugondla Wins Best Paper Award at IEEE Conference

Coordinated Science Laboratory · August Schiess

Blog

Essays and notes on machine learning, systems, and research (coming soon).