Publications

2025

ASPLOS 2025
ImitationGame: Playing Tournaments for Tuning Applications in Noisy Cloud Environments
Rohan Basu Roy, Vijay Gadepally, Devesh Tiwari

2024

SoCC 2024
The Hidden Carbon Footprint of Serverless Computing
Rohan Basu Roy, Raghavendra Kanakagiri, Yankai Jiang, Devesh Tiwari

EMNLP 2024 [Paper]
Sprout: Green Generative AI with Carbon-Efficient LLM Inference
Baolin Li, Yankai Jiang, Vijay Gadepally, Devesh Tiwari

HPEC 2024
LLM Inference Serving: Survey of Recent Advances and Opportunities
Baolin Li, Yankai Jiang, Vijay Gadepally, Devesh Tiwari

SC 2024 (a)
LexiQL: Quantum Natural Language Processing on NISQ-era Machines
Daniel Silver, Aditya Ranjan, Rakesh Achutha, Tirthak Patel, Devesh Tiwari

SC 2024 (b)
ECO-LIFE: High-Performance and Carbon-Aware Serverless Workloads Scheduling via Multi-generation Hardware
Yankai Jiang, Rohan Basu Roy, Baolin Li, Devesh Tiwari

SC 2024 (c)
Incentive-Based Power Efficiency Mechanisms on the Fugaku Supercomputer
Ana Luisa Veroneze Solorzano, Kento Sato, Keiji Yamamoto, Jim Brandt, Benjamin Schwaller, Sara Petra Walton, Jennifer Green, Fumiyoshi Shoji, Devesh Tiwari

SC 2024 (d)
Stellaris: Staleness-aware Distributed Reinforcement Learning with Serverless Computing
Hanfei Yu, Hao Wang, Jian Li, Seung-Jong Park, Devesh Tiwari

IPDPS 2024 [Paper] [Artifact] [Presentation]
Interpretable Analysis of Production GPU Clusters Monitoring Data via Association Rule Mining
Baolin Li, Siddharth Samsi, Vijay Gadepally, Devesh Tiwari

SIGMETRICS 2024 [Paper] [Artifact]
StarShip: Mitigating I/O Bottlenecks in Serverless Computing for Scientific Workflows
Rohan Basu Roy, Devesh Tiwari

ASPLOS 2024 (a) [Paper] [Artifact]
CodeCrunch: Improving Serverless Performance via Function Compression and Cost-Aware Warmup Location Optimization
Rohan Basu Roy, Tirthak Patel, Rohan Garg, Devesh Tiwari

ASPLOS 2024 (b) [Paper] [Artifact]
RainbowCake: Mitigating Cold-starts in Serverless with Layer-wise Container Caching and Sharing
Hanfei Yu, Rohan Basu Roy, Christian Fontenot, Devesh Tiwari, Jian Li, Hong Zhang, Hao Wang, Seung-Jong Park

HotCarbon Workshop
Carbon in Motion: Characterizing Open-Sora on the Sustainability of Generative AI for Video Generation
Baolin Li, Yankai Jiang, Devesh Tiwari

FGCS
The Globus Compute Dataset: An Open Function-as-a-Service Dataset from the Edge to the Cloud
André Bauer, Haochen Pan, Ryan Chard, Yadu N. Babuji, Josh Bryan, Devesh Tiwari, Ian T. Foster, Kyle Chard

2023

SC 2023 (a) [Paper] [Artifact]
GRAPHINE: Generating Application-Specific Neutral Atom Topologies for Improved Quantum Computing Performance
Tirthak Patel, Daniel Silver, Devesh Tiwari

SC 2023 (b) [Paper] [Artifact] [Presentation]
Clover: Toward Sustainable AI with Carbon-Aware Machine Learning Inference Service
Baolin Li, Siddharth Samsi, Vijay Gadepally, Devesh Tiwari

SC 2023 (c)
Comprehensive Experimental Evaluation and Analysis of a Universal Photonic Quantum Computer
Aditya Ranjan, Tirthak Patel, Harshitta Gandhi, Daniel Silver, William Cutler, Devesh Tiwari

SC 2023 (d)
Sustainable HPC: Modeling, Characterization, and Implications of Carbon Footprint in Modern HPC Systems
Baolin Li, Rohan Basu Roy, Daniel Wong, Sid Samsi, Vijay Gadepally, Devesh Tiwari

AAAI 2023
SLIQ: Resource-Efficient Quantum Similarity Networks for Unlabeled Data on Noisy Quantum Computers
Daniel Silver, Tirthak Patel, Aditya Ranjan, Harshitta Gandhi, William Cutler, Devesh Tiwari

HPDC 2023 (a)
Kairos: Building Cost-Efficient Machine Learning Inference Systems with Heterogeneous Cloud Resources
Baolin Li, Sid Samsi, Vijay Gadepally, Devesh Tiwari

HPDC 2023 (b)
ProPack: Executing Concurrent Serverless Functions Faster and Cheaper
Rohan Basu Roy, Tirthak Patel, Richmond Liew, Yadu Nand Babuji, Ryan Chard, Devesh Tiwari

ICCV 2023
MosaiQ: Enabling High-Quality Image Generation on Quantum Computers
Daniel Silver, Tirthak Patel, William Cutler, Aditya Ranjan, Harshitta Gandhi, Devesh Tiwari

SoCC 2023
Sustainable Supercomputing for AI: GPU Power Capping at HPC Scale
Dan Zhao, Siddharth Samsi, Joseph McDonald, Baolin Li, David Bestor, Michael Jones, Devesh Tiwari, Vijay Gadepally

MICRO 2023
SupeRBNN: Randomized Binary Neural Network Using Adiabatic Superconductor Josephson Devices
Zhengang Li, Geng Yuan, Tomoharu Yamauchi, Masoud Zabihi, Yanyue Xie, Peiyan Dong, Xulong Tang, Nobuyuki Yoshikawa, Devesh Tiwari, Yanzhi Wang, Olivia Chen

DAC 2023
Invited: Building Robust Quantum System Software for Technology-Specific Characteristics
Tirthak Patel, Devesh Tiwari

HPEC 2023
From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference
Siddharth Samsi, Dan Zhao, Joseph McDonald, Baolin Li, Adam Michaleas, Michael Jones, William Bergeron, Jeremy Kepner, Devesh Tiwari, Vijay Gadepally

arXiv
Toward Privacy in Quantum Program Execution On Untrusted Quantum Cloud Computing Machines for Business-sensitive Quantum Needs
Tirthak Patel, Daniel Silver, Aditya Ranjan, Harshitta Gandhi, William Cutler, Devesh Tiwari

2022

SC 2022 (a)
DayDream: Executing Dynamic Scientific Workflows on Serverless Platforms with Hot Starts

SC 2022 (b)
CHARTER: Identifying the Most-Critical Gate Operations in Quantum Circuits via Amplified Gate Reversibility

AAAI 2022
QUILT: Effective Multi-Class Classification on Quantum Computers Using an Ensemble of Diverse Quantum Classifiers

ASPLOS 2022 (a)
IceBreaker: Warming Serverless Functions Better with Heterogeneity

ASPLOS 2022 (b)
QUEST: Systematically Approximating Quantum Circuits for Higher Output Fidelity

ISCA 2022
Geyser: A Compilation Framework for Quantum Computing with Neutral Atoms

HPCA 2022
AI-Enabling Workloads on Large-Scale GPU-Accelerated System: Characterization, Opportunities, and Implications

PPoPP 2022
Mashup: Making Serverless Computing Useful for HPC Workflows via Hybrid Execution

SoCC 2022
MISO: Exploiting Multi-Instance GPU Capability on Multi-Tenant GPU Clusters

NAACL 2022
Great Power, Great Responsibility: Recommendations for Reducing Energy for Training Language Models

DATE 2022 (a)
OPTIC: A Practical Quantum Binary Classifier for Near-Term Quantum Computers

DATE 2022 (b)
Do Temperature and Humidity Exposures Hurt or Benefit Your SSDs?

2021

SC 2021 (a)
Systematically Inferring I/O Performance Variability by Examining Repetitive Job Behavior

SC 2021 (b)
Ribbon: Cost-Effective and QoS-Aware Deep Learning Model Inference using a Diverse Pool of Cloud Computing Instances

ISCA 2021
SATORI: Efficient and Fair Resource Partitioning by Sacrificing Short-Term Benefits for Long-Term Gains

HPCA 2021
Operating Liquid-Cooled Large-Scale Systems: Long-Term Monitoring, Reliability Analysis, and Efficiency Measures

DSN 2021
Examining Failures and Repairs on Supercomputers with Multi-GPU Compute Nodes

PLDI 2021
BLISS: Auto-tuning Complex Applications Using A Pool of Diverse Lightweight Learning Models

IISWC 2021
Serverless Storage Scalability Challenges: Characterization, Implications, and Mitigation

HPEC 2021
Serving Machine Learning Inference Using Heterogeneous Hardware

ASPLOS 2021
QRAFT: Reverse Your Quantum Circuit and Know the Correct Program Output

2020

USENIX ATC 2020
UREQA: Leveraging Operation-Aware Error Rates for Effective Quantum Circuit Mapping on NISQ-Era Quantum Computers

USENIX FAST 2020 (a)
GIFT: A Coupon Based Throttle-and-Reward Mechanism for Fair and Efficient I/O Bandwidth Management on Parallel Storage Systems

USENIX FAST 2020 (b)
Uncovering Access, Reuse, and Sharing Characteristics of I/O-Intensive Files on Large-Scale Production HPC Systems

USENIX FAST 2020 (c)
Making Disk Failure Predictions SMARTer!

SC 2020 (a)
VERITAS: Accurately Estimating the Correct Output on Noisy Intermediate-Scale Quantum Computers

SC 2020 (b)
Experimental Evaluation of NISQ Quantum Computers: Error Measurement, Characterization, and Implications

SC 2020 (c)
Job Characteristics on Large-Scale Systems: Long-Term Analysis, Quantification and Implications

ICCAD 2020
DisQ: A Novel Quantum Output State Classification Method on IBM Quantum Computers using OpenPulse

HPCA 2020
CLITE: Efficient and QoS-Aware Co-Location of Multiple Latency-Critical Jobs for Warehouse Scale Computers

IPDPS 2020
What does the Power Consumption Behavior of HPC Jobs Reveal?

JSNAM 2020
Resilience and Coevolution of Preferential Interdependent Networks

JMR 2020
Comparing Performances of Five Distinct Automatic Classifiers for Fin Whale Vocalizations in Beamformed Spectrograms of Coherent Hydrophone Array

TDSC 2020
Characterizing and Exploiting Soft Error Vulnerability Phase Behavior in GPU Applications

2019

TPDS 2019
An Analysis Workflow-Aware Storage System for Multi-Core Active Flash Arrays

SC 2019
Revisiting I/O Behavior in Large-Scale Storage Systems: The Expected and the Unexpected

HPDC 2019
PERQ: Fair and Efficient Power Management of Power-Constrained Large-Scale Computing Systems

DAC 2019
What Does Vibration Do To Your SSD?

CLOUD 2019
Exploring Potential for Non-Disruptive Vertical Auto Scaling and Resource Estimation in Kubernetes

ICAC 2019
Characterizing Disk Health Degradation and Proactively Protecting Against Disk Failures for Reliable Storage Systems

CCGrid 2019
Towards Enabling Dynamic Resource Estimation and Correction for Improving Utilization in an Apache Mesos Cloud Environment

DATE 2019
PCFI: Program Counter Guided Fault Injection for Accelerating GPU Reliability Assessment

2018

BIGDATA 2018
Reliability Characterization of Solid State Drives in a Scalable Production Datacenter

ASONAM 2018
Resilience and the Coevolution of Interdependent Multiplex Networks

ICCCN 2018
Exploring the Optimal Platform Configuration for Power-Constrained HPC Workflows

DSN 2018 (a)
Shiraz: Exploiting System Reliability and Application Resilience Characteristics to Improve Large Scale System Throughput

DSN 2018 (b)
Machine Learning Models for GPU Error Prediction in a Large Scale HPC System

DSN 2018 (c)
Understanding and Analyzing Interconnect Errors and Network Congestion on a Large Scale HPC System

2017

SC 2017 (a)
Failures in Large Scale Systems: Long-Term Measurement, Analysis, and Implications

SC 2017 (b)
GUIDE: A Scalable Information Directory Service to Collect, Federate, and Analyze Logs for Operational Insights into a Leadership HPC Facility

MASCOTS 2017 (a)
Toward Managing HPC Burst Buffers Effectively: Draining Strategy to Regulate Bursty I/O Behavior

MASCOTS 2017 (b)
Characterizing Temperature, Power, and Soft-error Behaviors in Data Center Systems: Insights, Challenges, and Opportunities

CLUSTER 2017
Effective Running of End-to-end HPC Workflows on Emerging Heterogeneous Architectures

MWSCAS 2017
Combining Architectural Fault-injection and Neutron Beam Testing Approaches Toward Better Understanding of GPU Soft-error Resilience

TECS 2017
Compiler-directed Soft Error Detection and Recovery to Avoid DUE and SDC via Tail-DMR

TOMPECS 2017
Obtaining and Managing Answer Quality for Online Data-intensive Services

2016

SC 2016 (a)
Granularity and the Cost of Error Recovery in Resilient AMR Scientific Applications

SC 2016 (b)
Compiler Directed Lightweight, Fine-grained, Guaranteed Recovery for Soft Error Resilience. (Best Student Paper Award Finalist)

MICRO 2016
Low-Cost Soft Error Resilience with Unified Data Verification and Fine-Grained Recovery for Acoustic Sensor Based Detection

ICAC 2016
Adaptive Power Profiling for Many-Core HPC Architectures

DSN 2016
Power-aware Checkpointing: Toward the Optimal Checkpointing Interval under Power Capping

IPDPS 2016
Reducing Waste in Large Scale Systems Through Introspective Analysis

HPCA 2016
A Large-Scale Study of Soft-Errors on GPUs in the Field

2015

SC 2015 (a)
Reliability Lessons Learned From GPU Experience With The Titan Supercomputer at Oak Ridge Leadership Computing Facility

SC 2015 (b)
A Practical Approach to Reconciling Availability, Performance, and Capacity in Provisioning Extreme-scale Storage Systems

SC 2015 (c)
AnalyzeThis: An Analysis Workflow-Aware Storage System

SC 2015 (d)
Node Variability in Large-Scale Power Measurements: Perspectives from the Green500, Top500 and EEHPCWG

ICAC 2015
Ubora: Measuring and Managing Answer Quality for Online Data-Intensive Services

DSN 2015
Understanding and Exploiting Spatial Properties of System Failures on Extreme-Scale HPC Systems

LCTES 2015
Clover: Compiler Directed Lightweight Soft Error Resilience

HPCA 2015
Understanding GPU Errors on Large-scale HPC Systems and the Implications for System Design and Operation

CUG 2015
Experience with GPUs on the Titan Supercomputer from a Reliability, Performance and Power Perspective

JPDC 2015
Application Configuration Predication for Energy-Efficient Execution on Multicore Systems

2014

SC 2014
Best Practices and Lessons Learned from Deploying and Operating Large-Scale Data-Centric Parallel File Systems

DSN 2014
Lazy Checkpointing: Exploiting Temporal Locality in Failures to Mitigate Checkpointing Overheads on Extreme-Scale Systems

IPDPS 2014
MapReuse: Reusing Computation in an In-Memory MapReduce System

CUG 2014
I/O Router Placement and Fine-Grained Routing on Titan to Support Spider II

ICPADS 2014
Improving Large-scale Storage System Performance via Topology-aware and Balanced Data Placement

LUG 2014
SSD Provisioning for Exascale Storage System: When, Where and How much?

2013 and before

FAST 2013
Active Flash: Towards Energy-Efficient, In-Situ Data Analytics on Extreme-Scale Machines

HotPower 2012
Reducing Data Movement Cost using Energy-Efficient Active Computation on SSD

IPDPS 2012
Modeling and Analyzing Key Performance Factors of Shared Memory Map Reduce

ISPASS 2012
Architectural Characterization and Similarity Analysis of Sunspider and Google’s V8 Javascript Benchmarks

HPCA 2011
HAQu: Hardware Accelerated Queueing for Fine-Grained Threading on a Chip Multi-Processor

IPDPS 2010
MMT: Exploiting Fine-Grained Parallelism in Dynamic Memory Management

MEDEA Workshop PACT 2009
Memory Management Thread for Heap Intensive Sequential Applications

Wild and Crazy Idea Session 2009
Explicit Sequential Programming for Implicit Parallel Performance on Many Cores

Ceramics International 2009 (a)
Simulation of Thermal and Electric Field Evolution during Spark Plasma Sintering

Ceramics International 2009 (b)
Is Weibull distribution the most appropriate statistical strength distribution for brittle materials?