Mr. Papon Choonhaklai presented his research at SWoPP 2025

Mr. Papon Choonhaklai from our laboratory presented the following paper at the Summer United Workshops on Parallel, Distributed and Cooperative Processing (SWoPP 2025), held in Japan from August 4 to 6, 2025:

Papon Choonhaklai, Kohei Ichikawa, Hajimu Iida: A Metric-Driven Kubernetes Operator for Dynamic MPS-Based GPU Sharing in Inference Workloads, IPSJ SIG Technical Report, Vol. 2025-HPC-200, No. 23, pp. 1-8, Jul. 2025.

SWoPP is a domestic conference that brings together multiple workshops in the field of parallel, distributed, and cooperative processing.

In this study, we proposed a metric-driven Kubernetes operator that enables efficient GPU sharing using NVIDIA’s Multi-Process Service (MPS). Unlike conventional methods relying on static partitions or custom resource definitions (CRDs), our operator leverages runtime metrics collected from the DCGM exporter to dynamically schedule inference workloads.
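
As a rough illustration of what "metric-driven" placement can look like, the sketch below picks a GPU for a new inference pod from recent utilization and memory samples. It is not the operator described in the paper: the `GpuSample` type, the `pick_gpu` function, the 80% utilization threshold, and the "prefer the busiest GPU that still fits" policy are all hypothetical choices made for this example. The only names taken from real tooling are the DCGM exporter metrics `DCGM_FI_DEV_GPU_UTIL` and `DCGM_FI_DEV_FB_USED`, which are mentioned in comments as plausible sources for the sample fields.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GpuSample:
    """One recent metrics sample for a GPU (hypothetical structure)."""
    gpu_id: str
    gpu_util_pct: float   # could come from DCGM_FI_DEV_GPU_UTIL
    fb_used_mib: float    # could come from DCGM_FI_DEV_FB_USED
    fb_total_mib: float   # total framebuffer memory, assumed known per device


def pick_gpu(samples: List[GpuSample],
             required_mib: float,
             max_util_pct: float = 80.0) -> Optional[str]:
    """Return the id of a GPU that can host one more MPS client, or None.

    Assumed policy (not from the paper): keep only GPUs below the
    utilization threshold with enough free memory, then prefer the
    busiest one so that workloads are packed onto shared GPUs.
    """
    candidates = [
        s for s in samples
        if s.gpu_util_pct < max_util_pct
        and (s.fb_total_mib - s.fb_used_mib) >= required_mib
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s.gpu_util_pct).gpu_id


# Example samples with made-up values, for illustration only.
samples = [
    GpuSample("gpu-0", gpu_util_pct=30.0, fb_used_mib=10000, fb_total_mib=16000),
    GpuSample("gpu-1", gpu_util_pct=60.0, fb_used_mib=4000, fb_total_mib=16000),
    GpuSample("gpu-2", gpu_util_pct=95.0, fb_used_mib=1000, fb_total_mib=16000),
]
chosen = pick_gpu(samples, required_mib=5000)
```

In contrast to a static partition, a decision like this is re-evaluated against fresh metrics each time a workload arrives, so the placement can change as the load on each GPU changes.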

Empirical evaluation on LLM inference workloads demonstrated that our approach achieved 89% GPU utilization, 60% memory utilization, the highest throughput (8.4 req/s), and the lowest average response time (34,183 ms), significantly outperforming NVIDIA time-slicing and standalone NOS methods.

Kohei Ichikawa
Visiting Professor
Papon Choonhaklai (Can)
Master’s Student

Master’s student in Information Science at the Nara Institute of Science and Technology, Japan