In High-Performance Computing (HPC), modern I/O applications generate a growing volume of metadata operations. When multiple applications concurrently submit large volumes of these operations, the shared resources of parallel file systems often become saturated, resulting in performance degradation and I/O unfairness.

To address this problem, researchers from the BigHPC project presented PADLL at the 23rd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid 2023). PADLL is an application- and file-system-agnostic storage middleware that enables QoS control of data and metadata workflows in HPC storage systems.

The design follows Software-Defined Storage principles: data plane stages mediate and rate-limit POSIX requests submitted to the shared file system, while a control plane holistically coordinates how all workflows are handled.
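To make the rate-limiting role of a data plane stage more concrete, here is a minimal Python sketch of a token bucket, a common way to cap request rates. It is an illustration only and not PADLL's actual implementation: the class name `TokenBucket`, the `set_rate` hook (standing in for a control plane adjusting a stage), and all parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allow at most `rate` operations per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum burst size
        self.clock = clock               # injectable clock, handy for testing
        self.tokens = float(capacity)    # start with a full bucket
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last check, up to capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def set_rate(self, rate):
        # Hypothetical hook: a control plane could call this to retune the stage.
        self._refill()  # account for elapsed time at the old rate first
        self.rate = float(rate)

    def try_acquire(self, cost=1.0):
        # Return True if the request may proceed now, False if it must wait.
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

In such a design, a stage would call `try_acquire()` before forwarding each intercepted metadata request (for example an `open` or `stat` call) and delay the request when it returns `False`.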

In the paper “Taming Metadata-intensive HPC Jobs Through Dynamic, Application-agnostic QoS Control”, the researchers demonstrated the performance and feasibility of the solution under multiple QoS policies, using synthetic benchmarks, real-world applications, and traces collected from a production file system.

According to Ricardo Macedo, a researcher at INESC TEC, “one significant contribution of the paper is the introduction of a new algorithm called PSFA (Proportional Sharing without False Resource Allocation). This algorithm addresses the challenge of achieving differentiated Quality of Service (QoS) across multiple jobs, while avoiding resource under- and over-provisioning in the presence of volatile workloads.” During the conference, Mariana Miranda, also from INESC TEC, participated in the Early Career and Students’ Showcase, where she presented her ongoing Ph.D. work, which is closely aligned with the objectives of the BigHPC project.
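The details of PSFA are not given here, so the following Python sketch does not reproduce it. It only illustrates the general goal described in the quote above, using a generic demand-capped weighted sharing scheme: each job receives a share proportional to its weight, but never more than it currently demands, and any leftover is redistributed to the jobs that can still use it. The function name, the `demands` and `weights` dictionaries, and the units (operations per second) are assumptions made for illustration.

```python
def proportional_share(capacity, demands, weights):
    """Split `capacity` by `weights`, never giving a job more than its demand.

    Generic illustration only; this is not the PSFA algorithm from the paper.
    Weights are assumed to be strictly positive.
    """
    alloc = {job: 0.0 for job in demands}
    active = {job for job in demands if demands[job] > 0}
    remaining = float(capacity)

    while active and remaining > 1e-9:
        total_weight = sum(weights[j] for j in active)
        # Tentative weighted share of what is still unallocated.
        share = {j: remaining * weights[j] / total_weight for j in active}
        # Jobs whose demand fits within their share are fully satisfied.
        satisfied = {j for j in active if demands[j] <= share[j]}
        if not satisfied:
            # Every remaining job wants more than its share: hand out shares.
            for j in active:
                alloc[j] = share[j]
            break
        for j in satisfied:
            alloc[j] = float(demands[j])
            remaining -= demands[j]
        # The unused part of the satisfied jobs' shares is split next round.
        active -= satisfied

    return alloc
```

For instance, with a capacity of 100 operations per second, equal weights, and demands of 10, 80, and 80, the first job receives its full 10 and the other two receive 45 each, so no capacity is reserved for a job that cannot use it.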

The 23rd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid 2023) took place from May 1 to 4, 2023, in Bangalore, India. CCGrid is an annual conference that focuses on various aspects of distributed systems, including computing clusters, high-performance computing, cloud computing, and emerging Internet computing paradigms.

It serves as a platform for researchers, practitioners, and industry professionals to share and discuss the latest advancements, research findings, and innovative ideas in these areas.