Container Orchestration on HPC Platforms

Over the last decade, containers have changed software development: developers can now write applications independently of the target environment by packaging them with their dependencies and environment variables. Numerous studies [1-2] have shown that containers are optimal for building and running applications reliably on Read more…

Monarch system presented at CCGrid 2022

‘Accelerating Deep Learning Training Through Transparent Storage Tiering’ is the title of the new paper presented at the 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid 2022). The paper was authored by Marco Dantas, Ricardo Macedo, and João Paulo (from INESC TEC), Peter Cui, and Weijia Xu Read more…

Maintaining Storage Health over Time

Storage is a crucial aspect of HPC clusters. Storage nodes are typically shared among multiple users and are highly utilized, so there isn’t a lot of unused space. The storage nodes also see heavy read and write traffic: data is typically written, deleted, and written again multiple times. Read more…