Start: July 2020
End: August 2022
Goal: This activity will implement the capability to monitor a large-scale, heterogeneous infrastructure for HPC and Big Data applications. The task will design and implement a novel distributed monitoring framework, based on a hierarchical architecture, that can efficiently monitor the resource usage of thousands of nodes without imposing significant overhead on the deployed applications. The solution will be suitable for the software and hardware stacks typically found in both HPC and Big Data workloads and will provide real-time analysis and visualization of the cluster environment. It will also enable the long-term storage of monitoring information for historical analysis; combined with real-time information, this historical data will be crucial for the SDS and Virtualization Managers to make the most appropriate management decisions.