Benchmarking for system tenders is a complex task. The I/O component is often neglected or filled in with synthetic numbers. We explain what matters and demonstrate, through some measurements, how difficult it is to define benchmarks that measure asynchronous operations on shared resources.
In this topic, I would like to compare the changes in I/O performance of the system that we benchmark using IO500. Additionally, we want to see how ICON performs in the wider context of CLAIX, since we are currently working with DKRZ in the Green HPC project.
Current state-of-the-art and upcoming Earth System Model (ESM) simulations produce output on the order of single- to double-digit petabytes for each individual simulation spanning climatic timescales. Creating an infrastructure environment that enables purposeful analysis of such data volumes requires revamping data handling paradigms for ESM datasets. We present concepts, ideas and prototypes...
Early explorations into using an RDBMS as a data store for parallel I/O workloads led to the conclusion that the technology was ill-suited to the task. The community has accepted this “wisdom” and been reluctant to support any new efforts to investigate databases. I think it is time to revisit this conclusion.
Abstract: Research has become increasingly data-driven, putting additional pressure on the underlying storage systems. Gaining insights into their behavior is critical to understanding and optimizing I/O performance. However, existing storage systems often lack the necessary functionality and are difficult to modify and extend. Therefore, the Parallel Computing and I/O research group is ...