Parallel IO NHR Workshop

R034 (Bundesstraße 45a)


Anja Gerbes (TU Dresden), Anna Fuchs (Universität Hamburg), Jannek Squar, Panos Adamidis

Climate science is of great societal relevance. Resolving small-scale physical processes helps reduce the uncertainties introduced by parameterisations, thus improving climate change projections. The objective is to run a coupled atmosphere-ocean setup at a global resolution of 1 km with a throughput of 1 simulated year per day (SYPD). Such simulations require computational power available only on exascale supercomputers.

The output data is on the order of petabytes, and achieving the desired performance requires efficient I/O. The goal of this workshop is to highlight current and future methods for parallel I/O on large parallel file systems, from the application level to the system level. Possible topics of interest include:

  • Lossless compression and chunking
  • Selection of appropriate data formats for I/O: HDF5, NetCDF, Zarr, etc.
  • Optimization of I/O for climate models: application, middleware, file system
  • Post-processing of large data sets (reading large amounts of data)
  • Monitoring: effects of applications on the file system versus effects of the file system on the application
  • Comparison between file systems and object stores
  • Key metric: time-to-solution