Benchmarks have demonstrated scalability to 100,000 cores and beyond, but that scalability can be severely limited once large-scale output of results is included. This is an important issue for productive scientific work at this scale. Portability between machines that use different parallel filesystems and I/O methods can also be a problem.

This work package addressed the I/O performance of applications on parallel filesystems, using flexible I/O middleware libraries and/or tuning of filesystem-specific parameters. Parallel I/O commonly suffers from two kinds of contention. External contention, caused by I/O from other users of shared resources, cannot be avoided. Internal contention, caused by suboptimal use of resources (network, locking, disks), should be avoided to achieve scalable parallel I/O.
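As an illustration of internal contention through locking, the following minimal sketch counts how many filesystem blocks would be written by more than one rank when all ranks write contiguous regions of one shared file. It assumes a filesystem that locks at block granularity; the function names, the 4096-byte block size and the per-rank sizes are hypothetical and chosen only for the example. It models the layout only and performs no I/O.

```python
from collections import Counter


def rank_offsets(sizes, block_size, aligned):
    """Return the starting byte offset of each rank's region in a shared file."""
    offsets, pos = [], 0
    for size in sizes:
        offsets.append(pos)
        pos += size
        if aligned:
            # Round up to the next block boundary so the next rank
            # starts in a block of its own.
            pos = -(-pos // block_size) * block_size
    return offsets


def contended_blocks(sizes, block_size, aligned):
    """Count blocks touched by more than one rank (hypothetical lock conflicts)."""
    offsets = rank_offsets(sizes, block_size, aligned)
    touched = Counter()
    for off, size in zip(offsets, sizes):
        if size == 0:
            continue
        first = off // block_size
        last = (off + size - 1) // block_size
        for block in range(first, last + 1):
            touched[block] += 1
    return sum(1 for n in touched.values() if n > 1)


sizes = [3000, 3000, 3000]  # bytes written per rank (hypothetical)
print("packed :", contended_blocks(sizes, 4096, aligned=False))
print("aligned:", contended_blocks(sizes, 4096, aligned=True))
```

Padding each rank's region to a block boundary trades some wasted space for regions that never share a block, which is one way an application or middleware layer can reduce this kind of contention.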

Work was also done on leveraging I/O middleware libraries for flexible and efficient use of I/O methods on parallel filesystems (e.g. POSIX-IO or MPI-IO; writing to a single shared file, to multiple shared files, or to a unique file per rank). In addition, applications were optimised for particular parallel filesystems (GPFS, Lustre, PanFS) by implementing filesystem-specific functions.
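The difference between the file-per-rank and single-shared-file layouts mentioned above can be sketched with plain POSIX calls. This is a simplified, serial illustration: the "ranks" are simulated by a loop in one process, and the function names and file naming scheme are hypothetical. In a real MPI program each rank would run concurrently and would typically obtain its offset from an exclusive prefix sum of the payload sizes (for example with MPI_Exscan).

```python
import os
import tempfile


def write_file_per_rank(directory, payloads):
    """Each simulated rank writes its own file (rank-00000.dat, ...)."""
    paths = []
    for rank, data in enumerate(payloads):
        path = os.path.join(directory, f"rank-{rank:05d}.dat")
        with open(path, "wb") as f:
            f.write(data)
        paths.append(path)
    return paths


def write_shared_file(path, payloads):
    """All simulated ranks write into one file at disjoint offsets."""
    # Exclusive prefix sum of the payload sizes gives each rank's offset.
    offsets, pos = [], 0
    for data in payloads:
        offsets.append(pos)
        pos += len(data)
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
    try:
        for off, data in zip(offsets, payloads):
            # Positional write: no shared file pointer is involved.
            os.pwrite(fd, data, off)
    finally:
        os.close(fd)
    return offsets


with tempfile.TemporaryDirectory() as tmp:
    payloads = [b"aa", b"bbb", b"c"]  # hypothetical per-rank data
    print(len(write_file_per_rank(tmp, payloads)), "files written")
    print("shared-file offsets:", write_shared_file(os.path.join(tmp, "shared.dat"), payloads))
```

File-per-rank avoids coordination between ranks but produces one file per process, while the shared-file layout keeps the file count constant at the cost of having to agree on offsets.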

We have assessed two well-established parallel I/O libraries:

  • ADIOS (Adaptable IO System) from Oak Ridge National Laboratory offers a flexible method for parallel I/O: generic read/write calls are combined with a user-configurable XML file that switches the underlying API (POSIX, MPI-IO, HDF5, etc.). I/O staging, aggregation and compression layers are also available.
  • SIONlib (Scalable parallel I/O) from Forschungszentrum Jülich is a library for handling task-local I/O efficiently.

A session on this work was presented at a recent ARCHER/PRACE I/O training course.

  • Parallel I/O Middleware, V. Szeremi, ARCHER/PRACE I/O training course on 2nd-3rd Sept. 2014 at STFC Daresbury Laboratory (slides).