Files In Memory: New RAM based I/O Layer in MOLCAS

Victor P. Vysotskiy, Valera Veryazov
Theoretical Chemistry, Chemical Center, P.O.B. 124, Lund 22100, Sweden, E-mail: Victor.Vysotskiy@teokem.lu.se
• CASPT2 I/O
• Files In Memory
• Benchmarks
The multiconfigurational second-order perturbation method CASPT2 is known as a reliable computational tool for electronic structure
calculations. The original CASPT2 code in MOLCAS was developed in the early 1990s [1]. Development of the code has focused mainly
on algorithmic improvements; for example, recent work allows the use of a RASSCF reference wavefunction [2] and of the Cholesky
Decomposition method [3]. Changes in hardware architecture have received much less attention. It is well known that the speed of a typical
CASPT2 calculation is limited by storing and reading data, i.e. it is an I/O-bound problem. Among the CASPT2 scratch files, only the
two-electron integral or Cholesky vector files are read sequentially, several times, while the remaining files are accessed constantly and
randomly. In other words, the CASPT2 I/O workload is dominated by random write and read operations, which is the worst-case scenario
for a conventional HDD because of the high latency of spinning hard disks. One might expect the caching mechanism of the underlying
filesystem (FS) to improve overall I/O performance as long as there are no large files and the available memory is sufficient to buffer all
the needed data. However, the caching mechanism is not selective: it tries to buffer all opened/accessed files simultaneously and uniformly,
regardless of their sizes and I/O access patterns. Generally speaking, without any assumptions about a particular FS and its caching
mechanism, the best possible performance of the CASPT2 module can be obtained only by using an electronic data storage device with the
lowest available latency and the best random I/O performance, e.g. Random Access Memory (RAM) or a Solid State Drive (SSD).
Although most computers nowadays are equipped with a large amount of memory, neither the
CASPT2 code itself nor the operating system, through I/O caching, can use this memory efficiently.
In order to use RAM directly for I/O we have developed a new framework called “Files in
Memory” (FiM). The key idea of FiM is to keep a scratch file entirely in RAM instead of on an
HDD/SSD. In sharp contrast to FS caching, FiM gives explicit and transparent control over
which data are housed in RAM. By design, FiM can place data in Sys V shared memory
segments, so the data can be shared between several MPI processes running on the same node
at no extra message-passing cost.
Unlike the memory-resident I/O layer of CRAY FFIO [4], FiM is a general framework and
can be used on any POSIX-compliant operating system such as Linux, AIX, Windows, Solaris.
The beauty of FiM is that it is easy to use for both the MOLCAS end user and the developer: there
is no need to change source code, one just needs to edit an external resource file! In addition, FiM
provides environment variables that control the execution of MOLCAS and automatic (dynamic)
switching between I/O layers at runtime.
I/O operations in FiM: (read, write) = memcpy; ~instant seek time.
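The idea that reads and writes reduce to memory copies and that a seek costs almost nothing can be illustrated with a minimal sketch. This is not the FiM implementation: the class name MemFile and its methods are hypothetical, and a plain Python bytearray stands in for the RAM region (FiM itself can place data in Sys V shared memory segments, which this sketch does not attempt).

```python
class MemFile:
    """Hypothetical in-memory scratch file: a byte buffer plus a file position."""

    def __init__(self, capacity=0):
        # The whole "file" lives in RAM (assumption: a bytearray models it).
        self._buf = bytearray(capacity)
        self._pos = 0

    def seek(self, offset):
        # Seeking is an integer assignment; there is no device latency.
        if offset < 0:
            raise ValueError("negative offset")
        self._pos = offset
        return self._pos

    def write(self, data):
        end = self._pos + len(data)
        if end > len(self._buf):
            # Grow the buffer with zero bytes so random writes past the end work.
            self._buf.extend(bytes(end - len(self._buf)))
        # Slice assignment copies the bytes into place (the memcpy analogue).
        self._buf[self._pos:end] = data
        self._pos = end
        return len(data)

    def read(self, nbytes):
        # Copy out at most nbytes starting at the current position.
        chunk = bytes(self._buf[self._pos:self._pos + nbytes])
        self._pos += len(chunk)
        return chunk


# Random-access usage: write a record at offset 8, then read it back.
scratch = MemFile()
scratch.seek(8)
scratch.write(b"CASPT2")
scratch.seek(8)
record = scratch.read(6)
```

Because every operation is a buffer copy or an offset update, the cost of a random access in such a layer does not depend on where in the file the record sits, which is the property the random CASPT2 access pattern benefits from.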
Hardware configuration:
- 2-way Intel Xeon CPU E5630 (2.53 GHz);
- 48 GB of DDR3 (1066 MHz) RAM;
- 2 Intel SSDSC2MH250A2 250 GB SSDs attached to the RocketRaid 62x SATA RAID 6Gb/s Controller (RAID1);
- 1 HDD WDC WD10EURS-630AB1 SATA II 1000 GB.
The ext3 FS was installed on all storage devices, and the disks were mounted with the “noatime”
option. The Lustre FS was tuned with the “lfs -c 1 -s 1m” command.
Several typical CASPT2/RASPT2 jobs were selected for I/O benchmarking. In addition, the
benchmark set was extended by adding one MCLR test.
The “NO FS_CACHING” results were obtained by using only 4 GB of RAM (the remaining 44 GB of RAM
were physically removed prior to benchmarking).
Molecules shown (figure labels): C9H7NO4 (A), C10H14O4S (B), C16H12N2 (C), C21H32N (D).
• Results
- FiM provides the best I/O performance;
- CASPT2: SSD outperforms HDD ~1.1-1.6x; MCLR: SSD outperforms HDD >10x;
- FiM over Lustre FS provides virtually the same performance as a local HDD;
- With FiM it is now possible to run MOLCAS on a diskless HPC node/workstation without
performance penalty;
- In the case of HDD, FS caching remarkably improves I/O throughput, by a factor of 2;
- FiM is a useful and powerful tool for data analysis and debugging;
- FiM can help make more efficient use of the shared memory on SMP nodes, thus mitigating
the need for explicit intra-node communication.
[Figure: Relative CASPT2/MCLR Timings (%). Legend: FiM; SSD (FS_CACHING); HDD (FS_CACHING); SSD (NO FS_CACHING); HDD (NO FS_CACHING); Lustre PFS]
http://www.molcas.org
• References
[1] K. Andersson, P.-Å. Malmqvist, B. O. Roos, A. J. Sadlej, K. Wolinski, J. Phys. Chem. 94, 5483-5488 (1990).
[2] P.-Å. Malmqvist, K. Pierloot, A. R. Moughal Shahi, C. J. Cramer, L. Gagliardi, J. Chem. Phys. 128, 204109(1-10) (2008).
[3] F. Aquilante, L. De Vico, N. Ferré, G. Ghigo, P.-Å. Malmqvist, P. Neogrády, T. B. Pedersen, M. Pitoňák, M. Reiher, B. O. Roos, L. Serrano-Andrés, M. Urban, V. Veryazov, R. Lindh, J. Comput. Chem. 31, 224-247 (2010).
[4] Cray T3E™ Fortran Optimization Guide - 004-2518-002, Chapter 5. Input/Output.