Dirac MP2 parallelization


[Introduction][Structure and implementation][MPI structure][MPE toolkit]
[Profiling and results][Final implementation][Conclusions]

Introduction

Project

As part of NOTUR Advanced User Support, a project has been set up to parallelize the direct MP2 module of the relativistic quantum chemical program Dirac.


Original implementation

Prior to this project, a simple parallel implementation of the MP2 module was already in place. Its main aim was to enable MP2 energy calculations on large systems that would otherwise be impossible due to memory constraints. The performance of the original code for a small calculation on the krypton atom is shown below.

MP2 calculation on Kr atom, run on magnum, 80 MW memory
# CPUs   Wall time (s)   Total CPU time (s)   Min slave CPU (s)   Max slave CPU (s)
     1            1257                1227                   -                   -
     2            1276                1325                1216                1216
     3            1198                2033                 796                1142
     4             735                2049                 623                 668
     5            1020                3058                 564                 921
     6             940                3051                 519                 874
     7             558                3067                 494                 501
     8            1237                6264                 832                1177
     9             913                6340                 475                 830
    10             886                6422                 464                 826
    11             882                6395                 454                 810
    12             883                6494                 453                 808
    13             867                6428                 436                 786
    14             870                6489                 435                 787
    15             850                6517                 430                 777
    16             528                6697                 434                 450

As can be seen from the table above, the parallel behaviour is less than ideal. The wall time is reduced only moderately, even when a large number of CPUs is used, whereas the total CPU time increases quite dramatically and in a stepwise manner. Another issue is the balancing of the load between the slaves. The load is well-balanced only for certain numbers of CPUs (n = 4, 7 and 16).
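The scaling figures can be made concrete by expressing the table as parallel speedup (T1/Tn) and efficiency (speedup divided by the number of CPUs). A minimal Python sketch, using a few wall times copied from the table above:

```python
# Wall times (s) taken from the table above, keyed by number of CPUs.
wall = {1: 1257, 4: 735, 7: 558, 16: 528}

def speedup(n):
    """Parallel speedup relative to the serial (1-CPU) run: T_1 / T_n."""
    return wall[1] / wall[n]

def efficiency(n):
    """Fraction of the ideal n-fold speedup that is actually achieved."""
    return speedup(n) / n

for n in (4, 7, 16):
    print(f"{n:2d} CPUs: speedup {speedup(n):.2f}, efficiency {efficiency(n):.0%}")
```

Even at 16 CPUs the speedup is only about 2.4, i.e. below 15% efficiency, which quantifies how far the original code is from ideal scaling.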

Aim

The two main goals for the new parallel implementation follow directly from the observations above:

- Improve the parallel scaling, so that adding CPUs reduces the wall time consistently rather than erratically, and without the dramatic growth in total CPU time.
- Balance the load evenly between the slaves for any number of CPUs, not just for favourable values of n.


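One standard route to both goals is to replace a static division of the work by dynamic (self-scheduling) task distribution, in which each slave requests a new batch of work as soon as it finishes the previous one. The sketch below is purely illustrative — the task costs and counts are invented, not Dirac's actual integral batches — but it shows why a static distribution balances well only for favourable CPU counts, while dynamic scheduling stays balanced:

```python
def static_loads(costs, n_workers):
    """Static round-robin distribution: task k always goes to worker k mod n."""
    loads = [0] * n_workers
    for k, c in enumerate(costs):
        loads[k % n_workers] += c
    return loads

def dynamic_loads(costs, n_workers):
    """Idealized self-scheduling: each task goes to the worker that
    becomes free first (the one with the smallest accumulated time)."""
    loads = [0] * n_workers
    for c in costs:
        i = loads.index(min(loads))
        loads[i] += c
    return loads

# Invented example: 20 tasks of unequal cost, handed out largest-first.
costs = list(range(20, 0, -1))
for name, loads in (("static", static_loads(costs, 6)),
                    ("dynamic", dynamic_loads(costs, 6))):
    print(f"{name:7s}: max {max(loads)}, min {min(loads)}")
```

With 6 workers the static split spreads the same total work between 27 and 44 cost units per worker, while the dynamic variant keeps every worker within a few units of the mean, independent of whether the worker count happens to divide the task count.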

Last updated: September 20th 2003, by Vebjørn Bakken