Calculation time degraded in a parallel case


Post by Suguru » Wed Jul 05, 2017 5:16 am

Dear all,

I have run into trouble with parallelization in ABINIT 8.4.2 after installing it on a new system. I could not find a similar topic, so I would like to ask for help here.

I am a beginner with ABINIT and first-principles packages, and I have been enjoying simple calculations with ABINIT on my desktop PC. It works well there, even in parallel. To start a relatively large calculation, I am going to use a larger computing system (16 cores with 256 GB of memory). For the installation, I followed exactly the same procedure as on my desktop PC. A brief description of the setup is as follows:

CPU: Intel Xeon E5-2687W Sandy Bridge Octa Core 3.1GHz, L3 = 20MB 150W x 2
Compiler: ifort 13.0.1 with Intel MKL
MPI: OpenMPI-1.4.5 combined with torque-2.3.7
Configure option:
../configure --enable-mpi --enable-openmp --enable-64bit-flags \
    FC=mpif90 CC=mpicc CXX=mpicxx \
    LDFLAGS="-L/opt/intel/composer_xe_2013_1.117/mkl/lib/intel64" \
    LIBS="-lmkl_blas95_lp64 -lmkl_lapack95_lp64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lm" \
    --prefix=$HOME/abinit-8.4.2

(I attach the config.log file for more details.)

There were no errors when I ran make, make check and make install. I also confirmed that simple ground-state and band-structure calculations give results consistent with those from my desktop PC. The calculation time is also of the same order.

However, when I tried to run parallel calculations, the calculation time got significantly worse. For example, a ground-state calculation on a bismuth crystal (the system I am interested in) finished in 38 sec with "abinit < input.files >& log", but it takes more than 180 sec with "mpirun -np 8 abinit < input.files >& log". I believe that this is not due to the input file, because a clear speed-up from parallelization was observed on my desktop PC with exactly the same input file (e.g. 38 sec -> 11 sec with "mpirun -np 8").
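
To put the timings quoted above into numbers, here is a minimal sketch that computes the parallel speed-up (serial time divided by parallel time) and the efficiency (speed-up divided by the number of processes). The function names speedup and efficiency are only illustrative, and 180 sec is a lower bound for the new system, so its figures are upper bounds.

```shell
#!/bin/sh
# Speed-up = serial time / parallel time.
speedup() {
    awk -v s="$1" -v p="$2" 'BEGIN { printf "%.2f\n", s / p }'
}

# Efficiency = speed-up / number of processes.
efficiency() {
    awk -v s="$1" -v p="$2" -v n="$3" 'BEGIN { printf "%.2f\n", s / (p * n) }'
}

# Timings in seconds taken from the post: 38 serial, 11 and 180 with 8 processes.
echo "desktop PC: speed-up $(speedup 38 11), efficiency $(efficiency 38 11 8)"
echo "new system: speed-up $(speedup 38 180), efficiency $(efficiency 38 180 8)"
```

On the desktop PC this gives a speed-up of about 3.45 with 8 processes, while on the new system the speed-up is at most 0.21, which means the parallel run is nearly five times slower than the sequential one.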

I have already confirmed that OpenMPI itself works well, using a simple test program that performs Gram-Schmidt orthonormalization. I do not show the details here, but the calculation time improves as the number of processes increases.

I am not sure whether this is related, but I found strange behavior when I checked the CPU usage with the "top" command. Even in a sequential run, the CPU usage was around 1,600%. This may correspond to 16 * 100%, where 16 is the maximum number of processes on this system. When I run "mpirun -np 8", eight processes at about 200% each show up. Such behavior was never observed on my desktop PC.
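
One possible reading of this "top" output, stated as an assumption and not as something confirmed in this post: the build uses --enable-openmp and links the threaded MKL layer (-lmkl_intel_thread), so each abinit process may start several OpenMP/MKL threads in addition to the MPI processes. A minimal sketch of a test run that limits every MPI process to one thread is shown here. OMP_NUM_THREADS and MKL_NUM_THREADS are the standard OpenMP and Intel MKL environment variables, and -x is the Open MPI mpirun option that exports a variable to the launched processes. The names NP and CMD are only illustrative, and the script prints the command instead of running it.

```shell
#!/bin/sh
# Sketch only: limit each MPI process to a single OpenMP/MKL thread.
# Assumption: the extra CPU load seen in "top" comes from threading.
NP=8                      # number of MPI processes, as in the post
export OMP_NUM_THREADS=1  # standard OpenMP thread-count variable
export MKL_NUM_THREADS=1  # Intel MKL thread-count variable

# -x exports the named variable to every launched process (Open MPI mpirun).
CMD="mpirun -np $NP -x OMP_NUM_THREADS -x MKL_NUM_THREADS abinit < input.files >& log"

# Print the command so it can be checked before it is actually run.
echo "$CMD"
```

If "top" then shows eight processes at about 100% each and the timing improves, threading is a likely cause; if nothing changes, the cause probably lies elsewhere.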

Has anyone run into this kind of problem? I am sorry for the long post, but I would appreciate your advice.