[abinit-forum] band/fft parallel calculations truncating or

Total energy, geometry optimization, DFT+U, spin....

Moderator: bguster

Post by torrent » Fri Apr 16, 2010 9:49 am

Following up on a topic posted on the old forum...

Hi Eric,

Abinit v5.8.4p:
The MPI-IO implementation was completely rewritten starting with v5.9; the old one was correct in most cases but was not portable across several architectures.
There is also a problem with OpenMPI if you use a version before 1.4 (see below).

Abinit v6.0.3:
An issue has been identified when using OpenMPI, with exactly the same symptom as yours (for some systems only).
Abinit 6 runs successfully with MPICH2 or with OpenMPI v1.4.1. OpenMPI v1.3 produces a deadlock while writing.
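
To make the version constraint above easy to check, here is a minimal Python sketch. It only encodes what is reported in this thread (OpenMPI 1.4.1 works; 1.3.x and 1.2.x show the problem). The helper names and the 1.4.1 threshold are assumptions made for illustration and are not part of Abinit or OpenMPI.

```python
import re

# Assumption: 1.4.1 is the oldest OpenMPI release reported to work in this
# thread. Releases between 1.4 and 1.4.1 are not confirmed either way, so
# they are treated conservatively as "not reported working".
MIN_WORKING_OPENMPI = (1, 4, 1)


def parse_version(text):
    """Extract the first dotted version number (e.g. '1.3.2') from text."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def openmpi_is_reported_working(version_text):
    """Return True if the version is at least the one reported to work."""
    return parse_version(version_text) >= MIN_WORKING_OPENMPI


if __name__ == "__main__":
    # The two versions tried in the original question, plus the working one.
    for version in ("1.2.5", "1.3.2", "1.4.1"):
        print(version, openmpi_is_reported_working(version))
```

The version string can be typed in by hand or pasted from whatever your MPI installation reports; the function picks out the first dotted number it finds.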


Marc Torrent

=============================================================================
Hi,

I have been doing some calculations using parallel
abinit. Specifically, I can't get my wavefunctions written to disk
for some band and band/fft parallel jobs. This only occurs beyond a
certain system size. For instance, a 2x2x2 supercell of rutile writes
to disk correctly with both v5.8.4p and v6.0.3. Once I go to 2x2x3 or
larger, v5.8.4p exits with only a tiny fraction of the WFK file written
and v6.0.3 hangs after this same tiny fraction is written to disk.
(This happens when I run in a global parallel filespace (a PVFS directory)
or when I run in directories local to each node.)

I have tried two versions of the Intel compilers (11.1 and 10.1) and two
versions of openmpi (1.3.2 and 1.2.5). Here is my configure statement:

./configure CC=mpicc FC=mpif90 --with-linalg-libs='-L/share/apps/intel/Compiler/11.1/069/mkl/lib/em64t/ -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lguide -lpthread -lm' --with-scalapack-libs='-L/share/apps/intel/Compiler/11.1/069/mkl/lib/em64t/ -lmkl_scalapack' --enable-64bit-flags --enable-mpi --with-mpi-level=2 --enable-mpi-io

Here are the parallel parts of my input file:

paral_kgb 1
npfft 1
npband 16
npkpt 1
wfoptalg 14 # I have also tried '4'
fft_opt_lob 2
fftalg 401
accesswff 1
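
As a sanity check on the parallel settings above, the following Python sketch computes how many MPI processes they imply. It assumes that, with paral_kgb 1 in these Abinit versions, the product npkpt * npband * npfft should match the number of processes the job is launched with. The function name is hypothetical, and the rule should be confirmed against the documentation for your Abinit version.

```python
def required_processes(npkpt, npband, npfft):
    """Number of MPI processes implied by the k-point/band/FFT distribution.

    Assumption: the job must be launched with exactly this many processes.
    """
    for name, value in (("npkpt", npkpt), ("npband", npband), ("npfft", npfft)):
        if value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value}")
    return npkpt * npband * npfft


# Values taken from the input file fragment above.
settings = {"npkpt": 1, "npband": 16, "npfft": 1}

if __name__ == "__main__":
    print("processes implied by the input:", required_processes(**settings))
```

For the values quoted here, the product is 16, so under this assumption the job would be started on 16 processes.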


Has anyone else had this problem and found a way around it?

Thanks,

Eric J. Walter
Department of Physics
College of William and Mary
Marc Torrent
CEA - Bruyères-le-Chatel
France
