I am using Abinit 8.0.8 compiled with Intel MPI (ifort 15.0); the build passed all of the test suites.
The supercell was generated with Phonopy for finite-difference calculations. The previous few supercells ran without any problem.
The errors produced are related to MPI_Abort, for example:
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 63
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 68
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 71
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 62
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 66
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 67
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 60
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 65
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 69
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 70
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 61
INTERNAL ERROR: invalid error code 78ea36 (Ring ids do not match) in MPIR_Allreduce_impl:1262
INTERNAL ERROR: invalid error code 58ea36 (Ring ids do not match) in MPIR_Allreduce_impl:1262
INTERNAL ERROR: invalid error code 58ea36 (Ring ids do not match) in MPIR_Allreduce_impl:1262
INTERNAL ERROR: invalid error code 68ea36 (Ring ids do not match) in MPIR_Allreduce_impl:1262
Fatal error in MPI_Allreduce: Other MPI error, error stack:
MPI_Allreduce(1421)......: MPI_Allreduce(sbuf=0xf2a2ea0, rbuf=0xf6941c0, count=516706, MPI_DOUBLE_PRECISION, MPI_SUM, comm=0x84000004) failed
MPIR_Allreduce_impl(1262):
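To make the pattern in the output above easier to see, here is a minimal Python sketch that collects the ranks reporting MPI_Abort. The helper name `aborting_ranks` and the block width of 12 are illustrative only. Treating the ranks as contiguous blocks of `npband` processes is an assumption about the process layout that I have not verified.

```python
import re

# The MPI_Abort lines copied from the output above.
LOG = """\
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 63
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 68
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 71
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 62
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 66
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 67
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 60
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 65
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 69
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 70
application called MPI_Abort(MPI_COMM_WORLD, 13) - process 61
"""

ABORT_RE = re.compile(r"MPI_Abort\(MPI_COMM_WORLD, \d+\) - process (\d+)")


def aborting_ranks(log_text):
    """Return the sorted, de-duplicated ranks that reported MPI_Abort."""
    return sorted({int(m.group(1)) for m in ABORT_RE.finditer(log_text)})


ranks = aborting_ranks(LOG)

# Assumption (not verified): ranks are grouped in contiguous blocks of
# npband = 12 processes, so the block index is rank // 12.
blocks = sorted({rank // 12 for rank in ranks})

print("aborting ranks:", ranks)
print("block indices (width 12):", blocks)
```

All of the listed ranks lie between 60 and 71, so at least 72 MPI processes were running. Under the block assumption they all fall in the same group of 12.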
The errors appear only when I activate KGB parallelization with certain processor distributions. For example, the following distribution causes the errors:
paral_kgb 1 npkpt 7 npband 12 npfft 1
whereas the following processor distribution runs without problems:
paral_kgb 1 npkpt 14 npband 4 npfft 1
All other parameters are still the same.
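For reference, a minimal sketch of the number of MPI processes each distribution implies. It assumes, as I understand the paral_kgb documentation, that the process count should equal npkpt * npband * npfft * npspinor, with npspinor left at 1. The function name `kgb_process_count` is illustrative only.

```python
def kgb_process_count(npkpt, npband, npfft, npspinor=1):
    """Number of MPI processes implied by a KGB distribution.

    Assumption: the job must be launched with exactly
    npkpt * npband * npfft * npspinor processes.
    """
    return npkpt * npband * npfft * npspinor


# Distribution that produces the MPI_Abort errors.
failing = kgb_process_count(npkpt=7, npband=12, npfft=1)

# Distribution that runs without problems.
working = kgb_process_count(npkpt=14, npband=4, npfft=1)

print("failing layout needs", failing, "processes")
print("working layout needs", working, "processes")
```

Under that assumption the two distributions correspond to different process counts (84 and 56), so the number of processes requested at launch differs between the two runs.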
The log file and input files are attached.
Thank you.