parallel configuration

Nadia
Posts: 2
Joined: Tue Oct 04, 2016 5:23 am

Post by Nadia » Tue Oct 04, 2016 6:41 am

Hello dear ABINIT users,
I successfully compiled the parallel version of ABINIT 7.10.4 on a cluster: 32 nodes, dual-CPU Intel Xeon X5670 (2 x 6 cores @ 2.93 GHz), 24 GB.
I am trying to run a parallel geometry optimization of the orthorhombic phase of LaFeO3. The calculation stops after a few ntime steps. I need assistance to resolve this problem. Here is the error message at the end of the log file:
----------------------------------------------------------------------------------
At line 808 of file mover.F90
Fortran runtime error: End of file

mpirun has exited due to process rank 2 with PID 18979 on
node farabi17 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
-------------------------------------------------------------------------------------------------------
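As a rough sketch, the key facts in a log like the one above (source file, line, error text, failing rank and node) can be pulled out with a few lines of Python. The function name `summarize_failure` and the regular expressions are assumptions based only on the message format shown here, not on any documented log format of the compiler or the MPI launcher.

```python
import re

# Patterns are assumptions derived from the log excerpt in this thread,
# not from a documented log format.
FORTRAN_LOC = re.compile(r"At line (\d+) of file (\S+)")
FORTRAN_ERR = re.compile(r"Fortran runtime error: (.+)")
MPI_RANK = re.compile(r"process rank (\d+) with PID (\d+)")
MPI_NODE = re.compile(r"node (\S+) exiting")


def summarize_failure(log_text):
    """Return a dict with whatever failure details the log text contains."""
    summary = {}
    match = FORTRAN_LOC.search(log_text)
    if match:
        summary["line"] = int(match.group(1))
        summary["source_file"] = match.group(2)
    match = FORTRAN_ERR.search(log_text)
    if match:
        summary["error"] = match.group(1).strip()
    match = MPI_RANK.search(log_text)
    if match:
        summary["rank"] = int(match.group(1))
        summary["pid"] = int(match.group(2))
    match = MPI_NODE.search(log_text)
    if match:
        summary["node"] = match.group(1)
    return summary


# Excerpt copied from the error message quoted in this post.
example = """At line 808 of file mover.F90
Fortran runtime error: End of file

mpirun has exited due to process rank 2 with PID 18979 on
node farabi17 exiting improperly."""

print(summarize_failure(example))
```

Running the same function over the full log file (read with `open(...).read()`) would show whether the failing rank and source line are the same from one run to the next, which is useful information to include in a bug report.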
and here is the submitted batch file:
------------------------------------------------------------------------------------------------
#!/bin/sh
#SBATCH --partition=materiaux
#SBATCH -A materiaux
#SBATCH --nodes=2
#SBATCH --tasks-per-node=8
#SBATCH --mail-type=ALL # Send an email at the end of the job
#SBATCH --output=log-%j.out # Program output file
#SBATCH --error=log-%j.err # Program error file
#SBATCH --mail-user=n_ilesdz@yahoo.fr

#module load abinit/7.10.4

mpirun abinit < lafeo3.files > lafeo3.log 2>&1
---------------------------------------------------------------------------------------------
and finally, the LaFeO3 input file:
---------------------------------------------------------------------------------------------
# LaFeO3, orthorhombic
# geometry optimization

spgroup 62

kptopt 1 # Option for the automatic generation of k points, taking
# into account the symmetry

nsppol 2
spinat 0. 0. 0.0
0. 0. 0.0
0. 0. 0.0
0. 0. 0.0
0. 0. 7.0
0. 0. 7.0
0. 0. -7.0
0. 0. -7.0
0. 0. 0
0. 0. 0
0. 0. 0
0. 0. 0
0. 0. 0
0. 0. 0
0. 0. 0.0
0. 0. 0.0
0.0 0.0 0.0
0.0 0.0 0.0
0.0 0.0 0.0
0.0 0.0 0.0
nspden 2

acell 1.02902893502723E+01 1.45257208767473E+01 1.02742096569928E+01
angdeg 90 90 90

nsym 0
tolsym 1.e-4
optcell 1
ionmov 2
ntime 30

ntypat 3 # There are three types of atoms
znucl 57 26 8 # The keyword "znucl" refers to the atomic number of each type of atom (La, Fe, O)

nband 92
occopt 1
#Definition of the atoms
natom 20 # There are twenty atoms
typat 1 1 1 1 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3

xred
2.54991329693666E-02 2.50000000000000E-01 9.94305905930338E-01
9.74500867030633E-01 7.50000000000000E-01 5.69409406966170E-03
5.25499132969367E-01 2.50000000000000E-01 5.05694094069662E-01
4.74500867030633E-01 7.50000000000000E-01 4.94305905930338E-01
0.00000000000000E+00 0.00000000000000E+00 5.00000000000000E-01
5.00000000000000E-01 0.00000000000000E+00 0.00000000000000E+00
5.00000000000000E-01 5.00000000000000E-01 0.00000000000000E+00
0.00000000000000E+00 5.00000000000000E-01 5.00000000000000E-01
4.92535721476408E-01 2.50000000000000E-01 6.66590204457015E-02
5.07464278523592E-01 7.50000000000000E-01 9.33340979554299E-01
9.92535721476408E-01 2.50000000000000E-01 4.33340979554299E-01
7.46427852359163E-03 7.50000000000000E-01 5.66659020445701E-01
2.24379670389852E-01 5.36034348079099E-01 2.24274009664965E-01
7.75620329610148E-01 4.63965651920901E-01 7.75725990335035E-01
7.24379670389852E-01 5.36034348079099E-01 2.75725990335035E-01
2.75620329610148E-01 4.63965651920901E-01 7.24274009664965E-01
2.75620329610148E-01 3.60343480790995E-02 7.24274009664965E-01
7.24379670389852E-01 9.63965651920901E-01 2.75725990335035E-01
7.75620329610148E-01 3.60343480790995E-02 7.75725990335035E-01
2.24379670389852E-01 9.63965651920901E-01 2.24274009664965E-01
#Definition of the planewave basis set
ecut 45
pawecutdg 90
ecutsm 0.5
pawovlp 0


#Definition of the SCF procedure
nstep 40 # Maximal number of SCF cycles

diemac 14 # Although this is not mandatory, it is worthwhile to
diemix 0.5d0 # precondition the SCF cycle. The model dielectric
# function used as the standard preconditioner
# is described in the "dielng" input variable section.
toldff 5.0d-6
tolmxf 5.0d-6

ixc 11
# add to conserve old < 6.7.2 behavior for calculating forces at each SCF step
optforces 1
--------------------------------------------------------------------------------------------------------------------------
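Per-atom arrays such as typat, xred and spinat need one entry (or one 3-vector) per atom, so a quick count against natom is a cheap way to rule out a malformed input before looking at the parallel setup. The sketch below is only an illustration: `parse_abinit_input` and `check_per_atom_arrays` are hypothetical helper names, and the parser assumes a simplified syntax (bare keywords followed by plain numbers, `#` comments) rather than the full ABINIT input grammar.

```python
def parse_abinit_input(text):
    """Very small parser: each keyword is followed by numeric values.

    Assumption: comments start with '#' and every keyword is a bare word
    followed by whitespace-separated numbers. Real ABINIT inputs allow more
    syntax (multipliers such as 3*0.0, units, dataset suffixes), which this
    sketch ignores.
    """
    values = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0]  # drop trailing comments
        for token in line.split():
            try:
                # Accept Fortran-style exponents such as 5.0d-6.
                number = float(token.lower().replace("d", "e"))
            except ValueError:
                current = token  # not a number, so treat it as a keyword
                values[current] = []
            else:
                if current is not None:
                    values[current].append(number)
    return values


def check_per_atom_arrays(text):
    """Return a list of human-readable problems (empty if none found)."""
    values = parse_abinit_input(text)
    if "natom" not in values or len(values["natom"]) != 1:
        return ["natom is missing or malformed"]
    natom = int(values["natom"][0])
    expected = {"typat": natom, "xred": 3 * natom, "spinat": 3 * natom}
    problems = []
    for name, count in expected.items():
        if name in values and len(values[name]) != count:
            problems.append(
                f"{name}: expected {count} values, found {len(values[name])}"
            )
    return problems


# Toy two-atom input, made up for illustration (not the LaFeO3 input).
sample = """
natom 2   # two atoms in this toy example
typat 1 2
xred
0.0 0.0 0.0
0.5 0.5 0.5
spinat 0. 0. 7.0
0. 0. -7.0
"""

print(check_per_atom_arrays(sample))                 # consistent input
print(check_per_atom_arrays(sample + "0. 0. 0.\n"))  # one extra spinat vector
```

The same function can be applied to the text of a real input file; an empty list only means the counts agree, not that the values themselves are physically sensible.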
Note that the sequential calculation terminated successfully. Could the problem be due to MPI communication? I would appreciate your help in resolving it.
Respectfully
Iles Nadia
LPC2ME
Oran 1 University

Re: parallel configuration

Post by Nadia » Thu Oct 06, 2016 4:01 pm

Hello,
According to the log file, we suspect that the source of our problem is mover.F90 and the NetCDF file. Are there any known bugs involving these files?
Please, we need your help.
Thanks

Jordan
Posts: 282
Joined: Tue May 07, 2013 9:47 am

Re: parallel configuration

Post by Jordan » Thu Oct 13, 2016 8:19 am

Hi,

Offhand, I have no idea what is causing your error.
Could you try the latest version of ABINIT, 8.0.8b? www.abinit.org

Cheers

Jordan
