Improving calculation performance?
Moderator: amadon
Hi All,
I'm a new member in the forum; thanks for providing this forum.
Currently, I'm trying to do a ground-state calculation for a nanoribbon system consisting of 30 atoms per cell with the following SCF-related parameter setup:
# SCF cycle parameters
ecut 12.
pawecutdg 30.
nstep 300
nband 150
toldfe 1.0d-9
iprcel 44
# K-points and sym
ngkpt 11 1 1
nshiftk 1
shiftk 0.0 0.0 0.0
occopt 7
prtdos 0
autoparal 1
For the potential I'm using a PAW potential (this is a "must" for my current study). The problem is that my calculation runs very slowly. I compared it with VASP using a comparable parameter setup: VASP finished in less than 10 minutes, but my ABINIT calculation has been running for about 12 hours now.
Did I miss some parameters in my setup that would explain the slowness? Or is there a way to improve the performance of my calculation? I would appreciate any suggestions regarding this issue. Thanks!
PS. Below I also include the build information of my ABINIT (in case it is needed for further analysis):
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
=== Build Information ===
Version : 8.6.1
Build target : x86_64_linux_gnu4.8
Build date : 20180123
=== Compiler Suite ===
C compiler : gnu4.8
C++ compiler : gnu4.8
Fortran compiler : gnu4.8
CFLAGS : -g -O2 -mtune=native -march=native
CXXFLAGS : -g -O2 -mtune=native -march=native
FCFLAGS : -g -ffree-line-length-none
FC_LDFLAGS :
=== Optimizations ===
Debug level : basic
Optimization level : standard
Architecture : intel_xeon
=== Multicore ===
Parallel build : yes
Parallel I/O : auto
openMP support : no
GPU support : no
=== Connectors / Fallbacks ===
Connectors on : yes
Fallbacks on : yes
DFT flavor : libxc-fallback+atompaw-fallback+wannier90-fallback
FFT flavor : none
LINALG flavor : netlib-fallback
MATH flavor : none
TIMER flavor : abinit
TRIO flavor : none
=== Experimental features ===
Bindings : @enable_bindings@
Exports : no
GW double-precision : no
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Default optimizations:
-O2 -mtune=native -march=native
Optimizations for 20_datashare:
-O0
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
CPP options activated during the build:
CC_GNU CXX_GNU FC_GNU
HAVE_ATOMPAW HAVE_FC_ALLOCATABLE_DT... HAVE_FC_ASYNC
HAVE_FC_BACKTRACE HAVE_FC_COMMAND_ARGUMENT HAVE_FC_COMMAND_LINE
HAVE_FC_CONTIGUOUS HAVE_FC_CPUTIME HAVE_FC_EXIT
HAVE_FC_FLUSH HAVE_FC_GAMMA HAVE_FC_GETENV
HAVE_FC_INT_QUAD HAVE_FC_IOMSG HAVE_FC_ISO_C_BINDING
HAVE_FC_ISO_FORTRAN_2008 HAVE_FC_LONG_LINES HAVE_FC_MOVE_ALLOC
HAVE_FC_PRIVATE HAVE_FC_PROTECTED HAVE_FC_STREAM_IO
HAVE_FC_SYSTEM HAVE_FORTRAN2003 HAVE_LIBPAW_ABINIT
HAVE_LIBTETRA_ABINIT HAVE_LIBXC HAVE_MPI
HAVE_MPI2 HAVE_MPI_IALLREDUCE HAVE_MPI_IALLTOALL
HAVE_MPI_IALLTOALLV HAVE_MPI_INTEGER16 HAVE_MPI_IO
HAVE_MPI_TYPE_CREATE_S... HAVE_NUMPY HAVE_OS_LINUX
HAVE_TIMER_ABINIT HAVE_WANNIER90 USE_MACROAVE
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Re: Improving calculation performance?
Dear Ryne60 and ABINIT friends,
If you state that the calculation is slow, then please specify what you mean by slow. Is it the convergence speed? Or do you need approximately the same number of cycles, but each cycle takes more time? Or something else?
In my case, the convergence speed of ABINIT was pretty slow. It took many steps to converge to the electronic ground state, and I could improve the convergence speed by a factor of 2 by changing occopt to 7.
Unfortunately, you have already set occopt to 7.
toldfe 1.0d-9 is pretty strict. Please check whether you specified the same value in VASP. If so, then this should not be the problem.
Furthermore, it might help if you specify (or upload) your PAW potential file. As far as I understand, the convergence also depends on the PAW potential.
Re: Improving calculation performance?
Hi Woffermans,
Thanks for the reply! The calculation is slow in terms of both time per SCF cycle and convergence speed; it took around 2 hours per SCF cycle, and the calculation hasn't converged after 300 cycles.
Please find attached the PAW potential I use. Any further suggestions are welcome!
PS. I renamed the file from Bi.GGA_PBE-JTH.xml to BiPAW.in since apparently only files with the suffix .in or .out are allowed as attachments.
Attachments
BiPAW.in: PAW potential (923.08 KiB)
Re: Improving calculation performance?
I think toldfe 1e-9 is too expensive; generally speaking, there is no need for such a strict criterion. Even for calculating phonon dispersions, most people set toldfe 1e-7. Maybe this helps you.
Re: Improving calculation performance?
Hi onion2440,
After 300 cycles the maximum energy difference is around 10 Ha, which is >> 1e-9 Ha. So even if I increase toldfe to, let's say, 1e-7, the overall calculation is still very expensive. Any clue how to reduce the calculation time per cycle?
Re: Improving calculation performance?
Hello ryne60, onion2440 and ABINIT friends,
The point is not the values of the parameters.
The point is the comparison with the VASP calculation.
If the VASP input is equivalent to the ABINIT input, then the question remains why the calculation speeds of VASP and ABINIT are so different.
If I understand you correctly, you need more time per electronic cycle as well as more cycles for your ABINIT calculation, compared to VASP. Moreover, the difference is significant. It is not just a couple of minutes and 3 cycles more or less.
Benchmarking is always very tricky, though. You need equal conditions and equal input for the VASP and ABINIT calculations. Otherwise you are comparing apples with pears.
I think you also need to upload your VASP input. Experienced users can then look at both inputs and judge whether they are equivalent.
It would be best if someone reproduced your comparison on a different machine/cluster, just to be sure that it is not the environment that is playing tricks on you.
I guess that the developers of ABINIT are running benchmarks all the time, so I would expect a statement from their side as well. Maybe this is already known and taken into account.
Re: Improving calculation performance?
Hi woffermans,
Yes, that's correct; for comparison, the VASP calculation needed only about 98 s per cycle and 40 cycles to converge.
I used the following minimal setup for the VASP calculation:
PREC = Accurate
LREAL = .FALSE.
ISMEAR = -5 # tetrahedron method with Blöchl corrections
NBANDS = 150
NSW = 0 # static calculation
EDIFF = 1.0d-7 # in eV
NEDOS = 801
NPAR = 4
ENCUT = 340 eV
In both cases I used 11 k-points. I can't attach the POTCAR file because it is copyrighted, but if you have access to the VASP POTCAR database, I used the POTCAR with the following title: TITEL = PAW_PBE Bi 08Apr2002. So I guess both calculations have comparable setups, but I still don't know why the ABINIT calculation took so much more time per cycle (and so many more cycles).
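Since ABINIT takes energies in Hartree while the VASP tags above are in eV, a quick unit conversion helps when judging whether the two setups are really comparable. The following is a minimal sketch; the conversion constant is a rounded CODATA value and the helper names are made up for this illustration.

```python
# Sketch: put the ABINIT (Hartree) and VASP (eV) settings from this thread
# on a common scale. Helper names are illustrative, not part of either code.
HARTREE_TO_EV = 27.211386  # 1 Ha in eV (rounded)


def ha_to_ev(value_ha: float) -> float:
    """Convert an energy from Hartree to eV."""
    return value_ha * HARTREE_TO_EV


def ev_to_ha(value_ev: float) -> float:
    """Convert an energy from eV to Hartree."""
    return value_ev / HARTREE_TO_EV


# Values quoted in the thread.
abinit_ha = {"ecut": 12.0, "pawecutdg": 30.0, "toldfe": 1.0e-9}
vasp_ev = {"ENCUT": 340.0, "EDIFF": 1.0e-7}

for name, value in abinit_ha.items():
    print(f"ABINIT {name:10s} = {value:.3e} Ha = {ha_to_ev(value):.3e} eV")
for name, value in vasp_ev.items():
    print(f"VASP   {name:10s} = {value:.3e} eV = {ev_to_ha(value):.3e} Ha")
```

With these numbers, ecut 12 Ha is about 326.5 eV (slightly below ENCUT = 340 eV), while toldfe 1.0d-9 Ha is roughly 2.7e-8 eV, i.e. somewhat tighter than EDIFF = 1.0d-7 eV.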
Re: Improving calculation performance?
Dear ryne60 and ABINIT friends,
Is ISMEAR = -5 comparable with occopt 7?
Re: Improving calculation performance?
Hi woffermans,
I also tried both calculations with Fermi smearing (ISMEAR = -1 & occopt 3), and the conclusion regarding the calculation speed doesn't change much!
Re: Improving calculation performance?
Dear ryne60,
Another remark: Why do you use iprcel 44? Did you check whether the default preconditioning works?
A further remark is that autoparal=1 might not be efficient. I usually recommend setting the parallelization by hand so that you control everything; often the result is more efficient. See for example:
https://forum.abinit.org/viewtopic.php?f=8&t=3837
Best wishes,
Eric
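To illustrate what setting the parallelization "by hand" could look like, here is a minimal Python sketch that lists candidate process grids for ABINIT's paral_kgb 1 mode (input variables npkpt, npband, npfft). The divisibility rules used below are common heuristics treated here as assumptions, the process count of 22 is hypothetical, and nkpt = 11 is simply the k-point count quoted in this thread (ABINIT may reduce it by symmetry, in which case the value from the log should be used). Which grid is actually fastest has to be measured.

```python
from typing import List, Tuple


def process_grids(nproc: int, nkpt: int, nband: int,
                  max_npfft: int = 4) -> List[Tuple[int, int, int]]:
    """List (npkpt, npband, npfft) with npkpt * npband * npfft == nproc.

    Heuristics (assumptions of this sketch): npkpt divides nkpt,
    npband divides nband, and npfft is kept small.
    """
    grids = []
    for npkpt in range(1, nproc + 1):
        if nproc % npkpt or nkpt % npkpt:
            continue
        rest = nproc // npkpt
        for npfft in range(1, min(rest, max_npfft) + 1):
            if rest % npfft:
                continue
            npband = rest // npfft
            if nband % npband == 0:
                grids.append((npkpt, npband, npfft))
    # Prefer k-point parallelism first, then the smallest FFT distribution.
    grids.sort(key=lambda g: (-g[0], g[2]))
    return grids


def abinit_fragment(grid: Tuple[int, int, int]) -> str:
    """Format one grid as ABINIT input lines."""
    npkpt, npband, npfft = grid
    return "\n".join([
        "paral_kgb 1",
        f"npkpt {npkpt}",
        f"npband {npband}",
        f"npfft {npfft}",
    ])


# Hypothetical job: 22 MPI processes, 11 k-points, nband 150 as in the thread.
grids = process_grids(nproc=22, nkpt=11, nband=150)
for grid in grids:
    print(grid)
print(abinit_fragment(grids[0]))
```

The printed fragment would replace the autoparal 1 line in the input; timing a few SCF steps for each candidate is the only reliable way to pick one.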
Re: Improving calculation performance?
Hi Eric,
I made some trial calculations on similar systems with a much smaller unit cell, i.e. still 2D systems but with 2 atoms per unit cell, so the timing was a lot faster. In these cases, the setup with iprcel 44 finished faster than those with other iprcel values, so I simply used the same value for the bigger systems. Moreover, according to the manual, iprcel 45 will likely give a large improvement, and since iprcel 44 is just one number lower, I thought it would perform similarly to 45. But I also tried the default value; the conclusion about the speed per cycle is still more or less the same, i.e. much slower than VASP's speed per cycle.
Regarding autoparal: before I set this parameter, I got a message from ABINIT that my setup was not optimal, and it recommended putting "autoparal" in the input file. I again tested this parameter on smaller systems, and autoparal 1 resulted in the fastest timing, so I just put it in the setup of my real system.
Anyway, thanks for the recommendation. I'll do more tests with autoparal.
Re: Improving calculation performance?
Dear ryne60 and ABINIT friends,
Please don't forget to report your findings. It is of interest to all ABINIT friends if we can close the thread in some conclusive way.
Re: Improving calculation performance?
Hi woffermans,
It seems the culprit was iprcel; because of it, the first few SCF cycles took many hours. In my opinion, the following note about iprcel in the online manual is a bit misleading:
For non-homogeneous relatively large cells iprcel = 45 will likely give a large improvement over iprcel = 0.
In my case the running time was certainly not improved (it was even much worse) compared to the setup without iprcel.
Another factor that led to the different performance is the wavefunction cutoff energy suggested by the pseudopotentials. I don't know how it is possible, but in most PAW cases the suggested cutoff energies are much lower in VASP than in ABINIT. However, if both are set up with comparable cutoff energies, the running times become comparable too.
Other than this, I think the performance of ABINIT is quite comparable to VASP.
I hope the info can be useful to others.
Thanks again!
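The fix described in this post amounts to deleting the iprcel line so that ABINIT falls back to its default preconditioner (iprcel 0). As a minimal sketch, the helper below (a made-up function, not part of ABINIT) removes a variable from an input text, assuming one variable per line as in the input posted at the top of the thread.

```python
def drop_variable(input_text: str, name: str) -> str:
    """Return input_text without the lines that set the variable `name`.

    Assumes one variable per line; text after '#' is treated as a comment.
    """
    kept = []
    for line in input_text.splitlines():
        tokens = line.split("#", 1)[0].split()
        if tokens and tokens[0] == name:
            continue  # skip the line that sets this variable
        kept.append(line)
    return "\n".join(kept) + "\n"


# SCF block as posted in the thread.
original = """\
# SCF cycle parameters
ecut 12.
pawecutdg 30.
nstep 300
nband 150
toldfe 1.0d-9
iprcel 44
"""

revised = drop_variable(original, "iprcel")
print(revised)
```

Comparing the time of the first few SCF steps with and without the line is a cheap way to check whether the preconditioner is the bottleneck on a given system.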
Re: Improving calculation performance?
Hi ryne60,
In your latest reply, you concluded that the culprit was "iprcel". So does that mean you deleted the line containing "iprcel" from your "in" file and the ABINIT calculation then finished in less than 10 minutes? Is that right?
Re: Improving calculation performance?
Hi ketong,
Yes, it has been deleted since then! But since my systems are big, the calculation time (scfcv time) is still long, though now "comparable" to VASP because the first few SCF cycles take significantly less time.
Re: Improving calculation performance?
Dear ryne60,
Thank you for your information.
In addition, do you have some suggestions or tricks about the setting of "iprcel", "diemac" and "diemix"?
Thanks again!
Re: Improving calculation performance?
Hi ketong,
Unfortunately, I don't! Sorry, I didn't do much testing of those parameters. Since my systems are too big, testing them would be time-consuming. I've been using the default values instead.
Re: Improving calculation performance?
Hi ryne60,
Anyway, thank you very much!