Improving calculation performance?

ryne60 · Post by **ryne60** » Thu Dec 06, 2018 9:55 am

Hi All,

I'm a new member of the forum; thanks for providing it.

Currently, I'm trying to do a ground-state calculation for a nanoribbon system consisting of 30 atoms per cell with the following SCF-related parameter setup:

# SCF cycle parameters
ecut 12.
pawecutdg 30.
nstep 300
nband 150
toldfe 1.0d-9
iprcel 44

# K-points and sym
ngkpt 11 1 1
nshiftk 1
shiftk 0.0 0.0 0.0

occopt 7
prtdos 0

autoparal 1

For the potential I'm using a PAW potential (this is a "must" for my current study). The problem is that my calculation runs very slowly. I compared it with a VASP calculation using a comparable parameter setup: that one finished in less than 10 minutes, while my ABINIT calculation has been running for about 12 hours now.

Did I miss some parameters in my setup that would explain why it runs so slowly? Or is there a way to improve the performance of my calculation? I would appreciate any suggestions regarding this issue. Thanks!

PS. Below I also include the build information of my ABINIT (in case it is needed for further analysis):

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

=== Build Information ===
Version : 8.6.1
Build target : x86_64_linux_gnu4.8
Build date : 20180123

=== Compiler Suite ===
C compiler : gnu4.8
C++ compiler : gnu4.8
Fortran compiler : gnu4.8
CFLAGS : -g -O2 -mtune=native -march=native
CXXFLAGS : -g -O2 -mtune=native -march=native
FCFLAGS : -g -ffree-line-length-none
FC_LDFLAGS :

=== Optimizations ===
Debug level : basic
Optimization level : standard
Architecture : intel_xeon

=== Multicore ===
Parallel build : yes
Parallel I/O : auto
openMP support : no
GPU support : no

=== Connectors / Fallbacks ===
Connectors on : yes
Fallbacks on : yes
DFT flavor : libxc-fallback+atompaw-fallback+wannier90-fallback
FFT flavor : none
LINALG flavor : netlib-fallback
MATH flavor : none
TIMER flavor : abinit
TRIO flavor : none

=== Experimental features ===
Bindings : @enable_bindings@
Exports : no
GW double-precision : no

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Default optimizations:
-O2 -mtune=native -march=native

Optimizations for 20_datashare:
-O0

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
CPP options activated during the build:

CC_GNU CXX_GNU FC_GNU

HAVE_ATOMPAW HAVE_FC_ALLOCATABLE_DT... HAVE_FC_ASYNC

HAVE_FC_BACKTRACE HAVE_FC_COMMAND_ARGUMENT HAVE_FC_COMMAND_LINE

HAVE_FC_CONTIGUOUS HAVE_FC_CPUTIME HAVE_FC_EXIT

HAVE_FC_FLUSH HAVE_FC_GAMMA HAVE_FC_GETENV

HAVE_FC_INT_QUAD HAVE_FC_IOMSG HAVE_FC_ISO_C_BINDING

HAVE_FC_ISO_FORTRAN_2008 HAVE_FC_LONG_LINES HAVE_FC_MOVE_ALLOC

HAVE_FC_PRIVATE HAVE_FC_PROTECTED HAVE_FC_STREAM_IO

HAVE_FC_SYSTEM HAVE_FORTRAN2003 HAVE_LIBPAW_ABINIT

HAVE_LIBTETRA_ABINIT HAVE_LIBXC HAVE_MPI

HAVE_MPI2 HAVE_MPI_IALLREDUCE HAVE_MPI_IALLTOALL

HAVE_MPI_IALLTOALLV HAVE_MPI_INTEGER16 HAVE_MPI_IO

HAVE_MPI_TYPE_CREATE_S... HAVE_NUMPY HAVE_OS_LINUX

HAVE_TIMER_ABINIT HAVE_WANNIER90 USE_MACROAVE

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

woffermans · Post by **woffermans** » Fri Dec 07, 2018 12:23 pm

Dear Ryne60 and ABINIT friends,

If you state that the calculation is slow, then please specify what you mean by slow. Is it the convergence speed? Or do you need approximately the same number of cycles, but each cycle takes more time? Or something else?

In my case, the convergence speed of ABINIT was pretty slow. It took many steps to converge to the electronic ground state, and I could improve the convergence speed by a factor of 2 by changing occopt to 7.
Unfortunately, you have already set occopt to 7.

toldfe 1.0d-9 is pretty harsh. Please check if you specified the same value in VASP. If this is the case, then this should not be the problem.

Furthermore, it might help if you specify (or upload) your PAW potential file. As far as I understood, the convergence also depends on the PAW potential.

ryne60 · Post by **ryne60** » Fri Dec 07, 2018 3:17 pm

Hi Woffermans,

Thanks for the reply! The calculation is slow in terms of both time per SCF cycle and convergence speed; it took around 2 hours per SCF cycle and the calculation hasn't converged after 300 cycles.

Please also find attached the PAW potential I use. Any further suggestions are welcome!

PS. I renamed the file from Bi.GGA_PBE-JTH.xml to BiPAW.in, since apparently only files with the suffix .in or .out are allowed as attachments.

onion2440 · Post by **onion2440** » Sun Dec 09, 2018 2:37 am

I think setting toldfe to 1e-9 is too expensive; generally speaking, there is no need for such a strict criterion. Even for calculating phonon dispersions, most people set toldfe to 1e-7. Maybe this helps you.

ryne60 · Post by **ryne60** » Mon Dec 10, 2018 10:06 am

Hi onion2440,

After 300 cycles the maximum energy difference is around 10 Ha, which is >> 1e-9 Ha. So even if I loosen toldfe to, let's say, 1e-7, the overall calculation is still very expensive. Any clue how to speed up the calculation time per cycle?

woffermans · Post by **woffermans** » Tue Dec 11, 2018 10:31 am

Hello ryne60, onion2440 and ABINIT friends,

The point is not the values of the parameters.

The point is the comparison with the VASP calculation.

If the VASP input is equivalent to the ABINIT input, then the question remains why the calculation speed of VASP and ABINIT is so different.

If I understand you correctly, then you need more time per electronic cycle as well as more cycles for your ABINIT calculation, compared to VASP. Moreover the difference is significant. It is not just a couple of minutes and 3 cycles less or more.

Benchmarking is always very tricky, though. You need equal conditions and equal input for the VASP and ABINIT calculations. Otherwise you compare apples with pears.

I think you also need to upload your VASP input. Experienced users can then have a look at both inputs and judge whether the input for ABINIT and VASP is equivalent.

The best would be if someone reproduced your comparison on a different machine/cluster, just to be sure that it is not the environment that is playing tricks on you.

I guess that the developers of ABINIT are doing benchmarks all the time, so I would expect a statement from their side as well. Maybe this is already known and taken into account.

ryne60 · Post by **ryne60** » Thu Dec 13, 2018 9:05 am

Hi woffermans,

woffermans wrote: If I understand you correctly, then you need more time per electronic cycle as well as more cycles for your ABINIT calculation, compared to VASP. Moreover the difference is significant. It is not just a couple of minutes and 3 cycles less or more.

Yes, that's correct; for the comparison, the VASP calculation needed only about 98s per cycle and 40 cycles to converge.
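To put the numbers quoted so far side by side, here is a small arithmetic sketch. The figures are the ones reported in this thread (about 2 hours per ABINIT SCF cycle and no convergence after 300 cycles; 98 s per VASP cycle and convergence in 40 cycles); the variable and function names are only illustrative.

```python
# Timings reported in this thread; names are illustrative only.
ABINIT_S_PER_CYCLE = 2 * 3600   # "around 2 hours per SCF-cycle"
ABINIT_CYCLES_RUN = 300         # not converged after these cycles
VASP_S_PER_CYCLE = 98
VASP_CYCLES = 40


def slowdown(slow: float, fast: float) -> float:
    """Return how many times larger `slow` is than `fast`."""
    return slow / fast


per_cycle = slowdown(ABINIT_S_PER_CYCLE, VASP_S_PER_CYCLE)
# Only a lower bound, because the ABINIT run had not converged yet.
cycles = slowdown(ABINIT_CYCLES_RUN, VASP_CYCLES)

print(f"time per cycle: ABINIT is about {per_cycle:.0f}x slower")
print(f"cycle count: ABINIT needs more than {cycles:.1f}x as many cycles")
```

A gap of this size in time per cycle is probably not explained by the convergence threshold alone, since the threshold mainly affects how many cycles are needed rather than how long each one takes.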

I used the following minimal setup for the VASP calculation:

PREC = Accurate
LREAL = .FALSE.
ISMEAR = -5 # tetrahedron method with Blöchl corrections
NBANDS = 150
NSW = 0 # static calculation
EDIFF = 1.0d-7 # in eV
NEDOS = 801
NPAR = 4
ENCUT = 340 eV

In both cases I used 11 k-points. I can't attach the POTCAR file because it is copyrighted, but if you have access to the VASP POTCAR database, I used the POTCAR with the following title: TITEL = PAW_PBE Bi08Apr2002. So I guess both calculations have comparable setups, but I still don't know why the ABINIT calculation took so much more time per cycle (and so many more cycles).
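Since ABINIT reads ecut and toldfe in Hartree by default while VASP reads ENCUT and EDIFF in eV, a quick unit conversion helps when judging whether the two setups are comparable. The sketch below uses only the values quoted in this thread; the helper names are made up for illustration, and the two stopping criteria are not defined identically in the two codes, so the comparison is only indicative.

```python
# ABINIT: ecut and toldfe are in Hartree by default.
# VASP: ENCUT and EDIFF are in eV.
HARTREE_TO_EV = 27.211386  # rounded CODATA value


def ha_to_ev(value_ha: float) -> float:
    """Convert an energy from Hartree to eV."""
    return value_ha * HARTREE_TO_EV


def ev_to_ha(value_ev: float) -> float:
    """Convert an energy from eV to Hartree."""
    return value_ev / HARTREE_TO_EV


abinit = {"ecut": 12.0, "toldfe": 1.0e-9}   # Hartree, from the ABINIT input
vasp = {"ENCUT": 340.0, "EDIFF": 1.0e-7}    # eV, from the INCAR

print(f"ecut   = {ha_to_ev(abinit['ecut']):.1f} eV, ENCUT = {vasp['ENCUT']:.1f} eV")
print(f"toldfe = {ha_to_ev(abinit['toldfe']):.2e} eV, EDIFF = {vasp['EDIFF']:.2e} eV")
```

With these values, ecut 12 Ha corresponds to roughly 326.5 eV, close to ENCUT = 340 eV, while toldfe 1.0d-9 Ha is roughly 2.7e-8 eV, somewhat tighter than EDIFF = 1.0d-7 eV.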

woffermans · Post by **woffermans** » Thu Dec 13, 2018 10:37 am

Dear ryne60 and ABINIT friends,

Is ISMEAR = -5 comparable with occopt 7?

ryne60 · Post by **ryne60** » Thu Dec 13, 2018 2:27 pm

Hi woffermans,

I also tried both calculations with Fermi smearing (ISMEAR -1 & occopt 3), and the conclusion regarding the calculations' speed doesn't change much!

ebousquet · Post by **ebousquet** » Fri Dec 14, 2018 4:00 pm

Dear ryne60,
Another remark: why do you use iprcel 44? Did you check whether the default preconditioning works?
A further remark is that autoparal=1 might not be efficient. I usually recommend setting the parallelization by hand, so that you control everything; often the result is more efficient. See for example:
https://forum.abinit.org/viewtopic.php?f=8&t=3837
Best wishes,
Eric
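As a rough illustration of what "doing it by hand" could look like, the sketch below enumerates ways to split a given number of MPI processes over k-points, bands and FFT using ABINIT's npkpt, npband and npfft variables (used together with paral_kgb 1). This is not taken from the linked thread. The process count of 44 is a hypothetical example, the 11 k-points and 150 bands are the values reported in this thread, and the filtering rules (npkpt no larger than the number of k-points, npband dividing nband, a small npfft) are simplifying assumptions rather than ABINIT's exact constraints.

```python
def divisors(n):
    """Return all positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def candidate_layouts(nproc, nkpt, nband, max_npfft=4):
    """List (npkpt, npband, npfft) with npkpt * npband * npfft == nproc.

    Assumed rules of thumb: npkpt <= nkpt, nband divisible by npband,
    and npfft kept small.
    """
    layouts = []
    for npkpt in divisors(nproc):
        if npkpt > nkpt:
            continue
        rest = nproc // npkpt
        for npband in divisors(rest):
            npfft = rest // npband
            if nband % npband == 0 and npfft <= max_npfft:
                layouts.append((npkpt, npband, npfft))
    # Prefer more k-point parallelism, then fewer FFT processes.
    layouts.sort(key=lambda t: (-t[0], t[2], -t[1]))
    return layouts


def as_abinit_lines(layout):
    """Format one layout as ABINIT input lines."""
    npkpt, npband, npfft = layout
    return f"paral_kgb 1\nnpkpt {npkpt}\nnpband {npband}\nnpfft {npfft}"


# Hypothetical: 44 MPI processes; 11 k-points and nband 150 as in this thread.
for layout in candidate_layouts(nproc=44, nkpt=11, nband=150):
    print(layout)
print(as_abinit_lines(candidate_layouts(44, 11, 150)[0]))
```

Each candidate would still need to be timed on the actual system; the ordering used here is only a starting guess.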

ryne60 · Post by **ryne60** » Fri Dec 14, 2018 5:47 pm

Hi Eric,

I made some trial calculations on similar systems with a much smaller unit cell, i.e. still 2D systems but with 2 atoms per unit cell, so the timing was a lot faster. In these cases, the setup with iprcel 44 finished faster than the setups with other iprcel values, so I simply used the same value for the bigger systems. Moreover, according to the manual, iprcel 45 will more likely give a large improvement, and since iprcel 44 is just one number lower, I thought it would give a similar performance to 45. But I also tried the default value; the conclusion about the speed per cycle is still more or less the same, i.e. much slower than VASP's speed per cycle.

Regarding autoparal: before I set this parameter, I got a message from ABINIT that my setup was not optimal, and it recommended "autoparal" in the input file. I again did some tests of this parameter on smaller systems, and autoparal 1 resulted in the fastest timing. So I just put it in the setup of my real system.

Anyway, thanks for the recommendation. I'll do more tests with autoparal.
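Because tests on a 2-atom cell may not carry over to a 30-atom cell, one option is to run only a few SCF steps of the large system itself for several iprcel values. The sketch below builds an ABINIT multi-dataset fragment for that purpose (ndtset plus indexed iprcel1, iprcel2, ...). The function name, the choice of values and the short nstep are assumptions for illustration; the fragment would have to be merged with the rest of the input, with any existing iprcel and nstep lines removed.

```python
def iprcel_scan_fragment(values, nstep=5):
    """Build an ABINIT multi-dataset fragment, one dataset per iprcel value.

    nstep is kept small on purpose: the goal is to compare the first few
    SCF steps, not to converge.
    """
    lines = [f"ndtset {len(values)}", f"nstep {nstep}"]
    for index, value in enumerate(values, start=1):
        lines.append(f"iprcel{index} {value}")
    return "\n".join(lines)


# 0 is the default preconditioner; 44 and 45 are the values discussed here.
print(iprcel_scan_fragment([0, 44, 45]))
```

The residuals printed per SCF step can then be compared across datasets. Wall time per dataset would have to be measured separately, for example by running each value as its own job.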

woffermans · Post by **woffermans** » Wed Jan 09, 2019 3:12 pm

Dear ryne60 and ABINIT friends,

Please don't forget to report your findings. It is of interest to all ABINIT friends if we can close the thread in some conclusive way.

ryne60 · Post by **ryne60** » Fri Jan 11, 2019 1:02 pm

Hi woffermans,

it seems the culprit was iprcel; because of this, the first few SCF cycles took so many hours. In my opinion, somehow the following note about iprcel in the online manual is a bit misleading:

For non-homogeneous relatively large cells iprcel = 45 will likely give a large improvement over iprcel = 0.

In my case the running time was certainly not improved (it was even much worse) compared to the setup without iprcel.

Another factor that led to the different performance is the wavefunction cutoff energy suggested by the pseudopotentials. I don't know how it is possible, but in most PAW cases the suggested cutoff energies are much lower in VASP than in ABINIT. However, if both are set up with a comparable cutoff energy, the running time becomes comparable too.

Other than this, I think the performance of ABINIT is quite comparable to VASP.

I hope the info can be useful to others.

Thanks again!
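The effect of the cutoff energy mentioned above can be estimated from the standard plane-wave count: the number of plane waves inside the cutoff sphere is roughly Ω(2·ecut)^(3/2)/(6π²) in atomic units, so it grows as ecut^(3/2). The sketch below uses that estimate; the 20 Ha value is a hypothetical higher cutoff chosen only for illustration (12 Ha is the ecut from the input in this thread), and the real cost per cycle also depends on the FFT grid, PAW terms and parallelization.

```python
import math


def planewave_count(volume_bohr3: float, ecut_ha: float) -> float:
    """Estimate the number of plane waves with |k+G|^2 / 2 <= ecut.

    volume_bohr3 is the cell volume in Bohr^3, ecut_ha the cutoff in Hartree.
    """
    return volume_bohr3 * (2.0 * ecut_ha) ** 1.5 / (6.0 * math.pi ** 2)


def planewave_ratio(ecut_a: float, ecut_b: float) -> float:
    """Ratio of plane-wave counts for two cutoffs in the same cell."""
    return (ecut_a / ecut_b) ** 1.5


# Hypothetical comparison: 20 Ha versus the 12 Ha used in this thread.
ratio = planewave_ratio(20.0, 12.0)
print(f"20 Ha needs about {ratio:.2f}x the plane waves of 12 Ha")
```

Under this estimate, raising the cutoff from 12 Ha to 20 Ha roughly doubles the basis size, which is consistent with run times becoming comparable once both codes use similar cutoffs.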

ketong · Post by **ketong** » Thu Apr 25, 2019 11:10 am

ryne60 wrote: it seems the culprit was iprcel; because of this, the first few SCF cycles took so many hours.

Hi ryne60,

In your latest reply, you identified "iprcel" as the culprit. Does that mean the line containing "iprcel" was deleted from your "in" file and the ABINIT calculation then finished in less than 10 minutes? Is that right?

ryne60 · Post by **ryne60** » Thu Apr 25, 2019 12:46 pm

Hi ketong,

Yes, it has been deleted since then! But since my systems are big, the calculation time (scfcv time) is still long, though now "comparable" to VASP, because the first few SCF cycles take significantly less time.

ketong · Post by **ketong** » Fri Apr 26, 2019 6:44 am

Dear ryne60,

Thank you for your information.

In addition, do you have some suggestions or tricks about the setting of "iprcel", "diemac" and "diemix"?

Thanks again!

ryne60 · Post by **ryne60** » Fri Apr 26, 2019 10:58 am

Hi ketong,

ketong wrote: In addition, do you have some suggestions or tricks about the setting of "iprcel", "diemac" and "diemix"?

Unfortunately, I don't! Sorry, I didn't do much testing of those parameters. Since my systems are too big, testing them would be time-consuming. I've been using the default values instead.

ketong · Post by **ketong** » Tue Apr 30, 2019 4:33 am

Hi ryne60,

Anyway, thank you very much!

ABINIT Discussion Forums
