## Memory use estimate in GW and BSE calculations

**Moderators:** maryam.azizi, bruneval

### Memory use estimate in GW and BSE calculations

Hi all (long time no see!)

Can someone provide a rough estimate of the memory needed per processor during the screening calculation and the BSE calculation? For example, when running with nbands bands and nkpts k-points on nprocs processors, how much memory per processor should be available? These calculations seem to be running well, but it is very hard to plan ahead which computer to run on, or how much memory to request from the queuing system, for a given calculation. Any guidance would be much appreciated.

Thanks,

Joe

Josef W. Zwanziger

Professor, Department of Chemistry

Canada Research Chair in NMR Studies of Materials

Dalhousie University

Halifax, NS B3H 4J3 Canada

jzwanzig@gmail.com


### Re: Memory use estimate in GW and BSE calculations

Hi Joe, it's good to see you again.

It's possible to obtain an estimate of the memory requirements by setting max_ncpus > 0 in the input file. The code will then print, in the main output file, a YAML document with the estimated parallel efficiency and the memory requirements for all parallel configurations up to `max_ncpus`.

It works for GS calculations, DFPT, GW and BSE.

In the case of GS runs with paral_kgb = 1, the report also includes the values of npfft, npband, etc. associated with the different numbers of MPI processes.

Note that the code exits immediately after printing the "--- !Autoparal" section, so this feature is not compatible with multiple datasets.

The memory reported is in MB and is expected to underestimate the actual usage.
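As a sketch, the dry run only requires adding one variable to an otherwise complete input; the value 64 below is a placeholder:

```
# Dry run: print the "--- !Autoparal" YAML section with estimated
# parallel efficiency and memory for 1..64 MPI processes, then exit.
max_ncpus 64

# ... the rest of the usual input (e.g. optdriver 3 for screening,
#     nband, ecuteps, ...) stays unchanged.
```

Grep the main output file for `Autoparal` to find the section, then pick the configuration whose reported memory per process fits the nodes you plan to request.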

The memory requirements in the GW/BSE part can be decreased with the fftgw and gwmem variables (in particular gwmem = 10).

I recommend gwpara = 2 in the screening and sigma parts, since gwpara = 1 is not able to distribute the wavefunctions (gwpara = 2 is now the default value in Abinit8).
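Putting these variables together, a memory-conscious screening input might contain something like the following sketch (values and comments are indicative; check the variable documentation for the meaning of each digit):

```
optdriver  3     # screening calculation
gwpara     2     # distribute wavefunctions over MPI processes (Abinit8 default)
gwmem      10    # keep u(G) in memory, recompute u(r) on the fly to save memory
fftgw      11    # use a coarser FFT mesh for the oscillator matrix elements
```

The same gwpara/gwmem settings apply to the sigma run (optdriver 4); gwmem and fftgw trade CPU time for memory, so expect somewhat longer runs.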

In the BSE code, the excitonic Hamiltonian is distributed, so its memory footprint per process decreases with the number of MPI processes, but the wavefunctions are not distributed.

It's possible to bypass the calculation of the SCR file altogether by using the model dielectric function (mdf_epsinf).
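A minimal BSE fragment using the model dielectric function instead of a precomputed SCR file might look like this sketch (the value of mdf_epsinf should be the macroscopic dielectric constant of your material; 12.0 is a placeholder):

```
optdriver        99    # BSE calculation
bs_coulomb_term  21    # build W from the model dielectric function (no SCR file)
mdf_epsinf       12.0  # macroscopic dielectric constant used by the model
```

This removes both the cost and the memory footprint of the screening step, at the price of a cruder description of W.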

BTW, I've recently added support for NC pseudos with more than one projector: in the latest version of trunk/develop it's possible to use inclvkb = 2 to compute [Vnl, r] with oncvpsp pseudopotentials. We are also working on oncvpsp pseudos for GW calculations.
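So, with a multi-projector oncvpsp pseudo, the commutator term is requested with a single variable (sketch):

```
inclvkb 2   # include the [Vnl, r] commutator in the long-wavelength limit (q -> 0)
```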

Best,

Matteo


### Re: Memory use estimate in GW and BSE calculations

Hi Matteo,

I am interested in the develop version's support for NC pseudos with more than one projector. How can I download this version? BTW, can you comment on GW calculations using PAW? According to the tutorial, GW calculations require PAW datasets with 3 projectors for each angular-momentum channel. Does that mean one needs to construct one's own PAW datasets to do GW calculations?

Best,

Xiaoming Wang

University of Toledo

