Resuming stopped DDB calculation  [SOLVED]

aliwho
Posts: 8
Joined: Mon Oct 31, 2016 3:56 am

Resuming stopped DDB calculation

Post by aliwho » Mon Oct 31, 2016 4:38 am

Hello,

I am working on calculating NLO responses in a system with 40 atoms. Following the tutorial on NLO, I have split my calculation into 5 datasets: "(1) self-consistent calculation in the IBZ; (2) non self-consistent calculations to get the wave-functions over the full BZ; (3) ddk calculation, (4) derivatives with respect to electric field and atomic displacements," and (5) the NLO step. From this, I expect two DDBs: one from step (4) and one from step (5). I have had success doing this (and the subsequent strain response calculation) for smaller cells.
From these calculations, I get 3 DDB files that I can combine with Mrgddb and analyze with Anaddb.

However, the calculations for the 40 atom cell I am working with are taking a very long time--longer than the maximum allowed job run time on the supercomputer I am working on. The first 3 datasets complete successfully, but I run out of time in dataset 4. Specifically, the response due to the displacement of each atom completes within the time limit. The code then tries to calculate the responses due to an electric field, but it runs out of time.

Can I resume this calculation from where it ended or slightly before where it was ended? This would be the best situation, since it would not require me to redo everything.

If not, can I split the responses due to the atomic displacements and to an electric field into two datasets? If so, would these two datasets each return a DDB? Could I then combine them and get the same DDB as I would have gotten from calculating the responses in a single dataset (as in step (4) above)?

Thanks,
-Ali

ebousquet
Posts: 469
Joined: Tue Apr 19, 2011 11:13 am
Location: University of Liege, Belgium

Re: Resuming stopped DDB calculation  [SOLVED]

Post by ebousquet » Wed Nov 30, 2016 9:26 am

Dear Ali,
You can indeed restart your calculation without redoing steps 1, 2 and 3.
If your system is too big, especially for NLO, I recommend running the different datasets separately and then merging the necessary information. To that end, read the wave functions, the ddk files and the first-order wave functions with the "ird" variables (irdwf, irdddk or ird1wf) instead of the "get" ones. To do so, rename your -o_ files to -i_ files, depending on how you defined them in your .files file.
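The renaming step can be scripted. Below is a minimal Python sketch, not an ABINIT tool: the roots `run_o` and `run_i` and the tags in `keep` are hypothetical placeholders, so replace them with the output and input roots from your own .files file and with the file types you actually need to read back.

```python
from pathlib import Path


def stage_outputs_as_inputs(workdir, out_root, in_root, keep=("WFK", "1WF")):
    """Rename '<out_root>_*' files to '<in_root>_*' when the name contains a tag in keep.

    out_root, in_root and keep are assumptions about your own naming scheme.
    Returns the new paths in the order they were processed.
    """
    staged = []
    for path in sorted(Path(workdir).iterdir()):
        name = path.name
        # Only touch regular files that carry the output root as a prefix.
        if not path.is_file() or not name.startswith(out_root + "_"):
            continue
        # Skip files (for example DDB files) that are not in the keep list.
        if not any(tag in name for tag in keep):
            continue
        target = path.with_name(in_root + name[len(out_root):])
        if target.exists():
            raise FileExistsError(f"refusing to overwrite {target}")
        path.rename(target)
        staged.append(target)
    return staged
```

Calling `stage_outputs_as_inputs(".", "run_o", "run_i")` in the run directory would rename the matching files in place; the function refuses to overwrite an existing input file rather than guessing which copy is current.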
If your system is too big, it is even possible that a single dataset will not finish within the time limit. In that case, you may have to split the calculation by running the different perturbations in separate calculations and merging at the end. You can also run the atomic displacement perturbations (rfatpol) separately and merge at the end.
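As an illustration of splitting the atomic displacement perturbations, here is a small Python sketch that divides the atoms into contiguous ranges, one per job, and prints a matching `rfatpol` line for each. The chunk size of 16 is an arbitrary assumption; choose it so that each job fits in your wall-clock limit.

```python
def rfatpol_chunks(natom, atoms_per_job):
    """Split atoms 1..natom into contiguous (first, last) ranges, one per job."""
    if natom < 1 or atoms_per_job < 1:
        raise ValueError("natom and atoms_per_job must be positive")
    return [
        (first, min(first + atoms_per_job - 1, natom))
        for first in range(1, natom + 1, atoms_per_job)
    ]


# Example for a 40-atom cell with at most 16 atoms per job (assumed chunk size).
for first, last in rfatpol_chunks(40, 16):
    print(f"rfatpol {first} {last}")
```

Each job run with one of these ranges should produce its own DDB, and the partial DDBs can then be combined with Mrgddb as in the smaller-cell workflow.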
Let me know if you got it.
Best wishes,
Eric

aliwho
Posts: 8
Joined: Mon Oct 31, 2016 3:56 am

Re: Resuming stopped DDB calculation

Post by aliwho » Mon Dec 05, 2016 4:31 am

Thanks Eric, I think I understand the necessary steps now!

Best,
-Ali

shadimeshkat
Posts: 1
Joined: Wed Nov 13, 2019 3:35 pm

Restarting Nlo calculation

Post by shadimeshkat » Thu Nov 14, 2019 4:03 pm

Hi all, I am doing the nonlinear calculation for a system of 300 atoms and eventually 900 atoms. I am using the tutorial on the ABINIT website and I have set up the three Databases. Although I am using 1200 CPUs for 24 hours, the calculation does not converge in that timeframe and the input_DS5_DDB file is not generated, so I think I need to be able to restart the calculation. I have tried running each dataset separately, and I could get the first four datasets converged in a reasonable timeframe, about 5 hours on 40 CPUs (with the input_DS4_DDB file generated), but the dataset 5 calculation (the non-linear calculation) does not complete. Could anybody help me figure out how to get the input_DS5_DDB file for a large system like mine?
I have attached my input file.
Thank you.
Shadi
Attachment: input.in (My input file, 13.69 KiB)

ebousquet
Posts: 469
Joined: Tue Apr 19, 2011 11:13 am
Location: University of Liege, Belgium

Re: Resuming stopped DDB calculation

Post by ebousquet » Thu Nov 21, 2019 3:24 pm

Dear Shadi,
When you are limited by a wall-clock limit and the calculation cannot finish within it, check how many SCF steps the calculation completed in the time available (say N) and re-run it with nstep = N-2 or so. This ensures that the calculation reaches nstep and therefore writes the WF output files (if prtwf=1). You can then re-run the calculation, reading the previous WF file(s) by setting the corresponding ird flag and renaming the output WF file to the input file name.
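The bookkeeping for this recipe can be sketched in a few lines of Python. This is only an illustration: `step_pattern` is a regular expression you must supply after looking at how SCF iterations appear in your own log file (no particular log format is assumed here), and the margin of 2 follows the "N-2 or so" suggestion above.

```python
import re


def count_scf_steps(log_text, step_pattern):
    """Count log lines that match the caller-supplied SCF-iteration pattern."""
    return sum(1 for line in log_text.splitlines() if re.search(step_pattern, line))


def restart_nstep(completed_steps, margin=2):
    """Return an nstep a little below the number of steps that fit in the time limit."""
    if completed_steps - margin < 1:
        raise ValueError("too few completed steps to leave a safety margin")
    return completed_steps - margin
```

For example, if 30 SCF steps completed before the job was killed, `restart_nstep(30)` gives 28, which would be the value to put in the input file for the next run.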
I'm not sure whether there is a flag to print the WF files every SCF step or so (prtwf=-1 does not seem to cover this case), or whether there is an AbiPy script that handles this.
All the best,
Eric
