Parallelization and convergence
Parallelization and convergence
I am trying to get the attached input program to run faster and with greater convergence and without an error, I tried mpirun which works but only improves the speed by 35% and doesn't improve the convergence. Also, I tried many combinations and values for npband and bandpp based on the output from a run with autoparal = 1 but in every case, after 5 iterations, I get the error message
Src_file: m_lobpcg2.F90
Src_line: 525
Mpi_rank: 0
Message: I'm so sorry I couldn't make it, I did my best but I failed. Sorry. I am gonna suicide.
I attached the error file, logs041720par21 for your information.
Can you please tell me what I am doing wrong in my attempt to get the program to parallelize. (The server I am using has 24 cores and 2 threads/core)
About convergence,  another version of the program running on just 1 CPU reaches tolvrs ~1e10 after about 120 steps, but then oscillates in the 1e10 and 1e11 range for the remaining 360 steps. nstep = 480. The accuracy criterion is set to tolvrs 1e12. I was hoping by using a large block size in parallelizing the input file, I would improve the convergence, but I get the above error instead.
Please help!!
Eliezer
Src_file: m_lobpcg2.F90
Src_line: 525
Mpi_rank: 0
Message: I'm so sorry I couldn't make it, I did my best but I failed. Sorry. I am gonna suicide.
I attached the error file, logs041720par21 for your information.
Can you please tell me what I am doing wrong in my attempt to get the program to parallelize. (The server I am using has 24 cores and 2 threads/core)
About convergence,  another version of the program running on just 1 CPU reaches tolvrs ~1e10 after about 120 steps, but then oscillates in the 1e10 and 1e11 range for the remaining 360 steps. nstep = 480. The accuracy criterion is set to tolvrs 1e12. I was hoping by using a large block size in parallelizing the input file, I would improve the convergence, but I get the above error instead.
Please help!!
Eliezer
Last edited by erichmond on Sat Apr 25, 2020 6:29 am, edited 3 times in total.
Re: Parallelization and convergence
Dear Eliezer,
For the convergence of the SCF you have to play with preconditioning and mixing parameters: see diemix, diemixmag (if magnetic), nline, diemac as a 1st attempt.
I see you have diemix=1 but it is strongly not advised to put 1! Reduce it should probably help your convergence problem.
Best wishes,
Eric
For the convergence of the SCF you have to play with preconditioning and mixing parameters: see diemix, diemixmag (if magnetic), nline, diemac as a 1st attempt.
I see you have diemix=1 but it is strongly not advised to put 1! Reduce it should probably help your convergence problem.
Best wishes,
Eric
Re: Parallelization and convergence
Eric,
Thank you for your response.
I should mention that these calculations are on a slab of 3 layers of Pd atoms and 7 layers of vacuum. I am not yet treating the system for its magnetic properties.
My input file already has the following statements
iprcel 45
iscf 7
npulayit 30
diemac 1e+8
diemix 1
nnsclo 3
nband 48
nline 6
optforces 0
nbdbuf 6
in order to improve the convergence and speed, and it works to a point.
I will try diemix = 0.5 in a run. I also included dielng = 0.25.
But I still have the problem parallelizing my program. The parallelization is complicated by the error I get.
Eliezer
Thank you for your response.
I should mention that these calculations are on a slab of 3 layers of Pd atoms and 7 layers of vacuum. I am not yet treating the system for its magnetic properties.
My input file already has the following statements
iprcel 45
iscf 7
npulayit 30
diemac 1e+8
diemix 1
nnsclo 3
nband 48
nline 6
optforces 0
nbdbuf 6
in order to improve the convergence and speed, and it works to a point.
I will try diemix = 0.5 in a run. I also included dielng = 0.25.
But I still have the problem parallelizing my program. The parallelization is complicated by the error I get.
Eliezer
Re: Parallelization and convergence
Dear Eliezer,
Another problem I did not notice is that the ecut you use is way too small (you start with ecut=6).
For normconserving pseudopotentials from pseudodojo project, the ecut can be between 25 and 50. For PAW it can be between 14 and 30.
If the ecut is too small, the program can run into troubles to converge the SCF.
Best wishes,
Eric
Another problem I did not notice is that the ecut you use is way too small (you start with ecut=6).
For normconserving pseudopotentials from pseudodojo project, the ecut can be between 25 and 50. For PAW it can be between 14 and 30.
If the ecut is too small, the program can run into troubles to converge the SCF.
Best wishes,
Eric
Re: Parallelization and convergence
Eric,
The object of my computations is to develop high accuracy workfunctions for transition metals as a function of ecut = 6  60, spin, surface relaxation, Fermi energy, and slab thickness. So far I have only been working with variations as a function of the number of plane waves for Ni and Pd. When I relaxed a bulk structure as a function of ngkpt, tsmear, and acell ABINIT chose a value of ecut ~ 60.
Up to this point, I have been using the pseudopotentials from the ONCVPSPPBEPDv0.4 library. Is pseudodojo the same as ONCVPSPPBEPDv0.4? If so, I have a problem many of my results are unreliable. So far the computations on the slab for 6  40 have converged to tolvrs 1e12.
The addition of diemix 0.5 and dielng 0.25 did improve the convergence slightly where tolvrs ~ 5e12 after 240 iterations instead of tolvrs ~ 1e11, but still not at tolvrs 1e12. What other preconditioning variables can I try that I am not already using?
Eliezer
PS: I computed ETOT for diemix = .2, .4, .6, .8, 1.0 with all the other parameters the same. For all values of diemix for sufficiently high iterations, ETOT oscillated in the range 3E12< ETOT < 2e11 Ha. For 0.6, 0.8, 1.0 the oscillations started around 40 interactions. For 0.2 and 0.4, the oscillations occurred around 100 and 60 iterations respectively.
The object of my computations is to develop high accuracy workfunctions for transition metals as a function of ecut = 6  60, spin, surface relaxation, Fermi energy, and slab thickness. So far I have only been working with variations as a function of the number of plane waves for Ni and Pd. When I relaxed a bulk structure as a function of ngkpt, tsmear, and acell ABINIT chose a value of ecut ~ 60.
Up to this point, I have been using the pseudopotentials from the ONCVPSPPBEPDv0.4 library. Is pseudodojo the same as ONCVPSPPBEPDv0.4? If so, I have a problem many of my results are unreliable. So far the computations on the slab for 6  40 have converged to tolvrs 1e12.
The addition of diemix 0.5 and dielng 0.25 did improve the convergence slightly where tolvrs ~ 5e12 after 240 iterations instead of tolvrs ~ 1e11, but still not at tolvrs 1e12. What other preconditioning variables can I try that I am not already using?
Eliezer
PS: I computed ETOT for diemix = .2, .4, .6, .8, 1.0 with all the other parameters the same. For all values of diemix for sufficiently high iterations, ETOT oscillated in the range 3E12< ETOT < 2e11 Ha. For 0.6, 0.8, 1.0 the oscillations started around 40 interactions. For 0.2 and 0.4, the oscillations occurred around 100 and 60 iterations respectively.
Re: Parallelization and convergence
What do you mean by "Abinit chose ecut~60"? Do you mean that your convergence tests with ecut give a good resolution on the property you are looking at for ecut=60?erichmond wrote: ↑Wed Apr 22, 2020 11:29 pmThe object of my computations is to develop high accuracy workfunctions for transition metals as a function of ecut = 6  60, spin, surface relaxation, Fermi energy, and slab thickness. So far I have only been working with variations as a function of the number of plane waves for Ni and Pd. When I relaxed a bulk structure as a function of ngkpt, tsmear, and acell ABINIT chose a value of ecut ~ 60.
Yes, ONCVPSPPBEPDv0.4 is the same as what you have on pseudodojo.erichmond wrote: ↑Wed Apr 22, 2020 11:29 pmUp to this point, I have been using the pseudopotentials from the ONCVPSPPBEPDv0.4 library. Is pseudodojo the same as ONCVPSPPBEPDv0.4? If so, I have a problem many of my results are unreliable. So far the computations on the slab for 6  40 have converged to tolvrs 1e12.
What do you mean unreliable, is it because you cannot go below 1E12 on the density/potential resolution?
Maybe you are in a more pathological case...?erichmond wrote: ↑Wed Apr 22, 2020 11:29 pmThe addition of diemix 0.5 and dielng 0.25 did improve the convergence slightly where tolvrs ~ 5e12 after 240 iterations instead of tolvrs ~ 1e11, but still not at tolvrs 1e12. What other preconditioning variables can I try that I am not already using?
Eliezer
PS: I computed ETOT for diemix = .2, .4, .6, .8, 1.0 with all the other parameters the same. For all values of diemix for sufficiently high iterations, ETOT oscillated in the range 3E12< ETOT < 2e11 Ha. For 0.6, 0.8, 1.0 the oscillations started around 40 interactions. For 0.2 and 0.4, the oscillations occurred around 100 and 60 iterations respectively.
OK for diemix, sounds like a too small value just shifts the problem to higher number of SCF.
You used iprcel 45, did test if this is indeed the best? What about without it?
You can increase nline to 10 in case, it can be less computer demanding than increasing nnsclo (leave it by default and increase nline first).
Another remark, how many CPU are you using because in your input you put npfft 4 npband 24 npkpt 4 bandpp 2, meaning that you want to run on npfft*npband*npkpt=384 CPU times two threads per CPU (bandpp), is it correct?
Best wishes,
Eric
Re: Parallelization and convergence
Eric,
I appreciate your detailed response. I'll respond item by item.
In the output of a relaxation run where I varied ngkpt[ 2 2 2, 4 4 4, 6 6 6, 8 8 8, 10 10 10] and tsmear [.0001 to .0005 by .0001 steps], there appeared, at the beginning of a dataset and right before the pseudopotential description, the following
Angles(23, 13, 12)= 90 90 120
getcut: wavevector= 0.0 0.0 0.0 ngfft= 36 36 120
ecut(hartree)= 58.190 => boxcut(ratio) 2.01652
Normally, here the value of ecut just equals the value I insert in the input file. This is where I obtained the value of ecut for further computations. The input value of ecut was 44,
I jumped to the conclusion that my computations for the 1DM files [ecut = 6, 8, 10, 12, 14, 16, 18, 10, 22, 24 and 52, 52, 56, 58, and 60] were unreliable because I was using the pseudopotential in a forbidden range, and therefore couldn't be relied upon.
I might mention for bulk computations, a unit cell without vacuum, all computations for ecut = 6  60 converged, and for slab calculations, 3 atom layers and 7 vacuum layers, with ecut <= 40 all converged to tolvrs = 1e12,. So I don't know why slab computations don't converge for ecut = 4260.
With regard to iprcel = 45, I found that idea in the base Tutorial tbase4_7 where he recommended using this input term over dielng and diemax to precondition the computation when dealing with Al layers and vacuum. The writeup on iprcel is too brief to give me an intelligent way to vary iprcel. In that tutorial, the author also suggests using diemac = 35 and dielng = 0.2 for the Alvacuum system. I tried this latter computation with diemax = 4 and dielng = 0.2 and after 240 iterations the last five iterations of tolvrs gave an average of 1.2 e11 which is higher than the variations of diemix which was in the range of 79 e12.
My early computations did not include iprcel,diemix, diemac, npulayit, nline, optforces, nbdbuf, or dielng, and the convergence for some slab runs, which didn't converge to tolvrs = 1 e12, were orders of magnitude worse.
I am currently running computations where I vary diemac (1e5 to 1e9 by factors of 10)and nline (4, 5, 6, 7, 8, 9, 10). Each computation takes ~9s/iteration so the runs are taking days. The results so far are
diemac = 1 e+5 Average of last 5 iterations of tolvrs = 1.28 e11 Iteration where tolvrs ~e10 = 85
nline = 4 Average of last 5 iterations of tolvrs = 1.15 e11 Iteration where tolvrs ~e10 = 40
The programs for these tests are still running and will take days to complete if not weeks.
The server I am using has 24 CPU's with 2 threads each. Therefore I made npband = 24 and bandpp = 2 to generate the largest block. I ran an autoparal run with the following results
npband 6 8 6 6 6
npfft 8 6 6 8 6
bandpp 1 1 1 2 2
where the weights decreased from left too right. I am not sure where I obtained the values for npfft and npkpt which I used.
Thanks again for your help.
Eliezer
I appreciate your detailed response. I'll respond item by item.
In the output of a relaxation run where I varied ngkpt[ 2 2 2, 4 4 4, 6 6 6, 8 8 8, 10 10 10] and tsmear [.0001 to .0005 by .0001 steps], there appeared, at the beginning of a dataset and right before the pseudopotential description, the following
Angles(23, 13, 12)= 90 90 120
getcut: wavevector= 0.0 0.0 0.0 ngfft= 36 36 120
ecut(hartree)= 58.190 => boxcut(ratio) 2.01652
Normally, here the value of ecut just equals the value I insert in the input file. This is where I obtained the value of ecut for further computations. The input value of ecut was 44,
I jumped to the conclusion that my computations for the 1DM files [ecut = 6, 8, 10, 12, 14, 16, 18, 10, 22, 24 and 52, 52, 56, 58, and 60] were unreliable because I was using the pseudopotential in a forbidden range, and therefore couldn't be relied upon.
I might mention for bulk computations, a unit cell without vacuum, all computations for ecut = 6  60 converged, and for slab calculations, 3 atom layers and 7 vacuum layers, with ecut <= 40 all converged to tolvrs = 1e12,. So I don't know why slab computations don't converge for ecut = 4260.
With regard to iprcel = 45, I found that idea in the base Tutorial tbase4_7 where he recommended using this input term over dielng and diemax to precondition the computation when dealing with Al layers and vacuum. The writeup on iprcel is too brief to give me an intelligent way to vary iprcel. In that tutorial, the author also suggests using diemac = 35 and dielng = 0.2 for the Alvacuum system. I tried this latter computation with diemax = 4 and dielng = 0.2 and after 240 iterations the last five iterations of tolvrs gave an average of 1.2 e11 which is higher than the variations of diemix which was in the range of 79 e12.
My early computations did not include iprcel,diemix, diemac, npulayit, nline, optforces, nbdbuf, or dielng, and the convergence for some slab runs, which didn't converge to tolvrs = 1 e12, were orders of magnitude worse.
I am currently running computations where I vary diemac (1e5 to 1e9 by factors of 10)and nline (4, 5, 6, 7, 8, 9, 10). Each computation takes ~9s/iteration so the runs are taking days. The results so far are
diemac = 1 e+5 Average of last 5 iterations of tolvrs = 1.28 e11 Iteration where tolvrs ~e10 = 85
nline = 4 Average of last 5 iterations of tolvrs = 1.15 e11 Iteration where tolvrs ~e10 = 40
The programs for these tests are still running and will take days to complete if not weeks.
The server I am using has 24 CPU's with 2 threads each. Therefore I made npband = 24 and bandpp = 2 to generate the largest block. I ran an autoparal run with the following results
npband 6 8 6 6 6
npfft 8 6 6 8 6
bandpp 1 1 1 2 2
where the weights decreased from left too right. I am not sure where I obtained the values for npfft and npkpt which I used.
Thanks again for your help.
Eliezer
Re: Parallelization and convergence
Dear Eliezer,
OK, sounds like it is a bit pathological case (metallic multilayers in vacuum are often harder to converge), thought reaching tolvrs to 1E12 is not bad at all! Which property are you looking for in this system that needs to go beyond that?
Can you show what a grep ETOT gives in one of your cases that are not OK, such that I can see if it has some strange behavior before?
Otherwise, playing with the different preconditioners/mixing parameters is the way to go...
The iprcell is not highly described, indeed, but you can find some more technical discussion in the following paper:
https://journals.aps.org/prb/abstract/1 ... .78.045126
Best wishes,
Eric
OK, sounds like it is a bit pathological case (metallic multilayers in vacuum are often harder to converge), thought reaching tolvrs to 1E12 is not bad at all! Which property are you looking for in this system that needs to go beyond that?
Can you show what a grep ETOT gives in one of your cases that are not OK, such that I can see if it has some strange behavior before?
Otherwise, playing with the different preconditioners/mixing parameters is the way to go...
The iprcell is not highly described, indeed, but you can find some more technical discussion in the following paper:
https://journals.aps.org/prb/abstract/1 ... .78.045126
Best wishes,
Eric
Re: Parallelization and convergence
Eric,
I am sorry to hear that you might think that my problem is pathological, which sounds like a black hole for unsolvable problems. Nonetheless, I will continue my preconditioning investigations.
Attached fine the file "PreConditioner parameter evaluation", which gives the results I have found so far. To give you a frame of reference, the constant parameters for each run are iprcel 45, dielng .25, diemac 1e+8, diemix .5, iscf 7, npulayit 30, nnsclo 3, nline 6, optforces 0, nbdbuf 20, nband 144. Each table in the attached file has 4 columns: first giving the value of the parameter being tested; second, giving the average and standard deviation of the 236 to 240 iteration values of vres; third the 240th vres value; and last, the iteration where vres first reaches the e10 range. I consider the 2nd and 4th columns to be significant because the standard deviation of vres gives some idea of how noisy vres is at the 240th iteration; and, the 4th column gives some idea of how fast that particular run is converging. Based on these criteria I choose nline 7, diemix 0.6, and diemac 1 e+6. The experiment trying diemac 4 and dielng 0.2 instead of iprcel 45 gave about the same results as iprcel 45.
I am going to test iprcel for values 0, 25, 35, 45, 55, 65, 145 and dielng 0.2 to 1.0 in steps of ,2.
Also I have attached the ETOT file " ETOTnline042820.out", as you requested, for a test of nline 6 and 7. For the second dataset 2 (nline 7), the program quit running, for some unknown reason, after 103 iterations. There were no comments in the log file.
Eliezer
I am sorry to hear that you might think that my problem is pathological, which sounds like a black hole for unsolvable problems. Nonetheless, I will continue my preconditioning investigations.
Attached fine the file "PreConditioner parameter evaluation", which gives the results I have found so far. To give you a frame of reference, the constant parameters for each run are iprcel 45, dielng .25, diemac 1e+8, diemix .5, iscf 7, npulayit 30, nnsclo 3, nline 6, optforces 0, nbdbuf 20, nband 144. Each table in the attached file has 4 columns: first giving the value of the parameter being tested; second, giving the average and standard deviation of the 236 to 240 iteration values of vres; third the 240th vres value; and last, the iteration where vres first reaches the e10 range. I consider the 2nd and 4th columns to be significant because the standard deviation of vres gives some idea of how noisy vres is at the 240th iteration; and, the 4th column gives some idea of how fast that particular run is converging. Based on these criteria I choose nline 7, diemix 0.6, and diemac 1 e+6. The experiment trying diemac 4 and dielng 0.2 instead of iprcel 45 gave about the same results as iprcel 45.
I am going to test iprcel for values 0, 25, 35, 45, 55, 65, 145 and dielng 0.2 to 1.0 in steps of ,2.
Also I have attached the ETOT file " ETOTnline042820.out", as you requested, for a test of nline 6 and 7. For the second dataset 2 (nline 7), the program quit running, for some unknown reason, after 103 iterations. There were no comments in the log file.
Eliezer
 Attachments

 PreConditioner parameter evaluation.docx
 (11.72 KiB) Downloaded 11 times

 ETOTnline042820.out
 (20.72 KiB) Downloaded 11 times
Re: Parallelization and convergence
Dear Eliezer,
Thank you for your feedback. Regarding the convergence tests of diemac, diemix, etc it looks like nline and diemac does not change anything and diemix=0.8 is the best value. The SCF is quite smooth without bumps such that this is quite good.
For iprcell, I don't have much experience with it, just one inhomogeneous dielectric system (BaO/BaTiO3 superlattice) which was not converging with most usual parameters (400 SCF to converge) but worked fantastically well with iprcell=41 (10 SCF to converge).
Now, I come back to my question, do you really need to go below 1E10 or 1E11 on tolvrs? When looking at your grep ETOT file, we can see that tolvrs is indeed stuck at 1E11 but the residual on the energy (first column) is very well converged (1E10/1E11) as well as the residual on the wavefunction (second column, 1E27). So, did you test physical properties you are interested in vs tolvrs, because maybe 1E11 this is way enough?
Best wishes,
Eric
Thank you for your feedback. Regarding the convergence tests of diemac, diemix, etc it looks like nline and diemac does not change anything and diemix=0.8 is the best value. The SCF is quite smooth without bumps such that this is quite good.
For iprcell, I don't have much experience with it, just one inhomogeneous dielectric system (BaO/BaTiO3 superlattice) which was not converging with most usual parameters (400 SCF to converge) but worked fantastically well with iprcell=41 (10 SCF to converge).
Now, I come back to my question, do you really need to go below 1E10 or 1E11 on tolvrs? When looking at your grep ETOT file, we can see that tolvrs is indeed stuck at 1E11 but the residual on the energy (first column) is very well converged (1E10/1E11) as well as the residual on the wavefunction (second column, 1E27). So, did you test physical properties you are interested in vs tolvrs, because maybe 1E11 this is way enough?
Best wishes,
Eric
Re: Parallelization and convergence
Eric,
I completed the tests on iprcel and dielng which are included in an updated version of "PreConditioner parameter evaluation" (attached). Also I put the optimum choices of the preconditioning parameters in a slab run with nstep = 2000. The results were better (iteration 19962000 =>4.72e12+2.69e12, but vres had still not reached 1e12. Perhaps 4000 iterations would work, but that seems excessive.
I also included in the "PreConditioner ,,," attachment two plots. The first plot is one of workfunction vs ecut and the second is a plot of the Fermi energy vs. ecut. It is obvious for ecut< 25 there is a sharp decrease in the workfunction and Fermi energy. These results are why I need to know whether these results are Valid for ecut < 25. If you don't know the answer, could you please point me to someone or some paper that could help.
Lastly, I have not yet, but will try tonight, to compare the workfunction vs potential residual for vres = 10e9, 10, 11, 12 and 13 for say nstep = 500. ETOT converges for ecut < 42, so I will set ecut = 38 or 40.
Thank you for your continuing assistance,
Eliezer
I completed the tests on iprcel and dielng which are included in an updated version of "PreConditioner parameter evaluation" (attached). Also I put the optimum choices of the preconditioning parameters in a slab run with nstep = 2000. The results were better (iteration 19962000 =>4.72e12+2.69e12, but vres had still not reached 1e12. Perhaps 4000 iterations would work, but that seems excessive.
I also included in the "PreConditioner ,,," attachment two plots. The first plot is one of workfunction vs ecut and the second is a plot of the Fermi energy vs. ecut. It is obvious for ecut< 25 there is a sharp decrease in the workfunction and Fermi energy. These results are why I need to know whether these results are Valid for ecut < 25. If you don't know the answer, could you please point me to someone or some paper that could help.
Lastly, I have not yet, but will try tonight, to compare the workfunction vs potential residual for vres = 10e9, 10, 11, 12 and 13 for say nstep = 500. ETOT converges for ecut < 42, so I will set ecut = 38 or 40.
Thank you for your continuing assistance,
Eliezer
 Attachments

 PreConditioner parameter evaluation.docx
 (27.65 KiB) Downloaded 13 times
Re: Parallelization and convergence
Dear Eliezer,
You could also make a second plot with [(value(ecut)value(ecut_max))/value(ecut_max)]*100 vs ecut to have it like a percentage of error so that you can say you have a precision of 1% at ecut=XX, 10% at ecut=YY, etc. This will help you to decide which ecut you have to use at the end.
The same applies vs tolvrs.
Best wishes,
Eric
more than a few 100 steps is already excessive and a few 1000 sounds not to help a lot here (I mean, the precision gained is not worth the CPU cost).I completed the tests on iprcel and dielng which are included in an updated version of "PreConditioner parameter evaluation" (attached). Also I put the optimum choices of the preconditioning parameters in a slab run with nstep = 2000. The results were better (iteration 19962000 =>4.72e12+2.69e12, but vres had still not reached 1e12. Perhaps 4000 iterations would work, but that seems excessive.
To help in visualizing your convergence tests, you could plot the residual instead of the total value vs ecut (by residual I mean subtracting all results by the best converged case, i.e. the highest ecut you went)? A log scale on the y axis might be helpful for this convergence test vs ecut.I also included in the "PreConditioner ,,," attachment two plots. The first plot is one of workfunction vs ecut and the second is a plot of the Fermi energy vs. ecut. It is obvious for ecut< 25 there is a sharp decrease in the workfunction and Fermi energy. These results are why I need to know whether these results are Valid for ecut < 25. If you don't know the answer, could you please point me to someone or some paper that could help.
You could also make a second plot with [(value(ecut)value(ecut_max))/value(ecut_max)]*100 vs ecut to have it like a percentage of error so that you can say you have a precision of 1% at ecut=XX, 10% at ecut=YY, etc. This will help you to decide which ecut you have to use at the end.
The same applies vs tolvrs.
You could test the convergence for tolvrs starting at 1E05 or 1E04 and see how it goes every 1E01.Lastly, I have not yet, but will try tonight, to compare the workfunction vs potential residual for vres = 10e9, 10, 11, 12 and 13 for say nstep = 500. ETOT converges for ecut < 42, so I will set ecut = 38 or 40.
You're welcomeThank you for your continuing assistance,
Best wishes,
Eric
Re: Parallelization and convergence
Eric,
I finished calculating the workfunction for values of 1d4 to 1d13. The results are at the end of the attached document. Included is a plot of the difference of the largest value of the workfunction, which occurs at tolvrs = 1d10, from each of the others. The work function appears to be independent of the values for tolvrs, unlike the behavior of the workfunction with respect to the number of plane waves. I will plot the difference and its percentage of ecut vs. WF tomorrow.
Eliezer
I finished calculating the workfunction for values of 1d4 to 1d13. The results are at the end of the attached document. Included is a plot of the difference of the largest value of the workfunction, which occurs at tolvrs = 1d10, from each of the others. The work function appears to be independent of the values for tolvrs, unlike the behavior of the workfunction with respect to the number of plane waves. I will plot the difference and its percentage of ecut vs. WF tomorrow.
Eliezer
 Attachments

 PreConditioner parameter evaluation.docx
 (54.07 KiB) Downloaded 13 times
Re: Parallelization and convergence
Dear Eliezer,
OK, we can thus see that you don't need such a strict precision on tolvrs to get correct precision on the property you want to compute (WF here) and asking for more would be CPU time waste. Now you can safely test the convergence w.r.t. ecut and kpt.
Best wishes,
Eric
OK, we can thus see that you don't need such a strict precision on tolvrs to get correct precision on the property you want to compute (WF here) and asking for more would be CPU time waste. Now you can safely test the convergence w.r.t. ecut and kpt.
Best wishes,
Eric
Re: Parallelization and convergence
Eric,
I'm sorry that I did not get back to you earlier but I spent the last week recalculating the workfunction of Pd so that there is no systematic error. The result is shown in Figure 1 of the attachment.
Unfortunately, the workfunction does not monotonically converge to some limit as a function of ecut as the Fermi energy does. My original goal was to determine this limit. Taking your suggestion, I plotted in Figure 2 the difference of the value of the workfunction for ecut = 60 from all the other values and plotted the result on a loglinear plot. You might notice that the values in Figure 2 for ecut = 26 snd 28 are missing because they are negative. Further, I computed the percent difference of each value of the workfunction from the value at ecut = 60 and plotted it in Figure 3.
The general characteristic in Figures 2 and 3 is that the difference in the workfunction values, as ecut increases, decreases.
Thanks again for your assistance and guidance.
Eliezer
I'm sorry that I did not get back to you earlier but I spent the last week recalculating the workfunction of Pd so that there is no systematic error. The result is shown in Figure 1 of the attachment.
Unfortunately, the workfunction does not monotonically converge to some limit as a function of ecut as the Fermi energy does. My original goal was to determine this limit. Taking your suggestion, I plotted in Figure 2 the difference of the value of the workfunction for ecut = 60 from all the other values and plotted the result on a loglinear plot. You might notice that the values in Figure 2 for ecut = 26 snd 28 are missing because they are negative. Further, I computed the percent difference of each value of the workfunction from the value at ecut = 60 and plotted it in Figure 3.
The general characteristic in Figures 2 and 3 is that the difference in the workfunction values, as ecut increases, decreases.
Thanks again for your assistance and guidance.
Eliezer
 Attachments

 Plots of the Pd workfunction vs ecut.docx
 (41.3 KiB) Downloaded 23 times
Re: Parallelization and convergence
Dear Eliezer,
OK, it sounds like you can get easily an error bar of 1% but going lower is CPU time demanding. At this stage you need to know if 1% precision on the WF is enough for you? If so, ecut=1620 look good.
There is something making a noise oscillation but I don't know what can be the origin, which pseupotentials are you using?
Best wishes,
Eric
PS: If you are using PAW, did you make pawecutdg evolving with ecut in your convergence tests?
OK, it sounds like you can get easily an error bar of 1% but going lower is CPU time demanding. At this stage you need to know if 1% precision on the WF is enough for you? If so, ecut=1620 look good.
There is something making a noise oscillation but I don't know what can be the origin, which pseupotentials are you using?
Best wishes,
Eric
PS: If you are using PAW, did you make pawecutdg evolving with ecut in your convergence tests?
Re: Parallelization and convergence
Eric
I was hoping to get an accuracy of 0.2% ~ 0.01eV, but this may not be practical.
I am using PBE GGA xc, ixc = 11 and pseudodojo pseudopotentials.
There is an oscillation in the tolvrs and toldfe values in the output file. Also, there is an oscillation in the vacuum portion of the 1DM files which decreases with increasing ecut (see the last two graphs in the attachment). I thought that the vacuum portion should rise to some flat region and then decrease.
Eliezer
I was hoping to get an accuracy of 0.2% ~ 0.01eV, but this may not be practical.
I am using PBE GGA xc, ixc = 11 and pseudodojo pseudopotentials.
There is an oscillation in the tolvrs and toldfe values in the output file. Also, there is an oscillation in the vacuum portion of the 1DM files which decreases with increasing ecut (see the last two graphs in the attachment). I thought that the vacuum portion should rise to some flat region and then decrease.
Eliezer
 Attachments

 Plots of the Pd workfunction vs ecut.docx
 (65.56 KiB) Downloaded 10 times
Re: Parallelization and convergence
Then, according to your convergence test, you have to use a large enough ecut to reach this precision...
Here it will be the limit of my help because I've never calculated work functions...
Did you see this document that might help you:
https://docs.abinit.org/guide/work_func ... orial.tex
Best wishes,
Eric
Re: Parallelization and convergence
Eric
Yes. This was the article along with a Phys Rev B article by Verstraete that allowed me to get started calculating work functions.
Many thanks for all your help and guidance!!
Sayonara
Eliezer
Yes. This was the article along with a Phys Rev B article by Verstraete that allowed me to get started calculating work functions.
Many thanks for all your help and guidance!!
Sayonara
Eliezer