Convergence rates for k-point grid over different planewave cutoffs

Moderators: jzwanzig, jolafc

Post Reply
jbde
Posts: 3
Joined: Mon Jun 22, 2020 6:54 pm

Convergence rates for k-point grid over different planewave cutoffs

Post by jbde » Tue Aug 04, 2020 6:31 pm

I'm trying to understand the convergence behaviour of some ground state calculations, and was hoping that someone here might be able to assist (even if just to point out my obvious mistakes!), or perhaps refer me to answers in the literature.

In particular, I'm interested in the general process of selecting values for the planewave cutoff (e_cut) and k-point grid parameters. Although it's very common to read statements like "the calculation should be converged with respect to the number of k-points" in tutorials and documentation, I've spent quite some time investigating this and have found the literature to be surprisingly sparse on rigorous detail.

For the sake of what I'm about to write, I assume that parameters relate to the calculation as described in page 8 of these slides. To be more precise, the choice of k-point grid prescribes the terms of the outer "integral", in fact a summation (over k), and the choice of planewave cutoff e_cut defines the points g used for the inner summation.

As this is ultimately a quadrature approach, I would expect that increasing the fineness of the k-point grid without increasing the planewave cutoff should be "less important", in the sense that no matter how accurate the approximation to the integral of the sum of the planewave terms becomes, it will still be limited in "true" accuracy by the quality of that interior sum. This suggests to me that a sensible approach to obtaining a calculation accurate to some epsilon would be to first converge the planewave cutoff until the delta in total energy falls below that epsilon, and then increase the fineness of the k-point grid, again until the delta falls below the same epsilon. The experiments I've done seem to produce "sensible" results here, but I'm not entirely confident in the method. Does this make sense?

Nevertheless, having done this, I've noticed in particular that the convergence in increasing density of k-point grids seems to be quite dependent on the planewave cutoff. More specifically, calculations with low planewave cutoffs seem to converge more slowly in k-point grid size than do those with high planewave cutoffs. I find this to be a little surprising! Again, my intuition suggests that convergence rates for an increasing-in-fineness sequence of quadrature grids applied to functions (in this case, the sum of the planewave terms) with more-or-less the same smoothness should be very similar.

To illustrate my point, I've attached a dataset run, which generates total energies for bulk silicon for the same increasing sequence of k-point grid sizes (gamma-point centred Monkhorst Pack, points per side 1, 2, 4, 8, 16, 32, 64), taken at two different values of ecut (100 eV and 500 eV). I used the standard JTH PAW pseudopotentials, with double-grid cutoffs taken as twice the planewave cutoff, so 200 eV and 1000 eV respectively. The results fall out as:

Code: Select all

           etotal11   -7.3274128604E+00
           etotal12   -7.8649778973E+00
           etotal13   -7.9518906162E+00
           etotal14   -7.9582974662E+00
           etotal15   -7.9584428547E+00
           etotal16   -7.9584504688E+00
           etotal17   -7.9584557575E+00
           etotal21   -7.3539551058E+00
           etotal22   -7.8809892419E+00
           etotal23   -7.9666352650E+00
           etotal24   -7.9732760505E+00
           etotal25   -7.9734171428E+00
           etotal26   -7.9734174432E+00
           etotal27   -7.9734174422E+00
Note that the set of etotal2* values are converged to about 1.0e-06 Ha for an 8 x 8 x 8 grid, while in the set of etotal1* values, the same grid size is only accurate to about 1.0e-03.

Is anybody able to offer an explanation for this? I have read, for example, the various canonical papers from Monkhorst and Pack and so on, and nothing in them jumps out to me as an immediate reason for this. Perhaps I'm missing something very obvious?

(Please note that I do understand that this calculation is dubious for real-world use, and could be made more efficient through the use of, e.g., different shifts for the k-point grid -- I'm presenting it as a minimal working example.)
Attachments
test.in
(1.07 KiB) Downloaded 6 times

ebousquet
Posts: 434
Joined: Tue Apr 19, 2011 11:13 am
Location: University of Liege, Belgium

Re: Convergence rates for k-point grid over different planewave cutoffs

Post by ebousquet » Wed Aug 12, 2020 1:44 pm

Dear jbde,
I agree with you that the literature concerning convergences of DFT calculations is quite sparse while it is the basics before going to "production". The problem is quite simple (converging the results w.r.t. numerical parameters) but its application can be more subtle than what is often presented.
You can see a related discussion I had in this post:
viewtopic.php?f=8&t=4380&hilit=work+function
The convergence can be done on the total energy but, to me, while it can give an overall idea of the convergence of the calculation, it does not guaranty that the physical property you are looking at is converged or, the other way around, the total energy convergence crieterion can give overconvergence than what is necessary for the physical property of interest (resulting in a waste of CPU time). This is because most of the physical properties depend on energy differences rather than on the total energy itself such that the difference of energy can have a different convergence behavior than the total energy. This is strikingly true in the case where the system of study is close to a phase transition/critical point. So, if you are looking at the structures, do a convergence study on the cell parameters and atomic positions, if you are looking at phonons, do the convergence study on the phonons, etc. Another aspect is that it depends which precision you need on the physical property of interest, for example phonons are OK with a precision of +/- 1 or 2 cm-1 but if you are studying a specific phonon that vary within a few cm-1 (for example reported from the experiments) then you need to go beyond a precision of 1cm-1 to have the right accuracy to compare with the experiments.

So my conclusion is that if you report a physical property, says P, with a given number of digits for a given unit, e.g. P=13.34 a.u., then you have to ensure that your DFT calculations guaranty this level of precision (i.e. up to the number of digit you give), in my example P should be converged at more or less at +/- 0.01 a.u.. It is clear here that if you need P at +/- 0.1 then this will require less ecut and k-points to reach this lower precision but it is fine with what you need. It is also evident that if you do converge study on the total energy only, how can you know about the precision you have on P?

Coming back to your tests, the fact that we have more than one numerical parameter to test complicate a bit the situation. If you start with ecut=100 eV, then you'll have. large noise from the number of plane-waves that is not enough and this can reflects on the convergence test w.r.t. k-points by imposing a larger noise than what you would have with a larger ecut.

Hope this can help clarifying your questions.
Best wishes,
Eric

jbde
Posts: 3
Joined: Mon Jun 22, 2020 6:54 pm

Re: Convergence rates for k-point grid over different planewave cutoffs

Post by jbde » Wed Aug 12, 2020 7:24 pm

Hi Eric,

Thanks for the response! Your points about the necessity of considering accuracy in the final application are well taken. Could you perhaps elaborate on your comment about "phase transition[s]/critical point[s]"? Is it generally the case in such situations that total energy underconverges, or that it overconverges? Or is it just generally "chaos" and convergence/accuracy of any properties in such regimes should be checked on a case-by-case basis? :)

Thanks also for the tip re: the noise inherent in a low E_cut. I had assumed it would be something like that, although I'm still a bit fuzzy as to how exactly. I can certainly imagine cases where quadrature over functions approximated with low-order Fourier expansions produces garbage results, although it's a bit harder for me to imagine them in this particular case... The linear combinations of planewaves don't seem to me to be particularly pathological, even if the smoothness of the underlying wavefunction being approximated seems to be an open topic...

Maybe I should explain my purpose in the questions -- I'm a mathematician by trade rather than a chemist or materials scientist, and I'm investigating the application of (or perhaps better said, trying to find applications of) some extrapolation techniques to/in planewave DFT results. So I don't have particular experimental or theoretical cases that I'm trying to replicate -- instead, I'm generally interested in convergence rates for their own sake, and associated behaviour properties of the various algorithms under the bonnet. Focusing on total energy seems as good a starting point as any.

To rephrase my original question, in a hope of keeping a discussion going -- regardless of the property under consideration, it seems to me still that planewave cutoff and k-point mesh are the two main "knobs" available for controlling convergence/accuracy. I assume that regardless of the property under consideration, that there must be an "optimal" setting in terms of a tradeoff between accuracy and computational cost. Surely it would be useful to find this in a case such as, say, a geometry relaxation, where almost the same calculation must be performed repeatedly, and where the savings in cost could accumulate? Are you aware of any literature discussing the problem?

Thanks once again!

James

ebousquet
Posts: 434
Joined: Tue Apr 19, 2011 11:13 am
Location: University of Liege, Belgium

Re: Convergence rates for k-point grid over different planewave cutoffs

Post by ebousquet » Wed Aug 19, 2020 10:47 am

Hi James,
jbde wrote:
Wed Aug 12, 2020 7:24 pm
Thanks for the response! Your points about the necessity of considering accuracy in the final application are well taken. Could you perhaps elaborate on your comment about "phase transition[s]/critical point[s]"? Is it generally the case in such situations that total energy underconverges, or that it overconverges? Or is it just generally "chaos" and convergence/accuracy of any properties in such regimes should be checked on a case-by-case basis? :)
Regarding the phase transition, if we suppose a simple Landau internal energy expression (and keeping my parameter P as the order parameter of the phase transition) U=aP^2+bP^4, then a phase transition will occur when a change sign. When a-->0 (close to the phase transition) the energy landscape is going to flatten around P=0, such that even large change of P will induce very small change of the energy. In this condition, the precision required through "numerization" is going to be much more demanding than if a>>.
jbde wrote:
Wed Aug 12, 2020 7:24 pm
Thanks also for the tip re: the noise inherent in a low E_cut. I had assumed it would be something like that, although I'm still a bit fuzzy as to how exactly. I can certainly imagine cases where quadrature over functions approximated with low-order Fourier expansions produces garbage results, although it's a bit harder for me to imagine them in this particular case... The linear combinations of planewaves don't seem to me to be particularly pathological, even if the smoothness of the underlying wavefunction being approximated seems to be an open topic...
I agree, it would be interesting to have a closer look at what is happening exactly. It is possible that in the old seminal papers, when people were developing DFT codes, that you have such studies, I'll check if I can find something (or you can ask directly to these people that are still active like Richard Martin or Don Hamann).
jbde wrote:
Wed Aug 12, 2020 7:24 pm
Maybe I should explain my purpose in the questions -- I'm a mathematician by trade rather than a chemist or materials scientist, and I'm investigating the application of (or perhaps better said, trying to find applications of) some extrapolation techniques to/in planewave DFT results. So I don't have particular experimental or theoretical cases that I'm trying to replicate -- instead, I'm generally interested in convergence rates for their own sake, and associated behaviour properties of the various algorithms under the bonnet. Focusing on total energy seems as good a starting point as any.
OK, great project :)! Hence, you can indeed focus on the total energy and chose a residual reference like 1 meV as good reference as most of the convergence tuto are using.
jbde wrote:
Wed Aug 12, 2020 7:24 pm
To rephrase my original question, in a hope of keeping a discussion going -- regardless of the property under consideration, it seems to me still that planewave cutoff and k-point mesh are the two main "knobs" available for controlling convergence/accuracy. I assume that regardless of the property under consideration, that there must be an "optimal" setting in terms of a tradeoff between accuracy and computational cost. Surely it would be useful to find this in a case such as, say, a geometry relaxation, where almost the same calculation must be performed repeatedly, and where the savings in cost could accumulate? Are you aware of any literature discussing the problem?
I agree that ecut and k-points are the two core convergence parameters of DFT. If you want to find an optimal setting that is as general as possible, I think you'll have hard time to have it fully general but I might be wrong. I guess you probably can in crystals with large band gaps on one side or metallic crystals on another side but you can always find special cases that will not follow the rules, as the ones I mentioned. Regarding literature, I think you probably can find something on the highthroughput publications where they have faced convergence problems at some points.

Hope this can help,
Eric

Post Reply