[CP2K-user] [CP2K:20727] Re: out of memory: C3N4 (56 atoms) HSE06 + RI-HFX + ADMM

yj jiang rudin.jiang at gmail.com
Mon Sep 23 07:37:22 UTC 2024


Dear Bussy, Stein, and other CP2K experts,

With your guidance, I have been able to run this job on a 512GB HPC. 
However, I still encounter some issues.

After running the calculations several times, I keep running into memory 
problems right after the first SCF iteration ends (upon diagonalizing the 
first Fock matrix). Could you please advise on where the memory consumption 
lies and how I can optimize it to reduce memory usage?

Moreover, could you please check if there are any other precision-related 
issues in this input file? I am particularly concerned about the settings 
in the INTERACTION_POTENTIAL section.

Thank you very much for your assistance!

Best regards,
Jiang

On Sunday, 22 September 2024 at 17:13:50 UTC+8, yj jiang wrote:

> Dear Bussy,
>
> Your suggestions have been extremely helpful. I am now able to run this 
> system on a 512GB HPC.
> The running speed is acceptable to me, considering the large amount of 
> computation involved. 
> Thank you very much for your help.
>
> Best wishes.
> Jiang
> On Thursday, 19 September 2024 at 21:33:29 UTC+8, Augustin Bussy wrote:
>
>> Hi Jiang,
>> I was able to run your input file on a 512G system, but I'd like to add a 
>> few comments:
>> 1) I somehow missed the fact that you are running RI-HFX with K-points. 
>> In such a case, the choice of the RI metric is not crucial for performance 
>> (or memory usage), and the best course of action is to keep the default.
>> 2) What really matters is the number of atoms in the unit cell, the 
>> diffuseness of the basis set, and the range of the HFX potential. There is 
>> not much you can do here, especially since you already use ADMM. The range 
>> of the HFX potential really matters, because this influences how many 
>> periodic images of your system need to be considered. Typically, this has 
>> a limited impact for 2D systems, because there should only be images along 
>> the X and Y axes. However, if the extent of the cell in the Z direction is 
>> too short, periodic images along Z get involved as well. *I found that 
>> increasing the Z cell parameter from 20 Ang to 30 Ang speeds up the 
>> calculation by a factor of 2*, because fewer images are needed.
>> 3) Unless you reach a very high number of K-points (~1000), the size of 
>> the K-point grid has practically no impact on cost, and only a little on 
>> memory. You might as well use a denser grid than 3x3x1.
>> 4) I am not sure if your system is a metal. If it is not, you should 
>> avoid using Broyden mixing and smearing, as they slow down SCF convergence.
>> 5) While you can run this calculation on a single HPC node with 512G, 
>> this will be extremely slow. I highly recommend using more resources, and 
>> increasing the KP_NGROUPS keyword in the HF%RI section. This is not a 
>> small calculation you are attempting.
>>
>> I invite you to read the paper written for the method: 
>> https://doi.org/10.1063/5.0189659. It might help you understand some 
>> of the points above.
>> I hope that helps. Best,
>> Augustin
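Point 2 above can be illustrated with a small geometric sketch. It is not how CP2K selects periodic images (that depends on the basis set diffuseness and on screening thresholds); it only assumes an orthorhombic cell and counts the lattice translations whose length along each axis does not exceed a given potential range. The in-plane cell lengths (14 Ang) and the range (25 Ang) are hypothetical placeholders; only the 20 Ang and 30 Ang Z values come from the message.

```python
import math


def max_shift(cell_length, potential_range):
    """Largest |n| such that a translation n * cell_length stays within range."""
    return math.floor(potential_range / cell_length)


def count_cells(cell, potential_range):
    """Count translated copies of the cell (home cell included) within range.

    Crude model: along each axis, keep every integer shift n with
    |n| * cell_length <= potential_range, then multiply the per-axis counts.
    """
    total = 1
    for length in cell:
        total *= 2 * max_shift(length, potential_range) + 1
    return total


# Hypothetical in-plane lengths and range; Z values taken from the message.
RANGE_ANG = 25.0
for z_length in (20.0, 30.0):
    cells = count_cells((14.0, 14.0, z_length), RANGE_ANG)
    print(f"Z = {z_length:4.1f} Ang -> {cells} cells within range")
```

In this toy model, lengthening the cell along Z beyond the assumed range removes the image layers above and below the slab, so the count drops from 27 to 9. The real saving depends on the actual range of the HFX potential and the basis set.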
>> On Wednesday 18 September 2024 at 16:54:17 UTC+2 yj jiang wrote:
>>
>>> Thank you all very much for your help. A few days ago, I thought I had 
>>> uploaded the output file here, but it seems that due to the large file 
>>> size, the upload may not have been successful. I apologize for this.
>>>
>>> Yesterday, I seem to have found the reason. The job system of the 
>>> supercomputer I am using imposes a memory allocation limit on each process, 
>>> and this issue is not exclusive to CP2K jobs. I am now planning to set the 
>>> maximum memory usage for CP2K to see if it resolves the problem.
>>>
>>> Thank you again for your help!
>>>
>>> Jiang
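A per-process limit like the one described above can sometimes be inspected from inside a job. The sketch below assumes a Unix system and assumes the scheduler enforces the limit through the process resource limits (rlimits); many schedulers use cgroups instead, in which case this would report "unlimited" even though a cap exists. The helper name `format_limit` is only illustrative.

```python
import resource


def format_limit(value):
    """Render an rlimit value in GiB, or 'unlimited' if no limit is set."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 1024**3:.1f} GiB"


# RLIMIT_AS is the maximum size of the process address space, in bytes.
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
print("address-space limit (soft):", format_limit(soft))
print("address-space limit (hard):", format_limit(hard))
```

Running this inside the same batch script that launches CP2K shows the limits the job actually inherits, which may differ from those of an interactive login shell.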
>>>
>>> On Tuesday, 17 September 2024 at 15:33:33 UTC+8, Augustin Bussy wrote:
>>>
>>> Hi,
>>> everything that Frederick mentioned is very relevant. Additionally, I'd 
>>> like to add that the method was developed with small unit cells in mind. 
>>> Your cell is quite large, and this might cause memory issues, although 
>>> having a 2D material should help. I'll see if I am able to run it.
>>> Best,
>>> Augustin
>>>
>>> On Monday 16 September 2024 at 08:18:57 UTC+2 Frederick Stein wrote:
>>>
>>> Due to my lack of experience, I can only guess based on my knowledge of RI 
>>> methods, the manual, and what I was told by the developer. Please also 
>>> consider the respective manual page: 
>>> https://manual.cp2k.org/trunk/CP2K_INPUT/FORCE_EVAL/DFT/XC/HF/RI.html#CP2K_INPUT.FORCE_EVAL.DFT.XC.HF.RI
>>> - Change RI_METRIC to TRUNCATED and set CUTOFF_RADIUS in the &RI section 
>>> to a value between 1.0 and 2.0. Lower values improve sparsity, larger 
>>> values increase accuracy. The short-range Coulomb metric has a rather large 
>>> range. If you would like to stick to it, set the OMEGA parameter of the &RI 
>>> section to a value larger than 0.11 (let's say 1-10).
>>> - Just for testing, try a single k-point.
>>> - Try other values of MEMORY_CUT in &RI. Larger values (try 10) should 
>>> decrease the block sizes. I do not really know how it works but it does 
>>> have an effect on memory usage.
>>> - Increase EPS_FILTER in &RI to a larger value (double-check later 
>>> whether you have to decrease it again).
>>> If none of my suggestions work and the developer of the feature does 
>>> not chime in, it may be that you do not have enough memory.
>>> Best,
>>> Frederick
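The settings suggested above can be collected into an &RI block. The sketch below is a small text generator with a hypothetical helper name (`render_section`), not a CP2K tool. The keyword names are those mentioned in this thread; the CUTOFF_RADIUS value is just one choice inside the suggested 1.0-2.0 interval, and the EPS_FILTER value is a placeholder, since the thread gives no number. Note that the developer's later reply in this thread says the RI metric is not crucial when K-points are used and recommends keeping the default.

```python
def render_section(name, keywords, indent=0):
    """Render a CP2K-style input section as text: &NAME ... &END NAME."""
    pad = " " * indent
    lines = [f"{pad}&{name}"]
    for key, value in keywords.items():
        lines.append(f"{pad}  {key} {value}")
    lines.append(f"{pad}&END {name}")
    return "\n".join(lines)


# Keyword names from the thread; the numeric values are illustrative only.
ri_keywords = {
    "RI_METRIC": "TRUNCATED",
    "CUTOFF_RADIUS": 1.5,    # any value in the suggested 1.0-2.0 interval
    "MEMORY_CUT": 10,        # the value suggested for a first attempt
    "EPS_FILTER": "1.0E-8",  # placeholder; not a value given in the thread
}

print(render_section("RI", ri_keywords))
```

Generating the block from a dictionary makes it easy to scan several MEMORY_CUT or EPS_FILTER values in separate test inputs and compare their memory usage.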
>>>
>>>
>>> On Sunday, 15 September 2024 at 18:42:31 UTC+2, Frederick Stein wrote:
>>>
>>> Hi,
>>> I am not an expert with RI-HFX but I would try to run potentially 
>>> cheaper settings first (single k-point, smaller basis sets, larger value of 
>>> omega, identity metric). It may also help to have a CP2K output file, or 
>>> the supplementary input/output files if you submit jobs with a scheduler 
>>> like Slurm. If there is no function stack in the output file(s) given 
>>> (either by the compiler or by CP2K), you could also add the keywords TRACE 
>>> and TRACE_MASTER to your GLOBAL section.
>>> Best,
>>> Frederick
>>>
>>> On Sunday, 15 September 2024 at 18:13:42 UTC+2, yj jiang wrote:
>>>
>>> Hi, I'm calculating the band structure for C3N4 with a Ni atom (attachment). 
>>> I want to use HSE06 + RI-HFX + ADMM. There is an "out of memory" error. How 
>>> can I address this? Thanks a lot.
>>>
>>> I can access an HPC with a maximum memory of 512G.
>>>
>>>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: c3n4-ni.zip
Type: application/x-zip
Size: 5926 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20240923/f1998a98/attachment-0001.bin>