[CP2K-user] [CP2K:14265] Re: Hybrid functional calculation problem

Matt W mattwa... at gmail.com
Sun Nov 22 16:55:18 UTC 2020


Your input has

        &MEMORY
          MAX_MEMORY           4000
          EPS_STORAGE_SCALING  0.1
        &END MEMORY

This means that each MPI task (which can be multiple cores) should be able
to allocate 4000 MiB of memory _exclusively_ for the two-electron integrals.
If less than that is available, the run will crash because the memory
allocation can't occur. I guess your main cluster has less memory per node
than the smaller one. You need to leave space for the operating system and
the rest of the cp2k run besides the two-electron integrals.
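
For example, a more conservative setting (the number is only illustrative,
adjust it to what your nodes actually have free per MPI task) would be

        &MEMORY
          MAX_MEMORY           2400
          EPS_STORAGE_SCALING  0.1
        &END MEMORY

As far as I know MAX_MEMORY is given in MiB per MPI process; whatever does
not fit into that buffer is written to disk or recomputed on the fly instead.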

There is another thread from earlier this year where Juerg answers HFX memory
questions in more detail.

Matt

On Sunday, November 22, 2020 at 4:42:47 PM UTC fa... at gmail.com wrote:

> Can cp2k access all the memory on the cluster? On Linux you can use 
> ulimit -s unlimited
> to remove any limit on the amount of memory a process can use.
>
> I usually use SCREEN_ON_INITIAL_P. I found that for large systems it is
> faster to run two energy minimizations with the keyword enabled (such that
> the second restarts from a converged PBE0 wfn) than to run a single
> minimization without SCREEN_ON_INITIAL_P. But that probably depends on the
> system you simulate.
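>
> As a rough sketch (keyword names as in the CP2K manual, the EPS_SCHWARZ
> value is only an example), the screening part of the &HF section would be
>
>         &SCREENING
>           EPS_SCHWARZ          1.0E-6
>           SCREEN_ON_INITIAL_P  TRUE
>         &END SCREENING
>
> combined with SCF_GUESS RESTART in &SCF and WFN_RESTART_FILE_NAME pointing
> to the wfn file of the first, converged run.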
>
> You should converge the cutoff with respect to the properties that you are
> interested in: run a test system with increasing cutoff and look at, e.g.,
> the energy, pdos, etc.
>
> Number of sph. ERI's calculated on the fly:        4763533420139 
> This number should always be 0. If it is larger, increase the memory cp2k 
> has available.
>
> Fabian
> On Sunday, 22 November 2020 at 17:24:13 UTC+1 Lucas Lodeiro wrote:
>
>> Dear Fabian,
>>
>> Thanks for your advice. I forgot to clarify the execution time... my
>> mistake.
>> The calculation runs for 5 or 7 minutes and then stops... the walltime for
>> the calculation was set to 72 hrs, so I do not believe this is the
>> problem. Now I am running the same input on a smaller cluster (different
>> from the problematic one) with 64 processors and 250 GB RAM, and the
>> calculation works fine (very slow, 9 hr per SCF step, but it runs... the
>> total RAM assigned for the ERI's is not sufficient, but the problem does
>> not appear)...
>> It is not practical to use this small cluster, so I need to fix the
>> problem on the big one, to use more RAM and more processors (more than 220
>> is possible), but as the program does not show what is happening, I cannot
>> tell the cluster admin anything to recompile or fix the problem.
>> :(
>>
>> This is the output on the small cluster:
>>
>>   Step     Update method      Time    Convergence         Total energy    Change
>>  ------------------------------------------------------------------------------
>>
>>   HFX_MEM_INFO| Est. max. program size before HFX [MiB]:                   1371
>>
>>  *** WARNING in hfx_energy_potential.F:605 :: The Kohn Sham matrix is not  ***
>>  *** 100% occupied. This may result in incorrect Hartree-Fock results. Try ***
>>  *** to decrease EPS_PGF_ORB and EPS_FILTER_MATRIX in the QS section. For  ***
>>  *** more information see FAQ: https://www.cp2k.org/faq:hfx_eps_warning    ***
>>
>>   HFX_MEM_INFO| Number of cart. primitive ERI's calculated:      27043173676632
>>   HFX_MEM_INFO| Number of sph. ERI's calculated:                  4879985997918
>>   HFX_MEM_INFO| Number of sph. ERI's stored in-core:               116452577779
>>   HFX_MEM_INFO| Number of sph. ERI's stored on disk:                          0
>>   HFX_MEM_INFO| Number of sph. ERI's calculated on the fly:       4763533420139
>>   HFX_MEM_INFO| Total memory consumption ERI's RAM [MiB]:                 143042
>>   HFX_MEM_INFO| Whereof max-vals [MiB]:                                     1380
>>   HFX_MEM_INFO| Total compression factor ERI's RAM:                         6.21
>>   HFX_MEM_INFO| Total memory consumption ERI's disk [MiB]:                     0
>>   HFX_MEM_INFO| Total compression factor ERI's disk:                        0.00
>>   HFX_MEM_INFO| Size of density/Fock matrix [MiB]:                           266
>>   HFX_MEM_INFO| Size of buffers [MiB]:                                        98
>>   HFX_MEM_INFO| Number of periodic image cells considered:                     5
>>   HFX_MEM_INFO| Est. max. program size after HFX  [MiB]:                    3834
>>
>>      1 NoMix/Diag. 0.40E+00 ******     5.46488333    -20625.2826573514 -2.06E+04
>>
>> About SCREEN_ON_INITIAL_P, I read that to use it you need a very good
>> guess (better than the converged GGA one), for example the last step or
>> frame of a GEO_OPT or MD... Is it really useful when the guess is the GGA
>> wavefunction?
>> About CUTOFF_RADIUS, I read that 6 or 7 is a good compromise, and as my
>> cell is approximately twice that, I used the minimum image convention to
>> decide on the 8.62 value, which is close to the recommended one (6 or 7).
>> If I reduce it, does the computational cost diminish considerably?
>>
>> Regards - Lucas
>>
>> On Sun, 22 Nov 2020 at 12:53, fa... at gmail.com (<fa... at gmail.com>)
>> wrote:
>>
>>> Dear Lucas,
>>>
>>> cp2k computes the four-center integrals during (or prior to) the first
>>> SCF cycle. I assume the job ran out of time during this task. For a
>>> system with more than 1000 atoms this takes a lot of time; with only 220
>>> CPUs it could in fact take several hours.
>>>
>>> To speed up the calculation you should use SCREEN_ON_INITIAL_P T and
>>> restart from a well converged PBE wfn. Other than that, there is little
>>> you can do besides assigning the job more time and/or CPUs. (Of course,
>>> reducing CUTOFF_RADIUS 8.62 would help too, but it could negatively
>>> affect the result.)
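>>>
>>> For reference, the truncation radius is set in the interaction potential
>>> block of &HF; a sketch (the radius here is only illustrative, t_c_g.dat
>>> is the data file shipped with cp2k):
>>>
>>>         &INTERACTION_POTENTIAL
>>>           POTENTIAL_TYPE  TRUNCATED
>>>           CUTOFF_RADIUS   6.0
>>>           T_C_G_DATA      t_c_g.dat
>>>         &END INTERACTION_POTENTIAL
>>>
>>> The number of integrals grows quickly with the radius, so reducing it
>>> from 8.62 saves time at the price of a less accurate long-range exchange.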
>>>
>>> Cheers,
>>> Fabian
>>>
>>> On Sunday, 22 November 2020 at 01:21:05 UTC+1 Lucas Lodeiro wrote:
>>>
>>>> Hi all, 
>>>> I need to perform a hybrid calculation with CP2K 7.1 on a big system
>>>> (+1000 atoms). I studied the manual, the tutorials and some videos by
>>>> CP2K developers to improve my input. But the program aborts the
>>>> calculation when the HF part is running... I watched the memory usage on
>>>> the fly, and there is no peak that explains the failure (I used 4000 MB
>>>> with 220 processors).
>>>> The output does not show any explanation... Suspecting memory, I tried a
>>>> large-memory node on our cluster, using 15000 MB with 220 processors,
>>>> but the program exits at the same point without a message, just killing
>>>> the process.
>>>> The output shows a warning:
>>>>
>>>>  *** WARNING in hfx_energy_potential.F:591 :: The Kohn Sham matrix is not  ***
>>>>  *** 100% occupied. This may result in incorrect Hartree-Fock results. Try ***
>>>>  *** to decrease EPS_PGF_ORB and EPS_FILTER_MATRIX in the QS section. For  ***
>>>>  *** more information see FAQ: https://www.cp2k.org/faq:hfx_eps_warning    ***
>>>>
>>>> but I read that this is not a very serious issue and that the
>>>> calculation should continue rather than crash.
>>>> I also decreased EPS_PGF_ORB, but the warning and the problem
>>>> persist.
>>>>
>>>> I do not know if the problem could be located in other parts of my
>>>> input... for example, I use PBE0-T_C-LR (with PBC in XY) and ADMM. In
>>>> the ADMM options, I use ADMM_PURIFICATION_METHOD = NONE, because I read
>>>> that ADMM1 is the only variant useful for smearing calculations.
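>>>>
>>>> For reference, the ADMM block with this purification setting is roughly
>>>> (just a sketch, keyword names as in the CP2K manual; the auxiliary basis
>>>> itself is given per element with AUX_FIT_BASIS_SET in &KIND):
>>>>
>>>>         &AUXILIARY_DENSITY_MATRIX_METHOD
>>>>           METHOD                    BASIS_PROJECTION
>>>>           ADMM_PURIFICATION_METHOD  NONE
>>>>         &END AUXILIARY_DENSITY_MATRIX_METHOD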
>>>>
>>>> I ran this system with PBE (as the first guess for PBE0), and there is
>>>> no problem in that case.
>>>> Moreover, I tried other CP2K versions (7.0, 6.1 and 5.1) compiled on the
>>>> cluster with libint_max_am=6, and the calculation crashes too, but it
>>>> shows this problem:
>>>>
>>>>
>>>>  *******************************************************************************
>>>>  *   ___                                                                       *
>>>>  *  /   \                                                                      *
>>>>  * [ABORT]                                                                     *
>>>>  *  \___/       CP2K and libint were compiled with different LIBINT_MAX_AM.    *
>>>>  *    |                                                                        *
>>>>  *  O/|                                                                        *
>>>>  * /| |                                                                        *
>>>>  * / \                                                hfx_libint_wrapper.F:134 *
>>>>  *******************************************************************************
>>>>
>>>>
>>>>  ===== Routine Calling Stack ===== 
>>>>
>>>>             2 hfx_create
>>>>             1 CP2K
>>>>
>>>> It seems this problem is not present in version 7.1, as the program does
>>>> not show it, and the compilation information does not show the
>>>> LIBINT_MAX_AM value...
>>>>
>>>> If somebody could give me some advice, I would appreciate it. :)
>>>> I attach the input file, and the output file for 7.1 version.
>>>>
>>>> Regards - Lucas Lodeiro
>>>>

