[CP2K-user] [CP2K:14273] Re: Hybrid functional calculation problem

Lucas Lodeiro eluni... at gmail.com
Tue Nov 24 05:52:23 UTC 2020


Thanks for your advice!

Now I can at least run it. It is very slow, but it runs. The difference
between the little and the big cluster was that, on the little one, the
total RAM consumption is practically MPI_PROCESSES * (baseline + MAX_MEMORY
+ 2 full matrices), as Prof. Hutter explains, while on the big one there
are some cluster system processes which consume 5 or 10% of each node...
so I had to optimize MAX_MEMORY by running some tests...
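The per-node accounting above can be sketched numerically (a rough model; the baseline, matrix size, and 10% system overhead below are illustrative placeholders, not values from this run):

```python
# Rough per-node RAM model for a CP2K HFX run. Each MPI rank holds a
# baseline program image, its MAX_MEMORY ERI buffer, and ~2 full
# matrices; cluster system processes add an extra 5-10% on some nodes.
def node_ram_needed(n_mpi, baseline_gb, max_memory_gb, matrix_gb,
                    system_overhead_frac=0.10):
    per_rank = baseline_gb + max_memory_gb + 2 * matrix_gb
    return n_mpi * per_rank * (1.0 + system_overhead_frac)

# Hypothetical numbers: 44 ranks, 1.4 GB baseline, 3.6 GB MAX_MEMORY,
# 0.26 GB per density/Fock matrix.
print(round(node_ram_needed(44, 1.4, 3.6, 0.26), 1))
```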

About the ERIs, it is difficult to get 7 TB for them... I can get 4 TB
without problems, but reserving the whole cluster partition is difficult.
I am trying the SCREENING options to speed it up, computing some ERIs on
the fly.
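For reference, the screening settings live under FORCE_EVAL/DFT/XC/HF, next to the &MEMORY section quoted later in this thread; a minimal sketch (the EPS_SCHWARZ value is only an illustrative choice, not a recommendation):

```
&HF
  &SCREENING
    ! Schwarz screening threshold; looser values skip more ERIs
    EPS_SCHWARZ          1.0E-6
    ! Screen using the initial density matrix (needs a good guess)
    SCREEN_ON_INITIAL_P  TRUE
  &END SCREENING
&END HF
```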

Regards - Lucas Lodeiro


El lun, 23 nov 2020 a las 14:18, fa... at gmail.com (<fabia... at gmail.com>)
escribió:

> Your graph nicely shows that cp2k runs out of memory. As Matt wrote, you
> have to decrease MAX_MEMORY to leave enough memory for the rest of the
> program. Here are some details on memory consumption with HF:
> https://groups.google.com/g/cp2k/c/DZDVTIORyVY/m/OGjJDJuqBwAJ
>
> Of course you can recalculate some of the ERIs in each SCF cycle. But
> that slows down the minimization by a lot, so I'd advise against it. Try
> to use screening, set a proper value for MAX_MEMORY, and use all the
> resources you have to store the ERIs.
>
> Fabian
> On Sunday, 22 November 2020 at 23:08:17 UTC+1 Lucas Lodeiro wrote:
>
>> Hi Fabian and Matt,
>>
>> About access to the memory: I ran calculations for months using 90% of
>> the node RAM without problems. But to check, I set ulimit -s unlimited.
>> There is a change: before using ulimit, the calculation crashed while
>> RAM usage was still low (15%); after using ulimit, the calculation still
>> crashes, but RAM usage shows a sustained rise up to the limit before the
>> crash. I attach an image.
>>
>> About SCREEN_ON_INITIAL_P, I will use it on the little cluster. I like
>> the idea of running 2 calculations as climbing steps.
>>
>> I know that the number of ERIs calculated on the fly should be 0, and
>> that if it is different from zero I need more RAM to store them so they
>> are not recalculated at each SCF step. But in the case of the little
>> cluster I am already using all processor and RAM resources. By the way,
>> the calculation runs without problems when the ERIs are calculated on
>> the fly at each SCF step; it is just very slow.
>>
>> About what Matt comments: on the little cluster I have a single node
>> with 250 GB RAM. There I use MAX_MEMORY = 2600, which is a total of
>> 166.4 GB for the ERIs (the output reports 143 GB), and the rest for the
>> whole program.
>> In the case of the big cluster, we have access to many nodes with 44 proc
>> and 192GB RAM, and 9 nodes with 44 proc and 768GB RAM. In the first case, I
>> use 5 nodes (220 proc) using all memory (960GB), setting MAX_MEMORY = 4000
>> (4.0 GB * 220 proc = 880 GB RAM for ERIs). In the second case, I use 5
>> nodes (220 proc) using all memory (3840GB), setting MAX_MEMORY = 15000
>> (15.0 GB * 220 proc = 3300 GB RAM for ERIs).
>> In both cases the calculation crashes... I do not know if I am being
>> naive, but 3.3 TB of RAM seems, at least, enough to store most of the
>> ERIs...
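As a sanity check on those settings (numbers from this thread; OS and CP2K baseline memory are not subtracted), the per-node headroom left after reserving MAX_MEMORY on every rank can be computed:

```python
# RAM left per node once every MPI rank has reserved MAX_MEMORY
# (given in MB) exclusively for ERIs.
def eri_headroom_gb(node_ram_gb, ranks_per_node, max_memory_mb):
    return node_ram_gb - ranks_per_node * max_memory_mb / 1000.0

print(eri_headroom_gb(192, 44, 4000))   # 192 GB nodes, MAX_MEMORY 4000
print(eri_headroom_gb(768, 44, 15000))  # 768 GB nodes, MAX_MEMORY 15000
```

On the 192 GB nodes only 16 GB per node remain for the OS, the CP2K baseline, and the matrices, which is consistent with an out-of-memory crash there.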
>>
>> Using the data informed in the output of the little cluster:
>>   HFX_MEM_INFO| Number of sph. ERI's calculated:            4879985997918
>>   HFX_MEM_INFO| Number of sph. ERI's stored in-core:         116452577779
>>   HFX_MEM_INFO| Number of sph. ERI's stored on disk:                    0
>>   HFX_MEM_INFO| Number of sph. ERI's calculated on the fly: 4763533420139
>>
>> The stored ERIs are about 1/42 of the total and use 166.4 GB (143 GB
>> reported)... So if I want to store all of them I need 166.4 GB * 42 =
>> ~7.0 TB... Is that correct?
>> I could get 7.0 TB of RAM using 9 nodes with 768 GB RAM each. But I am
>> not convinced that the amount of RAM is the problem, because on the
>> little cluster the run survives while calculating almost all ERIs at
>> each SCF step...
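That extrapolation can be checked directly from the HFX_MEM_INFO numbers (the compression factor may differ once everything fits in core, so this is only a rough estimate):

```python
# Scale the in-core ERI footprint up to full storage.
calculated = 4_879_985_997_918   # sph. ERIs calculated
stored     =   116_452_577_779   # sph. ERIs stored in-core
ram_mib    =         143_042     # reported RAM for stored ERIs [MiB]

ratio = calculated / stored                    # ~42: only 1/42 stored
full_storage_tib = ram_mib * ratio / (1024 ** 2)
print(round(ratio, 1), round(full_storage_tib, 2))
```

Using the reported 143042 MiB rather than the nominal 166.4 GB, full in-core storage comes out near 5.7 TiB, somewhat below the ~7 TB estimate.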
>>
>> I am a little surprised that the calculation runs on the little cluster
>> but not on the big one.
>> Do you have a guess at some other related problem?
>>
>> Regards - Lucas
>>
>>
>>
>> El dom, 22 nov 2020 a las 13:55, Matt W (<mat... at gmail.com>) escribió:
>>
>>> Your input has
>>>
>>>         &MEMORY
>>>           MAX_MEMORY           4000
>>>           EPS_STORAGE_SCALING  0.1
>>>         &END MEMORY
>>>
>>> This means that each MPI task (which can be multiple cores) should be
>>> able to allocate 4 GiB of memory _exclusively_ for the two-electron
>>> integrals. If less than that is available, the run crashes because the
>>> memory allocation cannot occur. I guess your main cluster has less
>>> memory than the smaller one. You need to leave space for the operating
>>> system and the rest of the cp2k run besides the two-electron integrals.
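One way to pick MAX_MEMORY consistent with this constraint (a sketch; the 30 GB per-node reserve for the OS and the non-HFX parts of cp2k is a guess, not a CP2K rule):

```python
# MAX_MEMORY (in MB per MPI task): reserve a fixed slice of node RAM
# for the OS and the rest of cp2k, then split the remainder evenly
# among the tasks on the node.
def pick_max_memory(node_ram_gb, tasks_per_node, reserve_gb=30):
    return int((node_ram_gb - reserve_gb) * 1000 / tasks_per_node)

print(pick_max_memory(192, 44))  # regular nodes of the big cluster
print(pick_max_memory(768, 44))  # large-memory nodes
```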
>>>
>>> There is another thread from earlier this year where Juerg answers HFX
>>> memory questions in more detail.
>>>
>>> Matt
>>>
>>> On Sunday, November 22, 2020 at 4:42:47 PM UTC fa... at gmail.com wrote:
>>>
>>>> Can cp2k access all the memory on the cluster? On Linux you can use
>>>> ulimit -s unlimited
>>>> to remove the limit on the stack size available to a process.
>>>>
>>>> I usually use SCREEN_ON_INITIAL_P. I found that for large systems it
>>>> is faster to run two energy minimizations with the keyword enabled
>>>> (such that the second restarts from a converged PBE0 wfn) than to run
>>>> a single minimization without SCREEN_ON_INITIAL_P. But that probably
>>>> depends on the system you simulate.
>>>>
>>>> You should converge the cutoff with respect to the properties that you
>>>> are interested in. Run a test system with increasing cutoff and look at,
>>>> e.g. the energy, pdos, etc.
>>>>
>>>> Number of sph. ERI's calculated on the fly:        4763533420139
>>>> This number should always be 0. If it is larger, increase the memory
>>>> cp2k has available.
>>>>
>>>> Fabian
>>>> On Sunday, 22 November 2020 at 17:24:13 UTC+1 Lucas Lodeiro wrote:
>>>>
>>>>> Dear Fabian,
>>>>>
>>>>> Thanks for your advice. I forgot to clarify the execution time... my
>>>>> mistake.
>>>>> The calculation runs for 5 or 7 minutes and stops... the walltime was
>>>>> set to 72 hrs, so I do not believe this is the problem. Now I am
>>>>> running the same input on a smaller cluster (different from the
>>>>> problematic one) with 64 proc and 250 GB RAM, and the calculation
>>>>> works fine (very slow, 9 hr per SCF step, but it runs... the total
>>>>> RAM assigned for the ERIs is not sufficient, but the problem does not
>>>>> appear)...
>>>>> It is not practical to use this little cluster, so I need to fix the
>>>>> problem on the big one to use more RAM and more processors (more than
>>>>> 220 is possible), but as the program does not show what is happening,
>>>>> I cannot tell the cluster admin anything to recompile or fix. :(
>>>>>
>>>>> This is the output in the little cluster:
>>>>>
>>>>>   Step     Update method      Time    Convergence         Total energy    Change
>>>>> ------------------------------------------------------------------------------
>>>>>
>>>>>   HFX_MEM_INFO| Est. max. program size before HFX [MiB]:             1371
>>>>>
>>>>>  *** WARNING in hfx_energy_potential.F:605 :: The Kohn Sham matrix is not  ***
>>>>>  *** 100% occupied. This may result in incorrect Hartree-Fock results. Try ***
>>>>>  *** to decrease EPS_PGF_ORB and EPS_FILTER_MATRIX in the QS section. For  ***
>>>>>  *** more information see FAQ: https://www.cp2k.org/faq:hfx_eps_warning    ***
>>>>>
>>>>>   HFX_MEM_INFO| Number of cart. primitive ERI's calculated:  27043173676632
>>>>>   HFX_MEM_INFO| Number of sph. ERI's calculated:              4879985997918
>>>>>   HFX_MEM_INFO| Number of sph. ERI's stored in-core:           116452577779
>>>>>   HFX_MEM_INFO| Number of sph. ERI's stored on disk:                      0
>>>>>   HFX_MEM_INFO| Number of sph. ERI's calculated on the fly:   4763533420139
>>>>>   HFX_MEM_INFO| Total memory consumption ERI's RAM [MiB]:            143042
>>>>>   HFX_MEM_INFO| Whereof max-vals [MiB]:                                1380
>>>>>   HFX_MEM_INFO| Total compression factor ERI's RAM:                    6.21
>>>>>   HFX_MEM_INFO| Total memory consumption ERI's disk [MiB]:                0
>>>>>   HFX_MEM_INFO| Total compression factor ERI's disk:                   0.00
>>>>>   HFX_MEM_INFO| Size of density/Fock matrix [MiB]:                      266
>>>>>   HFX_MEM_INFO| Size of buffers [MiB]:                                   98
>>>>>   HFX_MEM_INFO| Number of periodic image cells considered:                5
>>>>>   HFX_MEM_INFO| Est. max. program size after HFX  [MiB]:               3834
>>>>>
>>>>>      1 NoMix/Diag. 0.40E+00 ******     5.46488333    -20625.2826573514 -2.06E+04
>>>>>
>>>>> About SCREEN_ON_INITIAL_P, I read that to use it you need a very
>>>>> good guess (better than the converged GGA one), for example the last
>>>>> step or frame from a GEO_OPT or MD... Is it really useful when the
>>>>> guess is the GGA wavefunction?
>>>>> About CUTOFF_RADIUS, I read that 6 or 7 is a good compromise, and as
>>>>> my cell is approximately twice that, I used the minimal image
>>>>> convention to decide on 8.62, which is near the recommended value (6
>>>>> or 7). If I reduce it, does the computational cost diminish
>>>>> considerably?
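The minimal-image bound and the cost effect of reducing the radius can be sketched as follows (the cell lengths are illustrative, and scaling the screened-pair work with the cutoff-sphere volume is only a rough rule of thumb):

```python
# Truncated-Coulomb cutoff under the minimal image convention, and a
# crude cost ratio assuming work scales with the cutoff sphere volume.
def max_cutoff(cell_lengths):
    # CUTOFF_RADIUS must not exceed half the shortest cell edge
    return min(cell_lengths) / 2.0

def cost_ratio(r_new, r_old):
    return (r_new / r_old) ** 3

print(max_cutoff([17.24, 17.24, 30.0]))
print(round(cost_ratio(6.0, 8.62), 2))
```

Under this volume scaling, going from 8.62 to 6.0 would cut the screened pair work to roughly a third, at the risk of changing the result.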
>>>>>
>>>>> Regards - Lucas
>>>>>
>>>>> El dom, 22 nov 2020 a las 12:53, fa... at gmail.com (<
>>>>> fa... at gmail.com>) escribió:
>>>>>
>>>>>> Dear Lucas,
>>>>>>
>>>>>> cp2k computes the four-center integrals during (or prior to) the
>>>>>> first SCF cycle. I assume the job ran out of time during this task.
>>>>>> For a system with more than 1000 atoms this takes a lot of time;
>>>>>> with only 220 CPUs it could in fact take several hours.
>>>>>>
>>>>>> To speed up the calculation you should use SCREEN_ON_INITIAL_P T and
>>>>>> restart from a well converged PBE wfn. Other than that, there is
>>>>>> little you can do besides assigning the job more time and/or CPUs.
>>>>>> (Of course, reducing CUTOFF_RADIUS 8.62 would help too, but could
>>>>>> negatively affect the result.)
>>>>>>
>>>>>> Cheers,
>>>>>> Fabian
>>>>>>
>>>>>> On Sunday, 22 November 2020 at 01:21:05 UTC+1 Lucas Lodeiro wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>> I need to perform a hybrid calculation with CP2K 7.1 on a big
>>>>>>> system (1000+ atoms). I studied the manual, the tutorials, and some
>>>>>>> videos by CP2K developers to improve my input. But the program
>>>>>>> exits the calculation while the HF part is running... I watch the
>>>>>>> memory usage on the fly, and there is no peak which would explain
>>>>>>> the failure (I used 4000 MB with 220 processors).
>>>>>>> The output does not give any explanation... Suspecting memory, I
>>>>>>> tried a large-memory node on our cluster, using 15000 MB with 220
>>>>>>> processors, but the program exits at the same point without a
>>>>>>> message, just killing the process.
>>>>>>> The output shows a warning:
>>>>>>>
>>>>>>>  *** WARNING in hfx_energy_potential.F:591 :: The Kohn Sham matrix is not  ***
>>>>>>>  *** 100% occupied. This may result in incorrect Hartree-Fock results. Try ***
>>>>>>>  *** to decrease EPS_PGF_ORB and EPS_FILTER_MATRIX in the QS section. For  ***
>>>>>>>  *** more information see FAQ: https://www.cp2k.org/faq:hfx_eps_warning    ***
>>>>>>>
>>>>>>> but I read this is not a very serious issue, and the calculation
>>>>>>> should continue and not crash.
>>>>>>> I also decreased EPS_PGF_ORB, but the warning and the problem
>>>>>>> persist.
>>>>>>>
>>>>>>> I do not know if the problem could be located in other parts of my
>>>>>>> input... for example I use PBE0-TC-LR (I use PBC in XY) and ADMM.
>>>>>>> In the ADMM options I use ADMM_PURIFICATION_METHOD = NONE, because
>>>>>>> I read that this (ADMM1) is the only variant usable with smearing
>>>>>>> calculations.
>>>>>>>
>>>>>>> I ran this system with PBE (for the first guess for PBE0), and
>>>>>>> there is no problem in that case.
>>>>>>> Moreover, I tried other CP2K versions (7.0, 6.1 and 5.1) compiled
>>>>>>> on the cluster with libint_max_am=6; the calculations also crash,
>>>>>>> but show this message:
>>>>>>>
>>>>>>>
>>>>>>>  *******************************************************************************
>>>>>>>  *   ___                                                                       *
>>>>>>>  *  /   \                                                                      *
>>>>>>>  * [ABORT]   CP2K and libint were compiled with different LIBINT_MAX_AM.       *
>>>>>>>  *  \___/                                                                      *
>>>>>>>  *    |                                                                        *
>>>>>>>  *  O/|                                                                        *
>>>>>>>  * /| |                                                                        *
>>>>>>>  * / \                                               hfx_libint_wrapper.F:134  *
>>>>>>>  *******************************************************************************
>>>>>>>
>>>>>>>
>>>>>>>  ===== Routine Calling Stack =====
>>>>>>>
>>>>>>>             2 hfx_create
>>>>>>>             1 CP2K
>>>>>>>
>>>>>>> It seems this problem is not present in version 7.1, as the
>>>>>>> program does not show it and the compilation information does not
>>>>>>> show a LIBINT_MAX_AM value...
>>>>>>>
>>>>>>> If somebody could give me some advice, I would appreciate it. :)
>>>>>>> I attach the input file, and the output file for 7.1 version.
>>>>>>>
>>>>>>> Regards - Lucas Lodeiro
>>>>>>>
>>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "cp2k" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to cp... at googlegroups.com.
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/cp2k/96479ce2-d8a3-4ccf-b55c-0e935878f1c0n%40googlegroups.com
>>>>>> <https://groups.google.com/d/msgid/cp2k/96479ce2-d8a3-4ccf-b55c-0e935878f1c0n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>> .
>>>>>>
>>>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "cp2k" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to cp... at googlegroups.com.
>>>
>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/cp2k/aa6b0a55-9d21-4da6-a3bb-f6f62ea0768bn%40googlegroups.com
>>> <https://groups.google.com/d/msgid/cp2k/aa6b0a55-9d21-4da6-a3bb-f6f62ea0768bn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20201124/d1d2e4ac/attachment.htm>


More information about the CP2K-user mailing list