[CP2K-user] [CP2K:14265] Re: Hybrid functional calculation problem
fabia... at gmail.com
Sun Nov 22 16:42:47 UTC 2020
Can cp2k access all the memory on the cluster? On Linux you can use
ulimit -s unlimited
to remove the stack size limit of a process; a too-small stack is a common cause of silent crashes.
I usually use SCREEN_ON_INITIAL_P. I found that for large systems it is
faster to run two energy minimizations with the keyword enabled (such that
the second restarts from a converged PBE0 wfn) than to run a single
minimization without SCREEN_ON_INITIAL_P. But that probably depends on the
system you simulate.
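For reference, a minimal sketch of how such a restart could look in the input (section and keyword names from the CP2K manual; the restart file name and the screening threshold are placeholders, adapt them to your setup):

```
&DFT
  ! restart from a previously converged wfn (file name is a placeholder)
  WFN_RESTART_FILE_NAME  system-RESTART.wfn
  &SCF
    SCF_GUESS RESTART
  &END SCF
  &XC
    &HF
      &SCREENING
        EPS_SCHWARZ 1.0E-6         ! screening threshold, adjust as needed
        SCREEN_ON_INITIAL_P TRUE   ! screen ERIs using the initial density matrix
      &END SCREENING
    &END HF
  &END XC
&END DFT
```

Note that SCREEN_ON_INITIAL_P only pays off when the restart density is already close to the converged one.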
You should converge the cutoff with respect to the properties that you are
interested in. Run a test system with increasing cutoff and look at, e.g.,
the energy, the PDOS, etc.
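Such a cutoff scan can be scripted; a minimal sketch, assuming a template input `template.inp` that contains a `CUTOFF` line (file names and the cutoff list are placeholders):

```shell
#!/bin/sh
# Generate one input per cutoff value from a template (names are placeholders).
for cutoff in 300 400 500 600; do
    # replace the CUTOFF line; anchored at line start so REL_CUTOFF is left untouched
    sed "s/^ *CUTOFF .*/      CUTOFF ${cutoff}/" template.inp > cutoff_${cutoff}.inp
done
# each input would then be run, e.g.:  cp2k.popt -i cutoff_400.inp -o cutoff_400.out
```

The anchored pattern matters: an unanchored `s/CUTOFF .*/.../` would also rewrite a `REL_CUTOFF` line.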
Number of sph. ERI's calculated on the fly: 4763533420139
This number should always be 0. If it is larger, increase the memory
available to cp2k for in-core ERI storage.
On Sunday, 22 November 2020 at 17:24:13 UTC+1 Lucas Lodeiro wrote:
> Dear Fabian,
> Thanks for your advice. I forgot to clarify the execution time... The
> calculation runs for 5 or 7 minutes and stops... the walltime for the
> calculation was set to 72 hrs, so I do not believe this is the problem.
> Now I am running the same input on a smaller cluster (different from the
> problematic one) with 64 processors and 250 GB RAM, and the calculation works
> fine (very slowly, 9 hr per SCF step, but it runs... the total RAM assigned for
> the ERI's is not sufficient, but the problem does not appear)... It is not
> practical to use this small cluster, so I need to fix the problem on the
> big one, to use more RAM and more processors (more than 220 is
> possible), but as the program does not show what is happening, I cannot
> tell the cluster admin anything to recompile or fix the problem. :(
> This is the output in the little cluster:
> Step Update method Time Convergence Total energy
> HFX_MEM_INFO| Est. max. program size before HFX [MiB]:
> *** WARNING in hfx_energy_potential.F:605 :: The Kohn Sham matrix is not
> *** 100% occupied. This may result in incorrect Hartree-Fock results. Try
> *** to decrease EPS_PGF_ORB and EPS_FILTER_MATRIX in the QS section. For
> *** more information see FAQ: https://www.cp2k.org/faq:hfx_eps_warning
> HFX_MEM_INFO| Number of cart. primitive ERI's calculated:
> HFX_MEM_INFO| Number of sph. ERI's calculated:
> HFX_MEM_INFO| Number of sph. ERI's stored in-core:
> HFX_MEM_INFO| Number of sph. ERI's stored on disk:
> HFX_MEM_INFO| Number of sph. ERI's calculated on the fly:
> HFX_MEM_INFO| Total memory consumption ERI's RAM [MiB]:
> HFX_MEM_INFO| Whereof max-vals [MiB]:
> HFX_MEM_INFO| Total compression factor ERI's RAM:
> HFX_MEM_INFO| Total memory consumption ERI's disk [MiB]:
> HFX_MEM_INFO| Total compression factor ERI's disk:
> HFX_MEM_INFO| Size of density/Fock matrix [MiB]:
> HFX_MEM_INFO| Size of buffers [MiB]:
> HFX_MEM_INFO| Number of periodic image cells considered:
> HFX_MEM_INFO| Est. max. program size after HFX [MiB]:
> 1 NoMix/Diag. 0.40E+00 ****** 5.46488333 -20625.2826573514
> About SCREEN_ON_INITIAL_P, I read that to use it you need a very good
> guess (better than the GGA-converged one), for example the last step or
> frame from a GEO_OPT or MD... Is it really useful when the guess is the GGA
> wfn?
> About the CUTOFF_RADIUS, I read that 6 or 7 is a good compromise, and
> as my cell is approximately twice that, I used the minimum image convention to
> decide on the value 8.62, which is near the recommended one (6 or 7). If I
> reduce it, does the computational cost diminish considerably?
> Regards - Lucas
> On Sun, 22 Nov 2020 at 12:53, fa... at gmail.com (<fa... at gmail.com>) wrote:
>> Dear Lucas,
>> cp2k computes the four-center integrals during (or prior to) the
>> first SCF cycle. I assume the job ran out of time during this task. For a
>> system with more than 1000 atoms this takes a lot of time; with only 220
>> CPUs it could in fact take several hours.
>> To speed up the calculation you should set SCREEN_ON_INITIAL_P T and
>> restart from a well-converged PBE wfn. Other than that, there is little
>> you can do other than assigning the job more time and/or CPUs. (Of course,
>> reducing CUTOFF_RADIUS from 8.62 would help too, but could negatively
>> affect the result.)
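>> For reference, the truncated Coulomb operator and its radius are set in the
>> &HF section; a sketch (the 0.25 PBE0 fraction is the usual choice, and the
>> data file path is an assumption about your installation):

```
&XC
  &HF
    FRACTION 0.25                 ! PBE0 exact-exchange fraction
    &INTERACTION_POTENTIAL
      POTENTIAL_TYPE TRUNCATED    ! truncated Coulomb operator
      CUTOFF_RADIUS 8.62          ! truncation radius; must be below half the cell
      T_C_G_DATA t_c_g.dat        ! data file shipped with cp2k
    &END INTERACTION_POTENTIAL
  &END HF
&END XC
```

>> Roughly, the ERI cost grows with the number of pairs inside the truncation
>> sphere, so a smaller radius does reduce the cost noticeably.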
>> On Sunday, 22 November 2020 at 01:21:05 UTC+1 Lucas Lodeiro wrote:
>>> Hi all,
>>> I need to perform a hybrid calculation with CP2K 7.1 on a big system
>>> (1000+ atoms). I studied the manual, the tutorials and some videos by CP2K
>>> developers to improve my input. But the program exits the calculation while
>>> the HF part is running... I watched the memory usage on the fly, and there is
>>> no peak which explains the failure (I used 4000 MB with 220 processors).
>>> The output does not show any explanation... Suspecting the memory, I
>>> tried a large-memory node on our cluster, using 15000 MB with 220 processors,
>>> but the program exits at the same point without a message, just killing the
>>> job.
>>> The output shows a warning:
>>> *** WARNING in hfx_energy_potential.F:591 :: The Kohn Sham matrix is not ***
>>> *** 100% occupied. This may result in incorrect Hartree-Fock results. Try ***
>>> *** to decrease EPS_PGF_ORB and EPS_FILTER_MATRIX in the QS section. For  ***
>>> *** more information see FAQ: https://www.cp2k.org/faq:hfx_eps_warning   ***
>>> but I read this is not a very complicated issue, and the calculation
>>> should continue and not crash.
>>> Also I decreased EPS_PGF_ORB, but the warning and the problem persist.
>>> I do not know if the problem could be located in other parts of my
>>> input... for example I use PBE0-T_C-LR (with PBC in XY) and ADMM. In
>>> the ADMM options I use ADMM_PURIFICATION_METHOD = NONE, because I read that
>>> ADMM1 is the only variant useful for smearing calculations.
>>> I ran this system with PBE (for the first guess of PBE0), and there is
>>> no problem in that case.
>>> Moreover, I tried other CP2K versions (7.0, 6.1 and 5.1) compiled
>>> on the cluster (with libint_max_am=6), and the calculation crashes, but
>>> shows this problem:
>>> *  ___                                                           *
>>> * /   \                                                          *
>>> * [ABORT] CP2K and libint were compiled with different           *
>>> * \___/  LIBINT_MAX_AM.                                          *
>>> *   |                                                            *
>>> * O/|                                                            *
>>> * /| |                                                           *
>>> * / \     hfx_libint_wrapper.F:134                               *
>>> ===== Routine Calling Stack =====
>>> 2 hfx_create
>>> 1 CP2K
>>> It seems this problem is not present in version 7.1, as the
>>> program does not show it, and the compilation information does not
>>> show the LIBINT_MAX_AM value...
>>> If somebody could give me some advice, I will appreciate it. :)
>>> I attach the input file and the output file for version 7.1.
>>> Regards - Lucas Lodeiro