[CP2K-user] [CP2K:17755] Possible Memory Leak
'Francois Gygi' via cp2k
cp2k at googlegroups.com
Sun Sep 25 20:15:59 UTC 2022
If you are using the Intel MKL library, you may consider changing the
version of the library. We have observed (with Qbox, another code) memory
leaks when using
(1) intel/19.1.1
(2) intelmpi/2019.up7+intel-19.1.1
(3) mkl/2020.up1
and the leaks disappeared when switching to older versions
(1) intel/18.0
(2) intelmpi/2018.2.199+intel-18.0
(3) mkl/2018.up2
There is also a documented risk of memory leak when using the fast matrix
multiply in the oneAPI Intel MKL library:
https://www.intel.com/content/www/us/en/develop/documentation/onemkl-linux-developer-guide/top/managing-performance-and-memory/using-memory-functions/avoiding-memory-leaks-in-onemkl.html
Francois
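
A minimal sketch of one way to try the MKL-related suggestion above without rebuilding. It assumes that the environment variable MKL_DISABLE_FAST_MM=1, which the oneMKL developer guide describes for switching off MKL's internal buffer caching, is honoured by the MKL version in use. The helper name mkl_leak_workaround_env is hypothetical and not part of CP2K or MKL.

```python
import os


def mkl_leak_workaround_env(base_env=None):
    """Return a copy of an environment with MKL's fast memory manager disabled.

    Assumption: the linked MKL build reads MKL_DISABLE_FAST_MM at start-up.
    The caller's mapping (or os.environ) is copied, never modified in place.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MKL_DISABLE_FAST_MM"] = "1"
    return env


if __name__ == "__main__":
    env = mkl_leak_workaround_env({"PATH": "/usr/bin"})
    print(env["MKL_DISABLE_FAST_MM"])
```

The returned mapping could then be passed as the env argument of subprocess.run when launching the MPI job. Whether this removes the growth seen in this thread is untested here, and disabling the buffer cache may cost some performance.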
On Friday, September 23, 2022 at 9:00:17 AM UTC-7 Matthias Krack wrote:
> Dear Matthew
>
> The memory growth for v2022.1 does not look too bad, and it may be small
> enough to survive longer runs.
>
> I remember issues with memory leaks caused by the MPI implementation.
> OpenMPI in particular showed such problems in the past, which is why I used
> only MPICH for years: leak checking in CP2K was impossible with OpenMPI.
> See, for instance, this issue from January of this year:
> <https://github.com/cp2k/cp2k/issues/1830>
>
> The presence of memory leaks usually does not imply that the results are
> wrong.
>
> Best regards
>
> Matthias
>
> *From: *"cp... at googlegroups.com" <cp... at googlegroups.com> on behalf of
> Matthew Emerson <mr... at uiowa.edu>
> *Reply to: *"cp... at googlegroups.com" <cp... at googlegroups.com>
> *Date: *Friday, 23 September 2022 at 16:55
> *To: *"cp... at googlegroups.com" <cp... at googlegroups.com>
> *Subject: *Re: [CP2K:17749] Possible Memory Leak
>
>
>
> Dear Dr. Matthias,
>
> Sorry for the late response; I wanted to test these things carefully.
> Below is what I have found.
>
> All versions of CP2K that I have tested still appear to have a memory
> leak. The issue is less severe in version 2022.1, but as the graph
> "CP2K-2022.1.png" shows, memory usage keeps growing with time, and our
> runs are long. For version 8.1 (graph "CP2K-8.1.png"), the problem is
> much more severe on the same hardware (56 MPI ranks).
>
> My primary concern is the correctness of the calculations. We can of
> course restart the program, but how do I know that a code that appears to
> be leaking is not also corrupting data? If you continue the run you
> started, I predict its memory usage will keep growing. Thank you for your
> time.
>
>
> Sincerely,
> Matthew S. Emerson
>
> On Wednesday, September 21, 2022 at 9:23:41 AM UTC-5 Matthias Krack wrote:
>
> Hi Matthew
>
>
>
> I used this arch file
> <https://github.com/cp2k/cp2k/blob/master/arch/Linux-gnu-x86_64.psmp> to
> build the current cp2k release version 2022.1 on our local cluster,
> basically by running
>
> · source arch/Linux-gnu-x86_64.psmp
>
> in the main cp2k folder and then running make, as proposed once the cp2k
> toolchain has been built successfully. The same procedure is used for the
> continuous regression testing (see the first two entries in the CP2K
> dashboard <https://dashboard.cp2k.org/index.html>; click on the “OK” link
> to see the details).
>
> HTH
>
> Matthias
>
> *From: *"cp... at googlegroups.com" <cp... at googlegroups.com> on behalf of
> Matthew Emerson <mr... at uiowa.edu>
> *Reply to: *"cp... at googlegroups.com" <cp... at googlegroups.com>
> *Date: *Wednesday, 21 September 2022 at 15:38
> *To: *"cp... at googlegroups.com" <cp... at googlegroups.com>
> *Subject: *Re: [CP2K:17727] Possible Memory Leak
>
>
>
> Dear Dr. Matthias,
>
> Do you have an ARCH file for this build that you could point me to? I
> would like to build using the same settings to test at ORNL.
>
> Sincerely,
>
> Matthew S. Emerson
>
> On Wednesday, September 21, 2022 at 4:00:21 AM UTC-5 Matthias Krack wrote:
>
> Hello Matthew
>
>
>
> I have run your case on our local compute cluster with CP2K v2022.1 (gnu
> 11.2.0, OpenMPI 4.1.3) using 144 CPU cores. I observe only a small increase
> in memory usage after the usual initial growth during the first MD steps
> (see attached plot).
>
> Best regards
>
> Matthias
>
> *From: *"cp... at googlegroups.com" <cp... at googlegroups.com> on behalf of
> Matthew Emerson <mr... at uiowa.edu>
> *Reply to: *"cp... at googlegroups.com" <cp... at googlegroups.com>
> *Date: *Tuesday, 20 September 2022 at 23:21
> *To: *"cp... at googlegroups.com" <cp... at googlegroups.com>
> *Subject: *[CP2K:17721] Possible Memory Leak
>
>
>
> Dear CP2K Developers/Community,
>
>
>
> I have attached an input file which I believe shows an example of a
> possible memory leak in CP2K. It is a typical NVT DFT simulation of molten
> MgCl2 with PBE-D3 dispersion corrections.
>
> I have tried well-tested CP2K builds on our local cluster (v6.1, v8.1,
> v2022.1), our university supercomputer (v6.1, v8.1), and even the
> system-wide installations of CP2K on Cori at Oak Ridge National Lab (v8.1
> and v9.1). In every case, memory usage grows linearly with time until
> either the node locks up or dies from insufficient memory, or the job is
> killed at the maximum walltime (ORNL). I have done enough testing that I
> can almost predict how many MD steps a job will survive on a given machine
> with X amount of RAM and Y MPI ranks.
>
> I normally wouldn’t email about something like this, but I’ve tried
> multiple combinations (with unit testing) of GCC, OpenMPI, OpenBLAS/MKL,
> etc., and nothing seems to work. I am hoping this is simply an input file
> issue or my own error.
>
>
>
> Any help will be much appreciated.
>
> Matthew S. Emerson
> Margulis Research Group
> Department of Chemistry
> The University of Iowa
>
> --
> You received this message because you are subscribed to the Google Groups
> "cp2k" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to cp2k+uns... at googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/cp2k/0c20a8f2-3658-4429-a2e9-7f4aa0edb321n%40googlegroups.com
> <https://groups.google.com/d/msgid/cp2k/0c20a8f2-3658-4429-a2e9-7f4aa0edb321n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
>
>
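
The original report at the bottom of this thread describes memory usage that grows linearly with the number of MD steps, to the point where the step at which a job will die can almost be predicted. A minimal sketch of that extrapolation, using an ordinary least-squares line, is given below. The function name and the sample numbers are purely illustrative and are not measurements from this thread.

```python
def steps_until_limit(steps, mem_gib, limit_gib):
    """Fit mem = intercept + slope * step and return the step at which the
    fitted line reaches limit_gib, or None if memory is not growing."""
    n = len(steps)
    mean_x = sum(steps) / n
    mean_y = sum(mem_gib) / n
    # Ordinary least-squares slope and intercept.
    sxx = sum((x - mean_x) ** 2 for x in steps)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(steps, mem_gib))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope <= 0:
        return None
    return (limit_gib - intercept) / slope


if __name__ == "__main__":
    # Hypothetical samples: MD step number and resident memory per node (GiB).
    sample_steps = [0, 100, 200, 300]
    sample_mem = [10.0, 10.5, 11.0, 11.5]
    print(steps_until_limit(sample_steps, sample_mem, limit_gib=64.0))
```

With real data, the samples would be taken after the usual initial growth during the first MD steps, so that the fit reflects only the steady increase.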