[CP2K-user] [CP2K:19312] Memory Leak on CP2k 9.1

Quentin Pessemesse q.pessemesse at gmail.com
Fri Oct 6 12:59:02 UTC 2023


Dear Matthias, 
Thank you kindly for your advice; I will try these different versions as 
soon as possible.
I've built the Docker image for an OpenMPI version of CP2K on the cluster. 
With version 2023.1, I used to source the environment variables with 
"source /opt/cp2k-toolchain/install/setup". This no longer works. Is it a 
problem on the image's end or on the cluster's end?
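In the meantime, a quick way to check whether the setup script has simply 
moved inside the new image could be something like the following (the image 
tag and container runtime are placeholders, adjust to what is actually 
deployed):

  # Enter the image interactively (Docker shown; "apptainer shell" works analogously).
  docker run -it --rm cp2k/cp2k:2023.1 /bin/bash
  # Inside the container, look for a toolchain setup script, if the image ships one.
  find / -maxdepth 6 -name setup -path "*toolchain*install*" 2>/dev/null
  # If a path is printed, source that file before launching cp2k.psmp.
  source /path/printed/by/find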
Best,
Quentin


On Friday, 6 October 2023 at 11:27:26 UTC+2, Krack Matthias wrote:

> Hi Quentin
>
>  
>
> There are more CP2K 2023.2 Docker containers for production 
> <https://github.com/mkrack/cp2k/tree/master/tools/docker/production> 
> available (built with MPICH or OpenMPI), which can also be pulled with 
> apptainer (see the README.md 
> <https://github.com/mkrack/cp2k/blob/master/tools/docker/production/README.md> 
> for details). Maybe you have more luck with one of these.
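>
> For example, something along these lines should work with apptainer (the 
> image name and tag below are placeholders; take the real ones from the 
> README):
>
> # Pull a production image into a local .sif file and run cp2k.psmp from it.
> apptainer pull cp2k_2023.2.sif docker://mkrack/cp2k:<tag-from-README>
> apptainer exec cp2k_2023.2.sif cp2k.psmp -i input.inp -o output.out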
>
>  
>
> Best
>
>  
>
> Matthias
>
>  
>
> *From: *cp... at googlegroups.com <cp... at googlegroups.com> on behalf of 
> Quentin Pessemesse <q.pess... at gmail.com>
> *Date: *Friday, 6 October 2023 at 10:47
> *To: *cp2k <cp... at googlegroups.com>
> *Subject: *Re: [CP2K:19310] Memory Leak on CP2k 9.1
>
> Dear all, 
>
> The cluster staff has moved to using a Docker container with a CP2K 2023.1 
> image (https://hub.docker.com/r/cp2k/cp2k/tags). The program suffers from 
> serious memory leaks: an out-of-memory crash after less than 24 hours of 
> AIMD on a system of fewer than 100 atoms, with 256 GB of memory available. 
> The cluster cannot use Intel MPI versions older than Intel MPI 20. Is there 
> a more recent version of CP2K that is stable and does not show this type of 
> large memory leak?
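>
> For anyone trying to reproduce this, a crude way to watch the leak on the 
> compute node is a loop like the one below, which just tracks the total 
> resident memory of the cp2k.psmp ranks over time (a sketch, not our exact 
> monitoring setup):
>
> # Sum the resident set size (kB) of all cp2k.psmp ranks once per minute.
> while true; do
>     echo -n "$(date +%T) "
>     ps -C cp2k.psmp -o rss= | awk '{s+=$1} END {print s " kB"}'
>     sleep 60
> done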
>
> We've tried to compile our own versions of CP2K with multiple versions of 
> OpenMPI, to no avail. The only stable CP2K version we have is CP2K 6.1, 
> which runs with Intel MPI 18, but it sits on a legacy container where no 
> new software can be installed.
>
> Has anyone managed to use this Docker image successfully, and if so, which 
> MPI package/version did you use? If necessary, we can downgrade to CP2K 9.1.
>
> Best,
>
> Quentin
>
>  
>
> On Wednesday, 5 October 2022 at 13:19:26 UTC+2, Krack Matthias (PSI) wrote:
>
> Hi Quentin
>
>  
>
> It seems that you are using OpenMPI, which is known to have memory leaks in 
> some versions. Check this issue 
> <https://github.com/cp2k/cp2k/issues/1830#issuecomment-1012561166> and this 
> discussion <https://groups.google.com/g/cp2k/c/BJ9c21ey0Ls/m/2UDxnhBRAQAJ> 
> on this forum for further information.
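>
> You can check which MPI implementation and version your cp2k.psmp binary 
> actually uses with something like the following (the binary path is a 
> placeholder):
>
> # Version of the mpirun used to launch the job (prints "Open MPI x.y.z" for OpenMPI).
> mpirun --version
> # MPI library the binary is dynamically linked against.
> ldd /path/to/cp2k.psmp | grep -i mpi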
>
>  
>
> HTH
>
>  
>
> Matthias 
>
>  
>
> *From: *"cp... at googlegroups.com" <cp... at googlegroups.com> on behalf of 
> Quentin Pessemesse <q.pess... at gmail.com>
> *Reply to: *"cp... at googlegroups.com" <cp... at googlegroups.com>
> *Date: *Wednesday, 5 October 2022 at 12:39
> *To: *"cp... at googlegroups.com" <cp... at googlegroups.com>
> *Subject: *[CP2K:17807] Memory Leak on CP2k 9.1
>
>  
>
> Dear all, 
>
> Our group is encountering a memory leak issue that makes running DFT-MD 
> impossible with large systems (~100 atoms) on one of the clusters we have 
> access to, even though the same calculations run correctly on other 
> machines.
>
> The cluster support sent me the following valgrind output and asked me to 
> find suggestions on how to proceed. Does anyone have input on how to deal 
> with such memory leaks?
>
> Best,
>
> Quentin P.
>
>  
>
> ==62== Invalid write of size 4
> ==62==    at 0x1EA9887: grid_ref_create_task_list (in /ccc/products2/cp2k-9.1/Rhel_8__x86_64/gcc--8.3.0__openmpi--4.0.1/plumed/bin/cp2k.psmp)
> ==62==    by 0x1E7A772: grid_create_task_list (in /ccc/products2/cp2k-9.1/Rhel_8__x86_64/gcc--8.3.0__openmpi--4.0.1/plumed/bin/cp2k.psmp)
> ==62==    by 0x1E790B3: __grid_api_MOD_grid_create_task_list (grid_api.F:938)
> ==62==    by 0x104AA67: __task_list_methods_MOD_generate_qs_task_list (task_list_methods.F:623)
> ==62==    by 0xF58353: __qs_update_s_mstruct_MOD_qs_env_update_s_mstruct (qs_update_s_mstruct.F:187)
> ==62==    by 0xCC03AB: __qs_energy_init_MOD_qs_energies_init (qs_energy_init.F:311)
> ==62==    by 0xCBF0A1: __qs_energy_MOD_qs_energies (qs_energy.F:84)
> ==62==    by 0xCE087E: __qs_force_MOD_qs_forces (qs_force.F:212)
> ==62==    by 0xCE4349: __qs_force_MOD_qs_calc_energy_force (qs_force.F:117)
> ==62==    by 0x9AE2C0: __force_env_methods_MOD_force_env_calc_energy_force (force_env_methods.F:271)
> ==62==    by 0x50CD0C: __md_run_MOD_qs_mol_dyn_low (md_run.F:372)
> ==62==    by 0x50DCF2: __md_run_MOD_qs_mol_dyn (md_run.F:153)
> ==62==  Address 0x26d18670 is 16 bytes before a block of size 10 free'd
> ==62==    at 0x4C35FAC: free (vg_replace_malloc.c:538)
> ==62==    by 0x2B73E68: __offload_api_MOD_offload_timeset (offload_api.F:137)
> ==62==    by 0x2B60EDA: __timings_MOD_timeset_handler (timings.F:278)
> ==62==    by 0x2BE2C6D: __message_passing_MOD_mp_waitany (message_passing.F:4597)
> ==62==    by 0x2963EA5: __realspace_grid_types_MOD_rs_pw_transfer_distributed (realspace_grid_types.F:1439)
> ==62==    by 0x2966559: __realspace_grid_types_MOD_rs_pw_transfer (realspace_grid_types.F:711)
> ==62==    by 0xC9310B: __qs_collocate_density_MOD_calculate_rho_core (qs_collocate_density.F:966)
> ==62==    by 0xF57698: __qs_update_s_mstruct_MOD_qs_env_update_s_mstruct (qs_update_s_mstruct.F:109)
> ==62==    by 0xCC03AB: __qs_energy_init_MOD_qs_energies_init (qs_energy_init.F:311)
> ==62==    by 0xCBF0A1: __qs_energy_MOD_qs_energies (qs_energy.F:84)
> ==62==    by 0xCE087E: __qs_force_MOD_qs_forces (qs_force.F:212)
> ==62==    by 0xCE4349: __qs_force_MOD_qs_calc_energy_force (qs_force.F:117)
> ==62==  Block was alloc'd at
> ==62==    at 0x4C34DFF: malloc (vg_replace_malloc.c:307)
> ==62==    by 0x2F21116: _gfortrani_xmallocarray (memory.c:66)
> ==62==    by 0x2F1C271: _gfortran_string_trim (string_intrinsics_inc.c:167)
> ==62==    by 0x2B73E1C: __offload_api_MOD_offload_timeset (offload_api.F:137)
> ==62==    by 0x2B60EDA: __timings_MOD_timeset_handler (timings.F:278)
> ==62==    by 0x2BE2C6D: __message_passing_MOD_mp_waitany (message_passing.F:4597)
> ==62==    by 0x2963EA5: __realspace_grid_types_MOD_rs_pw_transfer_distributed (realspace_grid_types.F:1439)
> ==62==    by 0x2966559: __realspace_grid_types_MOD_rs_pw_transfer (realspace_grid_types.F:711)
> ==62==    by 0xC9310B: __qs_collocate_density_MOD_calculate_rho_core (qs_collocate_density.F:966)
> ==62==    by 0xF57698: __qs_update_s_mstruct_MOD_qs_env_update_s_mstruct (qs_update_s_mstruct.F:109)
> ==62==    by 0xCC03AB: __qs_energy_init_MOD_qs_energies_init (qs_energy_init.F:311)
> ==62==    by 0xCBF0A1: __qs_energy_MOD_qs_energies (qs_energy.F:84)
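>
> (For reference, the cluster support obtained this with valgrind; an 
> invocation along the lines below produces this kind of report, although I 
> do not know the exact flags and paths they used.)
>
> # Run the failing input under valgrind's memcheck; --leak-check and
> # --track-origins make the report more detailed but slow the run down a lot.
> mpirun -np 2 valgrind --leak-check=full --track-origins=yes \
>     /path/to/cp2k.psmp -i input.inp -o output.out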
>


