[CP2K-user] Diagnosing source of Out of Memory error
Bach Nguyen
nguy... at umn.edu
Tue Apr 27 19:33:14 UTC 2021
I'm brand new to CP2K and trying to figure out why I cannot run DFT QM/MM
MD simulations. With the semi-empirical AM1 method (SE AM1) the run works
fine, but the DFT QM/MM run keeps failing with out-of-memory (OOM) errors.
I am attaching the input and the last few lines of the trace output.
Looking through previous posts, it looks like the issue may be with OMPI,
but here are my specs. I tried running on all 128 threads, but the job still
dies. Memory is 2GB/thread - 128GB in total.
GLOBAL| Total number of message passing processes                64
GLOBAL| Number of threads for this process                       48
GLOBAL| This output is from process                               0
GLOBAL| CPU model name              AMD EPYC 7702 64-Core Processor
GLOBAL| CPUID                                                  1001
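As a quick sanity check on these numbers, here is a minimal sketch of the arithmetic. The process and thread counts are copied from the GLOBAL lines above, and the hardware figures are the ones quoted in this post. It assumes that all MPI processes share the one 128GB pool of memory, which the output does not confirm.

```python
# Values copied from the GLOBAL lines of the CP2K output.
mpi_processes = 64
threads_per_process = 48

# Hardware figures as quoted in this post.
hardware_threads = 128
total_memory_gb = 128

# Assumption: every MPI process runs on the same machine and shares the
# same 128 GB of memory. The output itself does not confirm this.
requested_threads = mpi_processes * threads_per_process
oversubscription = requested_threads / hardware_threads
memory_per_process_gb = total_memory_gb / mpi_processes

print(f"requested threads:      {requested_threads}")
print(f"threads per hw thread:  {oversubscription:.1f}")
print(f"memory per MPI process: {memory_per_process_gb:.1f} GB")
```

If that assumption holds, the job asks for far more threads than the 128 available, and each MPI process has about 2GB to work with.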
-------------- next part --------------
--------------------------------------------------------
level_shift [a.u.]: 0.00
--------------------------------------------------------
Outer loop SCF in use
No variables optimised in outer loop
eps_scf 1.00E-06
max_scf 10
No outer loop optimization
step_size 5.00E-01
000000:000032<< 4 1 scf_c_write_parameters 0.004 Hostmem: 456 MB GPUmem: 0 MB
000000:000032>> 4 1 qs_env_setup start Hostmem: 456 MB GPUmem: 0 MB
000000:000032>> 5 1 qs_env_rebuild_pw_env start Hostmem: 456 MB GPUmem: 0 MB
000000:000032>> 6 1 pw_env_create start Hostmem: 456 MB GPUmem: 0 MB
000000:000032<< 6 1 pw_env_create 0.000 Hostmem: 456 MB GPUmem: 0 MB
000000:000032>> 6 1 pw_env_rebuild start Hostmem: 456 MB GPUmem: 0 MB
000000:000032>> 7 3 mp_comm_dup start Hostmem: 456 MB GPUmem: 0 MB
000000:000032<< 7 3 mp_comm_dup 0.000 Hostmem: 456 MB GPUmem: 0 MB
000000:000032>> 7 728 mp_environ_l start Hostmem: 456 MB GPUmem: 0 MB
000000:000032<< 7 728 mp_environ_l 0.000 Hostmem: 456 MB GPUmem: 0 MB
000000:000032>> 7 2 pw_grid_setup start Hostmem: 456 MB GPUmem: 0 MB
000000:000032>> 8 2 pw_grid_setup_internal start Hostmem: 456 MB GPUmem: 0 MB
000000:000032>> 9 10 mp_sum_l start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 9 10 mp_sum_l 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 9 3 mp_sum_im start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 9 3 mp_sum_im 0.021 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 9 2 pw_grid_distribute start Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 2 mp_cart_create start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 2 mp_cart_create 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 2 mp_cart_rank start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 2 mp_cart_rank 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 33 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 33 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 34 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 34 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 35 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 35 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 36 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 36 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 37 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 37 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 38 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 38 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 39 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 39 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 40 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 40 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 41 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 41 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 42 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 42 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 43 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 43 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 44 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 44 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 45 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 45 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 46 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 46 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 47 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 47 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 48 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 48 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 49 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 49 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 50 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 50 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 51 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 51 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 52 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 52 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 53 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 53 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 54 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 54 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 55 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 55 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 56 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 56 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 57 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 57 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 58 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 58 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 59 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 59 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 60 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 60 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 61 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 61 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 62 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 62 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 63 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 63 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 64 mp_cart_coords start Hostmem: 457 MB GPUmem: 0 MB
000000:000032<< 10 64 mp_cart_coords 0.000 Hostmem: 457 MB GPUmem: 0 MB
000000:000032>> 10 2 mp_sum_iv start Hostmem: 475 MB GPUmem: 0 MB
000000:000032<< 10 2 mp_sum_iv 0.002 Hostmem: 475 MB GPUmem: 0 MB
000000:000032<< 9 2 pw_grid_distribute 0.090 Hostmem: 475 MB GPUmem: 0 MB
000000:000032>> 9 2 pw_grid_allocate start Hostmem: 475 MB GPUmem: 0 MB
000000:000032<< 9 2 pw_grid_allocate 0.000 Hostmem: 475 MB GPUmem: 0 MB
000000:000032>> 9 2 pw_grid_assign start Hostmem: 475 MB GPUmem: 0 MB
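To see where a trace like this stops and how much host memory it reports, a small script can help. The following is only a sketch: the regular expression is inferred from the line layout shown above, not from any CP2K documentation, and `summarize_trace` is a hypothetical helper name. It treats `>>` as routine entry and `<<` as routine exit, and leaves the `000000:000032` prefix uninterpreted.

```python
import re

# Pattern inferred from the trace lines in this post (an assumption, not an
# official format description): prefix, direction, depth, call count,
# routine name, "start" or elapsed time, then the Hostmem figure in MB.
TRACE_LINE = re.compile(
    r"^(?P<prefix>\d+:\d+)(?P<direction>>>|<<)\s+"
    r"(?P<depth>\d+)\s+(?P<calls>\d+)\s+(?P<routine>\S+)\s+"
    r"(?P<status>start|[\d.]+)\s+Hostmem:\s+(?P<hostmem>\d+)\s+MB"
)


def summarize_trace(lines):
    """Return (routines entered but not exited, peak Hostmem in MB)."""
    open_calls = []
    peak_mb = 0
    for line in lines:
        match = TRACE_LINE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a trace line
        peak_mb = max(peak_mb, int(match["hostmem"]))
        if match["direction"] == ">>":
            open_calls.append(match["routine"])
        elif open_calls and open_calls[-1] == match["routine"]:
            open_calls.pop()
    return open_calls, peak_mb


# Usage on a few lines taken from the end of the trace above.
sample = [
    "000000:000032>> 9 2 pw_grid_distribute start Hostmem: 457 MB GPUmem: 0 MB",
    "000000:000032>> 10 2 mp_sum_iv start Hostmem: 475 MB GPUmem: 0 MB",
    "000000:000032<< 10 2 mp_sum_iv 0.002 Hostmem: 475 MB GPUmem: 0 MB",
    "000000:000032<< 9 2 pw_grid_distribute 0.090 Hostmem: 475 MB GPUmem: 0 MB",
    "000000:000032>> 9 2 pw_grid_assign start Hostmem: 475 MB GPUmem: 0 MB",
]
open_calls, peak_mb = summarize_trace(sample)
print("still open:", open_calls)
print("peak Hostmem:", peak_mb, "MB")
```

On the lines shown, the last routine entered without a matching exit is `pw_grid_assign`, and the reported Hostmem never goes above 475 MB.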
-------------- next part --------------
A non-text attachment was scrubbed...
Name: metadynamics_dft.inp
Type: chemical/x-gamess-input
Size: 4116 bytes
Desc: not available
URL: <https://lists.cp2k.org/archives/cp2k-user/attachments/20210427/e2c8f845/attachment.inp>