Running CDFT Tutorial Calculation on Cluster

Nico Holmberg holmbe... at gmail.com
Tue Sep 11 19:43:57 UTC 2018


Hi Brian,

Sorry for the long delay in replying; I had a couple of tight deadlines 
that required my full attention. 

I compiled CP2K using version 17.0.4 20170411 of the Intel Fortran 
compiler, Intel MPI and MKL. You can find my arch file below. I ran the 
tutorial files with 1, 2, and 24 MPI processes and did not encounter any 
issues. 

Looking at the stack trace you included in your last post, it seems that 
the calculation is crashing somewhere inside the MPI I/O routine that CP2K 
is calling. This looks like a library issue to me. Are you able to provide 
any more information about how your binary has been compiled?
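
To make that reading of the trace concrete, here is a minimal Python sketch 
(not part of CP2K; the function names parse_frames and summarize are 
made up for illustration). It assumes each frame sits on one line with the 
five columns Image, PC, Routine, Line and Source, and it reports the 
innermost frame with a known routine name plus the innermost frame that 
carries source information. The sample frames are copied from the trace 
quoted further down; the routine names are truncated in the original output 
and are left as they are.

```python
import re

# Innermost frames copied from the quoted trace, one frame per line
# (columns: Image, PC, Routine, Line, Source).
TRACE = """\
cp2k.popt          000000000D730E14  Unknown               Unknown  Unknown
libpthread-2.17.s  00002ABEBC47F5E0  Unknown               Unknown  Unknown
libmpi.so.12       00002ABEBD7DA1BA  PMPI_File_write_a     Unknown  Unknown
libmpifort.so.12.  00002ABEBCF2F1AE  pmpi_file_write_a     Unknown  Unknown
cp2k.popt          0000000002F17BCC  message_passing_m        3315  message_passing.F
cp2k.popt          0000000002A33AFC  realspace_grid_cu         698  realspace_grid_cube.F
"""

# Five whitespace-separated fields; the PC is a 16-digit hex address.
FRAME = re.compile(r"^(\S+)\s+([0-9A-Fa-f]{16})\s+(\S+)\s+(\S+)\s+(\S+)$")


def parse_frames(text):
    """Return the frames in order, innermost first, skipping non-frame lines."""
    frames = []
    for raw in text.splitlines():
        line = raw.lstrip("> ").strip()  # tolerate email quote markers
        match = FRAME.match(line)
        if match:
            image, pc, routine, lineno, source = match.groups()
            frames.append({"image": image, "pc": pc, "routine": routine,
                           "line": lineno, "source": source})
    return frames


def summarize(frames):
    """Pick the innermost named routine and the innermost frame with source info."""
    innermost = next((f for f in frames if f["routine"] != "Unknown"), None)
    caller = next((f for f in frames if f["source"] != "Unknown"), None)
    return innermost, caller


if __name__ == "__main__":
    innermost, caller = summarize(parse_frames(TRACE))
    print("crash inside:", innermost["image"], innermost["routine"])
    print("called from: ", caller["source"], "line", caller["line"])
```

If the innermost named frame belongs to an MPI shared library and the first 
frame with source information is in message_passing.F, the crash is 
happening inside the MPI library call itself, which is why I suspect the 
library rather than the CP2K input.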

By the way, if you have access to the latest development version of CP2K 
(dated yesterday), you can disable MPI I/O to force CP2K to use the serial 
versions of the cube writer/reader. This works around the crash without 
fixing the underlying problem. See the discussion in this post 
<https://groups.google.com/forum/#!topic/cp2k/RgsNKmQtVXw> for more 
information.

# Bare bones arch file for building CP2K with the Intel compilation suite
# Tested with ifort (IFORT) + Intel MPI + MKL version 17.0.4 20170411 

# Build tools
CC       = icc
CPP      =
FC       = mpiifort
LD       = mpiifort
AR       = ar -r

# Flags and libraries
CPPFLAGS =

DFLAGS   = -D__BLACS -D__INTEL -D__MKL -D__FFTW3 -D__parallel -D__SCALAPACK \
           -D__HAS_NO_SHARED_GLIBC

CFLAGS   = $(DFLAGS)

FCFLAGS  = $(DFLAGS) -O2 -g -traceback -fp-model precise -fp-model source -free \
           -I$(MKLROOT)/include -I$(MKLROOT)/include/fftw

LDFLAGS  = $(FCFLAGS)

LDFLAGS_C = $(FCFLAGS) -nofor_main

LIBS     = -Wl,--start-group \
           $(MKLROOT)/lib/intel64/libmkl_scalapack_lp64.a \
           $(MKLROOT)/lib/intel64/libmkl_intel_lp64.a \
           $(MKLROOT)/lib/intel64/libmkl_sequential.a \
           $(MKLROOT)/lib/intel64/libmkl_core.a \
           $(MKLROOT)/lib/intel64/libmkl_blacs_intelmpi_lp64.a \
           -Wl,--end-group \
           -lpthread -lm -ldl

# Required due to memory leak that occurs if high optimisations are used
mp2_optimize_ri_basis.o: mp2_optimize_ri_basis.F
	$(FC) -c $(subst O2,O0,$(FCFLAGS)) $<


On Thursday, August 9, 2018 at 19:51:08 UTC+3, Brian Day wrote:
>
> Actually, the error message is slightly different, see below:
>
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
>
> Image              PC                Routine            Line        Source
> cp2k.popt          000000000D730E14  Unknown               Unknown  Unknown
> libpthread-2.17.s  00002ABEBC47F5E0  Unknown               Unknown  Unknown
> libmpi.so.12       00002ABEBD7DA1BA  PMPI_File_write_a     Unknown  Unknown
> libmpifort.so.12.  00002ABEBCF2F1AE  pmpi_file_write_a     Unknown  Unknown
> cp2k.popt          0000000002F17BCC  message_passing_m        3315  message_passing.F
> cp2k.popt          0000000002A33AFC  realspace_grid_cu         698  realspace_grid_cube.F
> cp2k.popt          0000000002A31F4D  realspace_grid_cu         211  realspace_grid_cube.F
> cp2k.popt          0000000000A06F9D  cp_realspace_grid          64  cp_realspace_grid_cube.F
> cp2k.popt          0000000000A9F32B  qs_scf_post_gpw_m        2651  qs_scf_post_gpw.F
> cp2k.popt          0000000000A883B1  qs_scf_post_gpw_m        2001  qs_scf_post_gpw.F
> cp2k.popt          0000000000EB6610  qs_scf_post_scf_m          70  qs_scf_post_scf.F
> cp2k.popt          00000000017A267F  qs_scf_mp_scf_            285  qs_scf.F
> cp2k.popt          0000000000BA7709  qs_energy_mp_qs_e          86  qs_energy.F
> cp2k.popt          0000000000C52681  qs_force_mp_qs_ca         115  qs_force.F
> cp2k.popt          000000000096F4AA  force_env_methods         242  force_env_methods.F
> cp2k.popt          000000000043BCAC  cp2k_runs_mp_run_         323  cp2k_runs.F
> cp2k.popt          0000000000432814  MAIN__                    281  cp2k.F
> cp2k.popt          000000000043151E  Unknown               Unknown  Unknown
> libc-2.17.so       00002ABEBE194C05  __libc_start_main     Unknown  Unknown
> cp2k.popt          0000000000431429  Unknown               Unknown  Unknown
>
> Thanks again for all your help so far!
>
> -Brian
>
> On Thursday, August 9, 2018 at 12:49:45 PM UTC-4, Brian Day wrote:
>>
>> Hi Nico,
>>
>> Sorry for the long delayed reply, I had forgotten to check this thread 
>> for some time! 
>>
>> ifort --version returns: ifort (IFORT) 17.0.04 20170411.
>> Additionally, I get the same error message when I reduce the number of 
>> mpi tasks to 4 (2 per node, 2 nodes).
>>
>> Best,
>>      Brian
>>
>