Running CDFT Tutorial Calculation on Cluster
Nico Holmberg
holmbe... at gmail.com
Tue Sep 11 19:43:57 UTC 2018
Hi Brian,
Sorry for the long delay in replying; I had a couple of tight deadlines
that required my full attention.
I compiled CP2K using version 17.0.4 20170411 of the Intel Fortran
compiler, Intel MPI, and MKL. You can find my arch file below. I ran the
tutorial files with 1, 2, and 24 MPI processes and did not encounter any
issues.
Looking at the stack trace you included in your last post, it seems that
the calculation is crashing somewhere inside the MPI I/O routine that CP2K
is calling. This looks like a library issue to me. Are you able to provide
any more information about how your binary has been compiled?
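The binary can also report some of this itself: running it with the
--version flag prints the version string and the cp2kflags it was compiled
with (the exact output varies between CP2K versions), and ldd shows which
MPI library it is dynamically linked against. For example:

# query build information straight from the binary
cp2k.popt --version
# check which MPI library the binary is actually linked against
ldd cp2k.popt | grep -i mpi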
By the way, if you have access to the latest development version of CP2K
(dated yesterday), you can disable MPI I/O to force CP2K to use the serial
versions of the cube writer/reader. This will work around the crash without
fixing the underlying library problem. See the discussion in this post
<https://groups.google.com/forum/#!topic/cp2k/RgsNKmQtVXw> for more
information.
# Bare bones arch file for building CP2K with the Intel compilation suite
# Tested with ifort (IFORT) + Intel MPI + MKL version 17.0.4 20170411
# Build tools
CC = icc
CPP =
FC = mpiifort
LD = mpiifort
AR = ar -r
# Flags and libraries
CPPFLAGS =
DFLAGS = -D__BLACS -D__INTEL -D__MKL -D__FFTW3 -D__parallel -D__SCALAPACK \
         -D__HAS_NO_SHARED_GLIBC
CFLAGS = $(DFLAGS)
FCFLAGS = $(DFLAGS) -O2 -g -traceback -fp-model precise -fp-model source -free \
          -I$(MKLROOT)/include -I$(MKLROOT)/include/fftw
LDFLAGS = $(FCFLAGS)
LDFLAGS_C = $(FCFLAGS) -nofor_main
LIBS = -Wl,--start-group \
$(MKLROOT)/lib/intel64/libmkl_scalapack_lp64.a \
$(MKLROOT)/lib/intel64/libmkl_intel_lp64.a \
$(MKLROOT)/lib/intel64/libmkl_sequential.a \
$(MKLROOT)/lib/intel64/libmkl_core.a \
$(MKLROOT)/lib/intel64/libmkl_blacs_intelmpi_lp64.a \
-Wl,--end-group \
-lpthread -lm -ldl
# Required due to memory leak that occurs if high optimisations are used
mp2_optimize_ri_basis.o: mp2_optimize_ri_basis.F
$(FC) -c $(subst O2,O0,$(FCFLAGS)) $<
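For reference, here is a rough sketch of how I build with an arch file like
this one. The arch file name, the Intel environment script path, and the job
width are assumptions you will need to adapt to your cluster:

# set up the Intel 17 compiler, MPI and MKL environment (path is site-specific)
source /opt/intel/bin/compilervars.sh intel64
# save the arch file as e.g. Linux-x86-64-intel.popt inside the cp2k/arch directory,
# then build the parallel binary from the directory that holds CP2K's Makefile
cd cp2k/makefiles
make -j 16 ARCH=Linux-x86-64-intel VERSION=popt
# the resulting binary ends up in cp2k/exe/Linux-x86-64-intel/cp2k.popt

If MKLROOT is not defined after sourcing the compiler environment, source the
MKL environment script (mklvars.sh intel64) as well, since the arch file
relies on that variable.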
On Thursday, 9 August 2018 at 19:51:08 UTC+3, Brian Day wrote:
>
> Actually, the error message is slightly different, see below:
>
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
>
> Image              PC                Routine            Line     Source
> cp2k.popt          000000000D730E14  Unknown            Unknown  Unknown
> libpthread-2.17.s  00002ABEBC47F5E0  Unknown            Unknown  Unknown
> libmpi.so.12       00002ABEBD7DA1BA  PMPI_File_write_a  Unknown  Unknown
> libmpifort.so.12.  00002ABEBCF2F1AE  pmpi_file_write_a  Unknown  Unknown
> cp2k.popt          0000000002F17BCC  message_passing_m  3315     message_passing.F
> cp2k.popt          0000000002A33AFC  realspace_grid_cu  698      realspace_grid_cube.F
> cp2k.popt          0000000002A31F4D  realspace_grid_cu  211      realspace_grid_cube.F
> cp2k.popt          0000000000A06F9D  cp_realspace_grid  64       cp_realspace_grid_cube.F
> cp2k.popt          0000000000A9F32B  qs_scf_post_gpw_m  2651     qs_scf_post_gpw.F
> cp2k.popt          0000000000A883B1  qs_scf_post_gpw_m  2001     qs_scf_post_gpw.F
> cp2k.popt          0000000000EB6610  qs_scf_post_scf_m  70       qs_scf_post_scf.F
> cp2k.popt          00000000017A267F  qs_scf_mp_scf_     285      qs_scf.F
> cp2k.popt          0000000000BA7709  qs_energy_mp_qs_e  86       qs_energy.F
> cp2k.popt          0000000000C52681  qs_force_mp_qs_ca  115      qs_force.F
> cp2k.popt          000000000096F4AA  force_env_methods  242      force_env_methods.F
> cp2k.popt          000000000043BCAC  cp2k_runs_mp_run_  323      cp2k_runs.F
> cp2k.popt          0000000000432814  MAIN__             281      cp2k.F
> cp2k.popt          000000000043151E  Unknown            Unknown  Unknown
> libc-2.17.so       00002ABEBE194C05  __libc_start_main  Unknown  Unknown
> cp2k.popt          0000000000431429  Unknown            Unknown  Unknown
>
> Thanks again for all your help so far!
>
> -Brian
>
> On Thursday, August 9, 2018 at 12:49:45 PM UTC-4, Brian Day wrote:
>>
>> Hi Nico,
>>
>> Sorry for the long-delayed reply; I had forgotten to check this thread
>> for some time!
>>
>> ifort --version returns: ifort (IFORT) 17.0.04 20170411.
>> Additionally, I get the same error message when I reduce the number of
>> MPI tasks to 4 (2 per node, 2 nodes).
>>
>> Best,
>> Brian
>>
>