cuda_tools in CP2K
Wei
wei.a... at googlemail.com
Tue Sep 6 21:59:27 UTC 2011
Dear all,
I am interested in the cuda_tools in CP2K. I have compiled a recent
CP2K (version 2.2.320) with CUDA 4.0, Intel compiler 12, the Intel MKL
bundled with the compiler, and Intel MPI, using an arch file modified
from Linux-x86-64-dbcsr-cuda.popt (see it at the end).
If I run with "./cp2k.popt test.inp", it works for about 100 atoms
(Sb, Te) or fewer, but it fails with "CUDA Error: out of memory" when
the system exceeds 120 atoms. Is this normal, given that each GPU has
6 GB of device memory?
So I wonder how I can run it in parallel. At the moment I cannot run it
with "mpirun -np 2 ./cp2k.popt test.inp", because it gives the
out-of-memory error at once.
CUDA Error: out of memory
ASSERTION FAILED: 1.EQ. 0
stack:
error in dev_mem_alloc_i at line 35 with error type -1
message: Could not allocate GPU device memory
6 error in dev_mem_alloc_i at line 35
5 called from dev_mem_alloc_any
4 called from init_card_c
3 called from dbcsr_multrec_init
2 called from dbcsr_mult_m_e_e
1 called from dbcsr_multiply_anytype
Where can I get more information about cuda_tools? Can this "popt"
version use resources across nodes, as in the normal (non-CUDA) case?
We have 2 GPUs (NVIDIA Quadro 6000, Fermi) and two 6-core CPUs on each
node. How can I get the best performance out of this setup? For
example, can I run a job on several nodes with several MPI processes
per node driving the two GPUs on each node, and if so, how?
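To make the question concrete, here is a minimal sketch of the kind of rank-to-GPU mapping I have in mind. It is purely illustrative: the function names and the round-robin rule are my own assumptions, not something taken from CP2K, and I do not know how cuda_tools actually selects a device. The second helper only estimates an upper bound on the device memory available per MPI process if several processes share one GPU evenly.

```python
# Hypothetical illustration only: round-robin mapping of MPI ranks to GPUs.
# Nothing here is taken from CP2K; the names and the rule are assumptions.

def gpu_for_rank(rank, ranks_per_node, gpus_per_node):
    """Return (node index, GPU index on that node) for a global MPI rank."""
    node = rank // ranks_per_node          # ranks are assumed to fill nodes in order
    local_rank = rank % ranks_per_node     # position of the rank within its node
    return node, local_rank % gpus_per_node


def memory_share_gb(device_memory_gb, ranks_per_node, gpus_per_node):
    """Upper bound on device memory per rank if ranks split a GPU evenly."""
    ranks_per_gpu = -(-ranks_per_node // gpus_per_node)  # ceiling division
    return device_memory_gb / ranks_per_gpu


if __name__ == "__main__":
    # Two ranks per node, two GPUs per node: each rank gets its own GPU.
    for rank in range(4):
        print(rank, gpu_for_rank(rank, ranks_per_node=2, gpus_per_node=2))
    # Twelve ranks per node sharing two 6 GB GPUs.
    print(memory_share_gb(6.0, ranks_per_node=12, gpus_per_node=2))
```

With one MPI process per core (12 per node) and this mapping, six processes would share each 6 GB card, so whether the out-of-memory error gets better or worse in parallel depends on how much device memory each process tries to allocate.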
Thanks a lot in advance!
NVCC = nvcc
NVFLAGS = $(DFLAGS) -g -arch sm_20
CC = mpiicc
CPP =
FC = mpiifort
LD = $(FC)
AR = ar -r
CPPFLAGS =
DFLAGS = -D__INTEL -D__FFTSG -D__parallel -D__SCALAPACK -D__BLACS -D__DBCSR_CUDA
INTEL_INC= /opt/intel/Compiler/12.0/4.191/rwthlnk/mkl/include
MKLPATH = /opt/intel/Compiler/12.0/4.191/rwthlnk/mkl/lib/intel64
FCFLAGS = $(DFLAGS) -I$(INTEL_INC) -O3 -msse2 -heap-arrays 64 -funroll-loops -fpp -free
LDFLAGS = $(FCFLAGS)
CUDAPATH = /usr/local_rwth/sw/cuda/4.0.17/lib64
LIBS = $(CUDAPATH)/libcudart.so $(CUDAPATH)/libcufft.so $(CUDAPATH)/libcublas.so \
       $(MKLPATH)/libmkl_scalapack_lp64.a $(MKLPATH)/libmkl_solver_lp64.a \
       -Wl,--start-group $(MKLPATH)/libmkl_intel_lp64.a $(MKLPATH)/libmkl_sequential.a \
       $(MKLPATH)/libmkl_core.a $(MKLPATH)/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread
OBJECTS_ARCHITECTURE = machine_intel.o
Best regards,
Wei
---------------------------------------------------------
Wei ZHANG
PhD student
Institute for Theoretical Solid State Physics
RWTH Aachen University, Germany