[CP2K:7596] terrible performance across infiniband

Cohen, Ronald rco... at carnegiescience.edu
Wed Mar 23 17:29:30 UTC 2016


So the problem is solved! I needed to rebuild OpenMPI, pointing configure at
the Torque installation directory:

> ./configure --prefix=/home/rcohen --with-tm=/opt/torque
> make clean
> make -j 8
> make install
>
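
To double-check that the rebuilt OpenMPI really picked up Torque support,
something like this should list the tm components (exact output varies by
OpenMPI version):

  # look for lines like "MCA plm: tm" and "MCA ras: tm"
  ompi_info | grep tm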


So I want to thank you so much! The 64-molecule H2O benchmark with 16 MPI
processes, 8 on each of two nodes with OMP_NUM_THREADS=2, went from 5052
seconds to 266 seconds with this simple fix! Now I will do further checking
and tuning.
Thank you!

Ron

---
Ron Cohen
reco... at gmail.com
skypename: ronaldcohen
twitter: @recohen3


On Wed, Mar 23, 2016 at 11:00 AM, Ronald Cohen <reco... at gmail.com> wrote:
> Dear Gilles,
>
> --with-tm alone (without a path) fails. I have now built with
> ./configure --prefix=/home/rcohen --with-tm=/opt/torque
> make clean
> make -j 8
> make install
>



---
Ronald Cohen
Geophysical Laboratory
Carnegie Institution
5251 Broad Branch Rd., N.W.
Washington, D.C. 20015
rco... at carnegiescience.edu
office: 202-478-8937
skype: ronaldcohen
https://twitter.com/recohen3
https://www.linkedin.com/profile/view?id=163327727

On Tue, Mar 22, 2016 at 4:00 PM, Glen MacLachlan <mac... at gwu.edu> wrote:

> Yes, that's correct.
>
> Best,
> Glen
>
> ==========================================
> Glen MacLachlan, PhD
> HPC Specialist for Physical Sciences &
> Professorial Lecturer, Data Sciences
>
> Office of Technology Services
> The George Washington University
> 725 21st Street
> Washington, DC 20052
> Suite 211, Corcoran Hall
>
> ==========================================
>
>
>
>
> On Tue, Mar 22, 2016 at 2:58 PM, Cohen, Ronald <rco... at carnegiescience.edu
> > wrote:
>
>> I did this:
>> ibstatus
>> Infiniband device 'mlx4_0' port 1 status:
>>         default gid:     fe80:0000:0000:0000:0002:c903:00ec:9301
>>         base lid:        0x1
>>         sm lid:          0x1
>>         state:           4: ACTIVE
>>         phys state:      5: LinkUp
>>         rate:            56 Gb/sec (4X FDR)
>>         link_layer:      InfiniBand
>>
>> So it seems it is 4X FDR and should get a peak of 56 Gb/sec!
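>>
>> If you want to sanity-check that number, a point-to-point test with the
>> perftest tools (assuming they are installed) measures raw IB bandwidth
>> between two nodes:
>>
>> # on the first node (server side):
>> ib_write_bw
>> # on the second node, pointing at the first (hostname is a placeholder):
>> ib_write_bw node1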
>>
>> Ron
>>
>>
>> ---
>> Ronald Cohen
>> Geophysical Laboratory
>> Carnegie Institution
>> 5251 Broad Branch Rd., N.W.
>> Washington, D.C. 20015
>> rco... at carnegiescience.edu
>> office: 202-478-8937
>> skype: ronaldcohen
>> https://twitter.com/recohen3
>> https://www.linkedin.com/profile/view?id=163327727
>>
>> On Tue, Mar 22, 2016 at 3:16 PM, Glen MacLachlan <mac... at gwu.edu> wrote:
>>
>>> You want to subscribe to the "user" list and post your messages there.
>>> I'll look for your messages on that board.
>>>
>>> Best,
>>> Glen
>>>
>>> ==========================================
>>> Glen MacLachlan, PhD
>>> HPC Specialist for Physical Sciences &
>>> Professorial Lecturer, Data Sciences
>>>
>>> Office of Technology Services
>>> The George Washington University
>>> 725 21st Street
>>> Washington, DC 20052
>>> Suite 211, Corcoran Hall
>>>
>>> ==========================================
>>>
>>>
>>>
>>>
>>> On Tue, Mar 22, 2016 at 2:15 PM, Cohen, Ronald <
>>> rco... at carnegiescience.edu> wrote:
>>>
>>>> Oh--thank you so much! I will write there.
>>>>
>>>> Ron
>>>>
>>>>
>>>> ---
>>>> Ronald Cohen
>>>> Geophysical Laboratory
>>>> Carnegie Institution
>>>> 5251 Broad Branch Rd., N.W.
>>>> Washington, D.C. 20015
>>>> rco... at carnegiescience.edu
>>>> office: 202-478-8937
>>>> skype: ronaldcohen
>>>> https://twitter.com/recohen3
>>>> https://www.linkedin.com/profile/view?id=163327727
>>>>
>>>> On Tue, Mar 22, 2016 at 3:13 PM, Glen MacLachlan <mac... at gwu.edu>
>>>> wrote:
>>>>
>>>>> No, no...don't misunderstand. I don't mind helping -- I want to figure
>>>>> this out too! Just saying we might want to take it over to the OpenMPI
>>>>> message boards. There you'll get hundreds of OpenMPI experts looking at
>>>>> your problem.
>>>>>
>>>>> https://www.open-mpi.org/faq/
>>>>> https://www.open-mpi.org/community/lists/ompi.php
>>>>>
>>>>>
>>>>>
>>>>> Best,
>>>>> Glen
>>>>>
>>>>> ==========================================
>>>>> Glen MacLachlan, PhD
>>>>> HPC Specialist for Physical Sciences &
>>>>> Professorial Lecturer, Data Sciences
>>>>>
>>>>> Office of Technology Services
>>>>> The George Washington University
>>>>> 725 21st Street
>>>>> Washington, DC 20052
>>>>> Suite 211, Corcoran Hall
>>>>>
>>>>> ==========================================
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Mar 22, 2016 at 2:09 PM, Cohen, Ronald <
>>>>> rco... at carnegiescience.edu> wrote:
>>>>>
>>>>>> Yes, thank you so much. Basically I am getting mud even with 2 nodes,
>>>>>> so using more nodes will not be any better. I understand it is off
>>>>>> topic, so I won't bother you. I have to get this working before I can
>>>>>> worry about cp2k performance!
>>>>>>
>>>>>> Ron
>>>>>>
>>>>>>
>>>>>> ---
>>>>>> Ronald Cohen
>>>>>> Geophysical Laboratory
>>>>>> Carnegie Institution
>>>>>> 5251 Broad Branch Rd., N.W.
>>>>>> Washington, D.C. 20015
>>>>>> rco... at carnegiescience.edu
>>>>>> office: 202-478-8937
>>>>>> skype: ronaldcohen
>>>>>> https://twitter.com/recohen3
>>>>>> https://www.linkedin.com/profile/view?id=163327727
>>>>>>
>>>>>> On Tue, Mar 22, 2016 at 3:07 PM, Glen MacLachlan <mac... at gwu.edu>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Ron,
>>>>>>>
>>>>>>> I think this is sort of off topic for the CP2K folks and more along
>>>>>>> the lines of OpenMPI but I'm happy to continue the discussion -- I'm afraid
>>>>>>> they might ask us to take it elsewhere though.
>>>>>>>
>>>>>>> So you want to do a couple of things:
>>>>>>>
>>>>>>>    1. Vary the number of tasks and look for scaling -- you need to do
>>>>>>>    this across multiple nodes to see what effect infiniband is having.
>>>>>>>    I assume you know how to ask your scheduler to distribute the tasks
>>>>>>>    across multiple nodes (see the sketch after this list).
>>>>>>>    2. Look for the throughput that you expect to be getting from your
>>>>>>>    infiniband fabric. Did you mention what infiniband you are running?
>>>>>>>    QDR? FDR? You can compare the NPB benchmarks for your IB and
>>>>>>>    ethernet networks. Do you know what your ethernet network
>>>>>>>    throughput is? GigE? 10gig? You may want to have a look at this
>>>>>>>    benchmark report that used NPB and NWChem, among others:
>>>>>>>    http://www.dell.com/Downloads/Global/Power/ps1q10-20100215-Mellanox.pdf
>>>>>>>
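>>>>>>> For example, with Torque/PBS a request along these lines spreads 16
>>>>>>> ranks over two 8-core nodes (binary and input names are placeholders):
>>>>>>>
>>>>>>> #PBS -l nodes=2:ppn=8
>>>>>>> cd $PBS_O_WORKDIR
>>>>>>> # when launched inside the job, mpirun picks the node list up
>>>>>>> # from the scheduler
>>>>>>> mpirun -np 16 ./cp2k.popt H2O-64.inp
>>>>>>>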
>>>>>>> Also, not having an admin handy or root access is not too bad of an
>>>>>>> impediment. You can stand up your own instance of openmpi without
>>>>>>> special privileges. Before you start chasing too many benchmarks
>>>>>>> (which can be difficult to resist) you may want to spin up your own
>>>>>>> OpenMPI instance and see if you can beat the ethernet performance.
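>>>>>>>
>>>>>>> A minimal sketch of such a user-space install (version and paths are
>>>>>>> just examples):
>>>>>>>
>>>>>>> # from an unpacked OpenMPI source tree:
>>>>>>> ./configure --prefix=$HOME/openmpi
>>>>>>> make -j 8 && make install
>>>>>>> # put the new install first in your environment:
>>>>>>> export PATH=$HOME/openmpi/bin:$PATH
>>>>>>> export LD_LIBRARY_PATH=$HOME/openmpi/lib:$LD_LIBRARY_PATH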
>>>>>>>
>>>>>>> By the way, when you type ifconfig do you see an interface that
>>>>>>> looks like ib0 or ib1 or something like that?
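>>>>>>>
>>>>>>> A quick way to check (old-style ifconfig prints interface names at
>>>>>>> column one):
>>>>>>>
>>>>>>> ifconfig -a | grep "^ib"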
>>>>>>>
>>>>>>> Best,
>>>>>>> Glen
>>>>>>>
>>>>>>> ==========================================
>>>>>>> Glen MacLachlan, PhD
>>>>>>> HPC Specialist for Physical Sciences &
>>>>>>> Professorial Lecturer, Data Sciences
>>>>>>>
>>>>>>> Office of Technology Services
>>>>>>> The George Washington University
>>>>>>> 725 21st Street
>>>>>>> Washington, DC 20052
>>>>>>> Suite 211, Corcoran Hall
>>>>>>>
>>>>>>> ==========================================
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Mar 22, 2016 at 1:05 PM, Cohen, Ronald <
>>>>>>> rco... at carnegiescience.edu> wrote:
>>>>>>>
>>>>>>>> Dear Glen,
>>>>>>>>
>>>>>>>> I made NPB. Which test do you recommend I run? I have run several
>>>>>>>> and it is not clear what to look for.
>>>>>>>>
>>>>>>>> Sincerely,
>>>>>>>>
>>>>>>>> Ron
>>>>>>>>
>>>>>>>>
>>>>>>>> ---
>>>>>>>> Ronald Cohen
>>>>>>>> Geophysical Laboratory
>>>>>>>> Carnegie Institution
>>>>>>>> 5251 Broad Branch Rd., N.W.
>>>>>>>> Washington, D.C. 20015
>>>>>>>> rco... at carnegiescience.edu
>>>>>>>> office: 202-478-8937
>>>>>>>> skype: ronaldcohen
>>>>>>>> https://twitter.com/recohen3
>>>>>>>> https://www.linkedin.com/profile/view?id=163327727
>>>>>>>>
>>>>>>>> On Tue, Mar 22, 2016 at 12:33 PM, Glen MacLachlan <mac... at gwu.edu>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Check with your admin to see what networks are available, but if
>>>>>>>>> you disable tcp using mpirun --mca btl ^tcp then you should be
>>>>>>>>> giving MPI no choice but to use IB. You can also increase the
>>>>>>>>> verbosity by adding --mca btl_openib_verbose 1.
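>>>>>>>>>
>>>>>>>>> Put together, a test run looks something like this (the binary and
>>>>>>>>> input names are placeholders for whatever you are running):
>>>>>>>>>
>>>>>>>>> # ^tcp excludes the TCP transport; the openib, sm, and self
>>>>>>>>> # components remain available
>>>>>>>>> mpirun --mca btl ^tcp --mca btl_openib_verbose 1 -np 16 ./cp2k.popt H2O-64.inp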
>>>>>>>>>
>>>>>>>>> Also, did you run ompi_info --all as Andreas suggested?
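>>>>>>>>>
>>>>>>>>> Grepping that output is usually enough to confirm the IB transport
>>>>>>>>> was compiled in, e.g.:
>>>>>>>>>
>>>>>>>>> # an "MCA btl: openib" line means OpenMPI has the IB byte-transfer layer
>>>>>>>>> ompi_info | grep btl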
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Glen
>>>>>>>>>
>>>>>>>>> ==========================================
>>>>>>>>> Glen MacLachlan, PhD
>>>>>>>>> HPC Specialist for Physical Sciences &
>>>>>>>>> Professorial Lecturer, Data Sciences
>>>>>>>>>
>>>>>>>>> Office of Technology Services
>>>>>>>>> The George Washington University
>>>>>>>>> 725 21st Street
>>>>>>>>> Washington, DC 20052
>>>>>>>>> Suite 211, Corcoran Hall
>>>>>>>>>
>>>>>>>>> ==========================================
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Mar 22, 2016 at 11:21 AM, Cohen, Ronald <
>>>>>>>>> rco... at carnegiescience.edu> wrote:
>>>>>>>>>
>>>>>>>>>> OK, I ran xhpl with those flags and got the same 1 GF performance
>>>>>>>>>> as without. So I guess my openmpi is not using ib. I wonder how to turn
>>>>>>>>>> that on! My config.log for the build seems to show that it found
>>>>>>>>>> infiniband. I attached it in case you have time to look. Thank you so much!
>>>>>>>>>>
>>>>>>>>>> Ron
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>> Ronald Cohen
>>>>>>>>>> Geophysical Laboratory
>>>>>>>>>> Carnegie Institution
>>>>>>>>>> 5251 Broad Branch Rd., N.W.
>>>>>>>>>> Washington, D.C. 20015
>>>>>>>>>> rco... at carnegiescience.edu
>>>>>>>>>> office: 202-478-8937
>>>>>>>>>> skype: ronaldcohen
>>>>>>>>>> https://twitter.com/recohen3
>>>>>>>>>> https://www.linkedin.com/profile/view?id=163327727
>>>>>>>>>>
>>>>>>>>>> On Tue, Mar 22, 2016 at 12:12 PM, Glen MacLachlan <
>>>>>>>>>> mac... at gwu.edu> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yeah, the ^ tells OpenMPI to exclude the components listed after
>>>>>>>>>>> it -- think of it as a negation.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Glen
>>>>>>>>>>>
>>>>>>>>>>> ==========================================
>>>>>>>>>>> Glen MacLachlan, PhD
>>>>>>>>>>> HPC Specialist for Physical Sciences &
>>>>>>>>>>> Professorial Lecturer, Data Sciences
>>>>>>>>>>>
>>>>>>>>>>> Office of Technology Services
>>>>>>>>>>> The George Washington University
>>>>>>>>>>> 725 21st Street
>>>>>>>>>>> Washington, DC 20052
>>>>>>>>>>> Suite 211, Corcoran Hall
>>>>>>>>>>>
>>>>>>>>>>> ==========================================
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Mar 22, 2016 at 11:04 AM, Cohen, Ronald <
>>>>>>>>>>> rco... at carnegiescience.edu> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thank you so much. It is a bit difficult because I did not set
>>>>>>>>>>>> up this machine and do not have root access, but I know it is a
>>>>>>>>>>>> mess. I backed up to just try the HPL benchmark.
>>>>>>>>>>>> I am finding 100 GFLOPS single-node performance with N=2000 and
>>>>>>>>>>>> 16 cores, but 1.5 GFLOPS using two nodes, 8 cores per node. So
>>>>>>>>>>>> there is definitely something really wrong. I need to get this
>>>>>>>>>>>> working before I can worry about threads or cp2k.
>>>>>>>>>>>> Was that a caret in your command above:
>>>>>>>>>>>>
>>>>>>>>>>>> mpirun --mca btl ^tcp
>>>>>>>>>>>>
>>>>>>>>>>>> ?
>>>>>>>>>>>>
>>>>>>>>>>>> I looked through my openmpi build and it seems to have found
>>>>>>>>>>>> the infiniband includes, such as they exist on the machine, but
>>>>>>>>>>>> I could not find the expected mxm or Mellanox drivers anywhere
>>>>>>>>>>>> on the machine.
>>>>>>>>>>>>
>>>>>>>>>>>> I am CCing Peter Fox, the person who volunteers his time for
>>>>>>>>>>>> this machine, and who has root access!
>>>>>>>>>>>>
>>>>>>>>>>>> Sincerely,
>>>>>>>>>>>>
>>>>>>>>>>>> Ron
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> ---
>>>>>>>>>>>> Ronald Cohen
>>>>>>>>>>>> Geophysical Laboratory
>>>>>>>>>>>> Carnegie Institution
>>>>>>>>>>>> 5251 Broad Branch Rd., N.W.
>>>>>>>>>>>> Washington, D.C. 20015
>>>>>>>>>>>> rco... at carnegiescience.edu
>>>>>>>>>>>> office: 202-478-8937
>>>>>>>>>>>> skype: ronaldcohen
>>>>>>>>>>>> https://twitter.com/recohen3
>>>>>>>>>>>> https://www.linkedin.com/profile/view?id=163327727
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Mar 22, 2016 at 10:32 AM, Glen MacLachlan <
>>>>>>>>>>>> mac... at gwu.edu> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Ron,
>>>>>>>>>>>>>
>>>>>>>>>>>>> There's a chance that OpenMPI wasn't configured to use IB
>>>>>>>>>>>>> properly. Why don't you disable tcp and see if you are using IB? It's easy:
>>>>>>>>>>>>>
>>>>>>>>>>>>> mpirun --mca btl ^tcp ...
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regarding OpenMP:
>>>>>>>>>>>>> I'm not sure we're converging on the same discussion anymore,
>>>>>>>>>>>>> but setting OMP_NUM_THREADS=1 does *not* disable
>>>>>>>>>>>>> multithreading overhead -- you need to compile without -fopenmp
>>>>>>>>>>>>> to get a measure of true single-thread performance.
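>>>>>>>>>>>>>
>>>>>>>>>>>>> With CP2K that means building the pure-MPI "popt" target rather
>>>>>>>>>>>>> than an OpenMP-enabled one (the arch file name here is a
>>>>>>>>>>>>> placeholder):
>>>>>>>>>>>>>
>>>>>>>>>>>>> # run from the cp2k/makefiles directory:
>>>>>>>>>>>>> make -j 8 ARCH=Linux-x86-64-gfortran VERSION=popt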
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Glen
>>>>>>>>>>>>>
>>>>>>>>>>>>> ==========================================
>>>>>>>>>>>>> Glen MacLachlan, PhD
>>>>>>>>>>>>> HPC Specialist for Physical Sciences &
>>>>>>>>>>>>> Professorial Lecturer, Data Sciences
>>>>>>>>>>>>>
>>>>>>>>>>>>> Office of Technology Services
>>>>>>>>>>>>> The George Washington University
>>>>>>>>>>>>> 725 21st Street
>>>>>>>>>>>>> Washington, DC 20052
>>>>>>>>>>>>> Suite 211, Corcoran Hall
>>>>>>>>>>>>>
>>>>>>>>>>>>> ==========================================
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Mar 21, 2016 at 5:05 PM, Ronald Cohen <
>>>>>>>>>>>>> rco... at carnegiescience.edu> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> According to my experience in general, and the cp2k web pages
>>>>>>>>>>>>>> in particular, that is not the case. Please see the
>>>>>>>>>>>>>> performance page for cp2k. The problem, I am sure now, is with
>>>>>>>>>>>>>> the openmpi build not using the proper infiniband libraries or
>>>>>>>>>>>>>> drivers.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ron
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sent from my iPad
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mar 21, 2016, at 5:36 PM, Glen MacLachlan <mac... at gwu.edu>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It's hard to talk about performance when you set
>>>>>>>>>>>>>> OMP_NUM_THREADS=1 because there is so much overhead associated
>>>>>>>>>>>>>> with OpenMP that launching 1 thread is almost always a
>>>>>>>>>>>>>> performance killer. In fact, OMP_NUM_THREADS=1 never rivals a
>>>>>>>>>>>>>> true single-threaded build because of that overhead. No one
>>>>>>>>>>>>>> ever sets OMP_NUM_THREADS=1 unless they are playing around --
>>>>>>>>>>>>>> we never do that in production jobs. How about when you scale
>>>>>>>>>>>>>> up to 4 or 8 threads?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Glen
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> P.S. I see you're in DC...so am I. I support CP2K for the
>>>>>>>>>>>>>> chemists at GWU. Hope you aren't using Metro to get around the DMV :p
>>>>>>>>>>>>>> On Mar 21, 2016 5:11 PM, "Cohen, Ronald" <
>>>>>>>>>>>>>> rco... at carnegiescience.edu> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes I am using hybrid mode. But even if I set
>>>>>>>>>>>>>>> OMP_NUM_THREADS=1 performance is terrible.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>> Ronald Cohen
>>>>>>>>>>>>>>> Geophysical Laboratory
>>>>>>>>>>>>>>> Carnegie Institution
>>>>>>>>>>>>>>> 5251 Broad Branch Rd., N.W.
>>>>>>>>>>>>>>> Washington, D.C. 20015
>>>>>>>>>>>>>>> rco... at carnegiescience.edu
>>>>>>>>>>>>>>> office: 202-478-8937
>>>>>>>>>>>>>>> skype: ronaldcohen
>>>>>>>>>>>>>>> https://twitter.com/recohen3
>>>>>>>>>>>>>>> https://www.linkedin.com/profile/view?id=163327727
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Mar 21, 2016 at 5:04 PM, Glen MacLachlan <
>>>>>>>>>>>>>>> mac... at gwu.edu> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Are you conflating MPI with OpenMP? OMP_NUM_THREADS sets the
>>>>>>>>>>>>>>>> number of threads used by OpenMP, and OpenMP doesn't work in
>>>>>>>>>>>>>>>> a distributed-memory environment unless you piggyback on
>>>>>>>>>>>>>>>> MPI, which would be hybrid use. I'm not sure CP2K ever
>>>>>>>>>>>>>>>> worked optimally in hybrid mode, or at least that's what
>>>>>>>>>>>>>>>> I've gathered from reading the comments in the source code.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> As for MPI, are you sure your MPI stack was compiled with
>>>>>>>>>>>>>>>> IB bindings? I had similar issues and the problem was that I wasn't
>>>>>>>>>>>>>>>> actually using IB. If you can, disable eth and leave only IB and see what
>>>>>>>>>>>>>>>> happens.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Glen
>>>>>>>>>>>>>>>> On Mar 21, 2016 4:48 PM, "Ronald Cohen" <
>>>>>>>>>>>>>>>> rco... at carnegiescience.edu> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On the dco machine deepcarbon I find decent single-node
>>>>>>>>>>>>>>>>> MPI performance, but running on the same number of
>>>>>>>>>>>>>>>>> processors across two nodes is terrible, even with the
>>>>>>>>>>>>>>>>> infiniband interconnect. This is the cp2k H2O-64 benchmark:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 16 cores on 1 node: total time 530 seconds
>>>>>>>>>>>>>>>>> SUBROUTINE    CALLS  ASD  SELF TIME (AVG, MAX)  TOTAL TIME (AVG, MAX)
>>>>>>>>>>>>>>>>> CP2K              1  1.0      0.015      0.019     530.306    530.306
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -------------------------------------------------------------------------------
>>>>>>>>>>>>>>>>> -                         MESSAGE PASSING PERFORMANCE                         -
>>>>>>>>>>>>>>>>> -------------------------------------------------------------------------------
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ROUTINE          CALLS  TOT TIME [s]  AVE VOLUME [Bytes]  PERFORMANCE [MB/s]
>>>>>>>>>>>>>>>>> MP_Group             5         0.000
>>>>>>>>>>>>>>>>> MP_Bcast          4103         0.029              44140.             6191.05
>>>>>>>>>>>>>>>>> MP_Allreduce     21860         7.077                263.                0.81
>>>>>>>>>>>>>>>>> MP_Gather           62         0.008                320.                2.53
>>>>>>>>>>>>>>>>> MP_Sync             54         0.001
>>>>>>>>>>>>>>>>> MP_Alltoall      19407        26.839             648289.              468.77
>>>>>>>>>>>>>>>>> MP_ISendRecv     21600         0.091              94533.            22371.25
>>>>>>>>>>>>>>>>> MP_Wait         238786        50.545
>>>>>>>>>>>>>>>>> MP_comm_split       50         0.004
>>>>>>>>>>>>>>>>> MP_ISend         97572         0.741             239205.            31518.68
>>>>>>>>>>>>>>>>> MP_IRecv         97572         8.605             239170.             2711.98
>>>>>>>>>>>>>>>>> MP_Memory       167778        45.018
>>>>>>>>>>>>>>>>> -------------------------------------------------------------------------------
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 16 cores on 2 nodes: total time 5053 seconds!!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> SUBROUTINE    CALLS  ASD  SELF TIME (AVG, MAX)  TOTAL TIME (AVG, MAX)
>>>>>>>>>>>>>>>>> CP2K              1  1.0      0.311      0.363    5052.904   5052.909
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -------------------------------------------------------------------------------
>>>>>>>>>>>>>>>>> -                         MESSAGE PASSING PERFORMANCE                         -
>>>>>>>>>>>>>>>>> -------------------------------------------------------------------------------
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ROUTINE          CALLS  TOT TIME [s]  AVE VOLUME [Bytes]  PERFORMANCE [MB/s]
>>>>>>>>>>>>>>>>> MP_Group             5         0.000
>>>>>>>>>>>>>>>>> MP_Bcast          4119         0.258              43968.              700.70
>>>>>>>>>>>>>>>>> MP_Allreduce     21892      1546.186               263.                0.00
>>>>>>>>>>>>>>>>> MP_Gather           62         0.049                320.                0.40
>>>>>>>>>>>>>>>>> MP_Sync             54         0.071
>>>>>>>>>>>>>>>>> MP_Alltoall      19407      1507.024             648289.                8.35
>>>>>>>>>>>>>>>>> MP_ISendRecv     21600         0.104              94533.            19656.44
>>>>>>>>>>>>>>>>> MP_Wait         238786       513.507
>>>>>>>>>>>>>>>>> MP_comm_split       50         4.096
>>>>>>>>>>>>>>>>> MP_ISend         97572         1.102             239206.            21176.09
>>>>>>>>>>>>>>>>> MP_IRecv         97572         2.739             239171.             8520.75
>>>>>>>>>>>>>>>>> MP_Memory       167778        18.845
>>>>>>>>>>>>>>>>> -------------------------------------------------------------------------------
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Any ideas? The code was built with the latest gfortran and
>>>>>>>>>>>>>>>>> I built all of the dependencies, using this arch file.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> CC         = gcc
>>>>>>>>>>>>>>>>> CPP        =
>>>>>>>>>>>>>>>>> FC         = mpif90
>>>>>>>>>>>>>>>>> LD         = mpif90
>>>>>>>>>>>>>>>>> AR         = ar -r
>>>>>>>>>>>>>>>>> PREFIX     = /home/rcohen
>>>>>>>>>>>>>>>>> FFTW_INC   = $(PREFIX)/include
>>>>>>>>>>>>>>>>> FFTW_LIB   = $(PREFIX)/lib
>>>>>>>>>>>>>>>>> LIBINT_INC = $(PREFIX)/include
>>>>>>>>>>>>>>>>> LIBINT_LIB = $(PREFIX)/lib
>>>>>>>>>>>>>>>>> LIBXC_INC  = $(PREFIX)/include
>>>>>>>>>>>>>>>>> LIBXC_LIB  = $(PREFIX)/lib
>>>>>>>>>>>>>>>>> GCC_LIB    = $(PREFIX)/gcc-trunk/lib
>>>>>>>>>>>>>>>>> GCC_LIB64  = $(PREFIX)/gcc-trunk/lib64
>>>>>>>>>>>>>>>>> GCC_INC    = $(PREFIX)/gcc-trunk/include
>>>>>>>>>>>>>>>>> DFLAGS     = -D__FFTW3 -D__LIBINT -D__LIBXC2 \
>>>>>>>>>>>>>>>>>              -D__LIBINT_MAX_AM=7 -D__LIBDERIV_MAX_AM1=6 -D__MAX_CONTR=4 \
>>>>>>>>>>>>>>>>>              -D__parallel -D__SCALAPACK -D__HAS_smm_dnn -D__ELPA3
>>>>>>>>>>>>>>>>> CPPFLAGS   =
>>>>>>>>>>>>>>>>> FCFLAGS    = $(DFLAGS) -O2 -ffast-math -ffree-form -ffree-line-length-none \
>>>>>>>>>>>>>>>>>              -fopenmp -ftree-vectorize -funroll-loops -mtune=native \
>>>>>>>>>>>>>>>>>              -I$(FFTW_INC) -I$(LIBINT_INC) -I$(LIBXC_INC) -I$(MKLROOT)/include \
>>>>>>>>>>>>>>>>>              -I$(GCC_INC) -I$(PREFIX)/include/elpa_openmp-2015.11.001/modules
>>>>>>>>>>>>>>>>> LIBS       = $(PREFIX)/lib/libscalapack.a \
>>>>>>>>>>>>>>>>>              $(PREFIX)/lib/libsmm_dnn_sandybridge-2015-11-10.a \
>>>>>>>>>>>>>>>>>              $(FFTW_LIB)/libfftw3.a \
>>>>>>>>>>>>>>>>>              $(FFTW_LIB)/libfftw3_threads.a \
>>>>>>>>>>>>>>>>>              $(LIBXC_LIB)/libxcf90.a \
>>>>>>>>>>>>>>>>>              $(LIBXC_LIB)/libxc.a \
>>>>>>>>>>>>>>>>>              $(PREFIX)/lib/liblapack.a $(PREFIX)/lib/libtmglib.a \
>>>>>>>>>>>>>>>>>              $(PREFIX)/lib/libgomp.a \
>>>>>>>>>>>>>>>>>              $(PREFIX)/lib/libderiv.a $(PREFIX)/lib/libint.a \
>>>>>>>>>>>>>>>>>              -lelpa_openmp -lgomp -lopenblas
>>>>>>>>>>>>>>>>> LDFLAGS    = $(FCFLAGS) -L$(GCC_LIB64) -L$(GCC_LIB) -static-libgfortran -L$(PREFIX)/lib
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It was run with OMP_NUM_THREADS=2 on the two nodes and
>>>>>>>>>>>>>>>>> OMP_NUM_THREADS=1 on the one node. I am now checking whether
>>>>>>>>>>>>>>>>> OMP_NUM_THREADS=1 on two nodes is faster than
>>>>>>>>>>>>>>>>> OMP_NUM_THREADS=2, but I do not think so.
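>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> For reference, the two-node hybrid run is launched along
>>>>>>>>>>>>>>>>> these lines (the binary name is a placeholder):
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> export OMP_NUM_THREADS=2
>>>>>>>>>>>>>>>>> # 16 MPI ranks total, 8 per node:
>>>>>>>>>>>>>>>>> mpirun -np 16 -npernode 8 ./cp2k.psmp H2O-64.inp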
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Ron Cohen
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>