[CP2K:10115] Re: cp2k hangs with large system

Alfio Lazzaro alfio.... at gmail.com
Wed Mar 21 22:10:48 UTC 2018


Dear Fernan,
Thanks for the new files.
OK, let me summarize:
1) The GNU+OpenMPI version of the code is terribly slow
2) The Intel (MPI + MKL) version hangs

Now, let's start with the first problem. I took your input file and 
ran it on my system with 24 MPI ranks and 1 OpenMP thread. I'm using GNU 
5.3 + OpenMPI without any special optimizations/libraries. The first thing 
I noticed is that the initial values differ from those in your log.out_openmpi:

Yours:
 TOTAL NUMBERS AND MAXIMUM NUMBERS

  Total number of            - Atomic kinds:                                   4
                             - Atoms:                                        950
                             - Shell sets:                                  1900
                             - Shells:                                      4094
                             - Primitive Cartesian functions:               4750
                             - Cartesian basis functions:                  10348
                             - Spherical basis functions:                   9726

  Maximum angular momentum of- Orbital basis functions:                        2
                             - Local part of the GTH pseudopotential:          2
                             - Non-local part of the GTH pseudopotential:      2

Mine:
 TOTAL NUMBERS AND MAXIMUM NUMBERS

  Total number of            - Atomic kinds:                                   3
                             - Atoms:                                        938
                             - Shell sets:                                  1876
                             - Shells:                                      3434
                             - Primitive Cartesian functions:               4690
                             - Cartesian basis functions:                   7480
                             - Spherical basis functions:                   7170

  Maximum angular momentum of- Orbital basis functions:                        2
                             - Local part of the GTH pseudopotential:          2
                             - Non-local part of the GTH pseudopotential:      0

Checking your log.out, I see that you are using CP2K 4.1 and the 
output looks different, but some of its values match mine, for 
instance:


 Number of electrons:                                                       2168
 Number of occupied orbitals:                                               1084
 Number of molecular orbitals:                                              1084

 Number of orbital functions:                                               7170
 Number of independent orbital functions:                                   7170

These values strongly affect performance. In your GNU+OpenMPI log they 
are larger (for example, 9726 spherical basis functions versus 7170 in my 
run), so you should expect slower performance there.
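
To make this kind of comparison less error-prone, here is a minimal Python sketch that extracts every "label: integer" line from two logs and reports the labels whose values differ. It is not part of CP2K; the function names (parse_counts, diff_counts) are hypothetical, and the two sample strings are just excerpts of the numbers quoted above. It assumes each value sits on the same line as its label, as in the original log files.

```python
import re

# Matches "<label>:   <integer>" at the end of a line.
_VALUE_LINE = re.compile(r"^(.*):\s+(\d+)\s*$")


def parse_counts(log_text):
    """Return {label: int} for every 'label: integer' line in log_text."""
    counts = {}
    for line in log_text.splitlines():
        match = _VALUE_LINE.match(line)
        if match is None:
            continue
        label = match.group(1)
        # Drop a group prefix such as "Total number of            - ".
        if "- " in label:
            label = label.split("- ", 1)[1]
        counts[label.strip()] = int(match.group(2))
    return counts


def diff_counts(text_a, text_b):
    """Return {label: (value_a, value_b)} for labels present in both logs
    whose values differ."""
    a, b = parse_counts(text_a), parse_counts(text_b)
    return {k: (a[k], b[k]) for k in a if k in b and a[k] != b[k]}


# Excerpts of the two blocks quoted in this message (not complete logs).
YOURS = """\
  Total number of            - Atomic kinds:                  4
                             - Atoms:                       950
                             - Spherical basis functions:  9726
  Maximum angular momentum of- Orbital basis functions:       2
                             - Non-local part of the GTH pseudopotential: 2
"""

MINE = """\
  Total number of            - Atomic kinds:                  3
                             - Atoms:                       938
                             - Spherical basis functions:  7170
  Maximum angular momentum of- Orbital basis functions:       2
                             - Non-local part of the GTH pseudopotential: 0
"""

if __name__ == "__main__":
    for label, (yours, mine) in diff_counts(YOURS, MINE).items():
        print(f"{label}: {yours} vs {mine}")
```

In practice you would read the two log files from disk and pass their contents to diff_counts instead of the sample strings. Any label that shows up in the result means the two runs did not describe the same system or basis set.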

At this point, I strongly suggest running the regtests to check your 
installation. Make sure 
Then I suggest running a smaller test (for example, 
tests/QS/benchmark/H2O-32.inp) with a single rank, so that you can 
do a quick comparison without MPI. If the result is reasonable, you can then 
move to more ranks. 
Another suggestion is to check how many cores are actually in use during 
the execution (you can use htop).
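
As a complement to htop, the following Python sketch lists the cores that each running CP2K process is allowed to use. It is only an illustration under some assumptions: it is Linux-only (it reads /proc/<pid>/status and /proc/<pid>/comm), it assumes the executable name starts with "cp2k", and the helper names are hypothetical. It shows the allowed cores (the affinity), not the actual load, so htop is still needed for that.

```python
import os


def parse_cpu_list(text):
    """Turn a kernel CPU list such as '0-3,8' into a set of core ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def allowed_cpus(pid):
    """Read the cores a process may run on from /proc/<pid>/status."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("Cpus_allowed_list:"):
                return parse_cpu_list(line.split(":", 1)[1])
    return set()


def cp2k_pids():
    """PIDs whose command name starts with 'cp2k' (assumed naming)."""
    if not os.path.isdir("/proc"):
        return []  # not a Linux-style /proc filesystem
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/comm") as comm:
                if comm.read().startswith("cp2k"):
                    pids.append(int(entry))
        except OSError:
            continue  # the process exited while we were looking
    return pids


if __name__ == "__main__":
    for pid in cp2k_pids():
        print(pid, sorted(allowed_cpus(pid)))
```

If every rank reports the same single core, the ranks are probably all bound to that core and are competing for it, which would be one possible explanation for a slow run that does not improve with more ranks.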

Alfio




On Wednesday, 21 March 2018 at 14:22:54 UTC+1, Fernan Saiz wrote:
>
> Dear Alfio,
> 2) The log.out file was written using the Intel Compilers (MPI and MKL). 
> Please find attached a new file, log.out_openmpi, from the build with 
> libxsmm, showing times that are ridiculously long. When the cp2k 
> version compiled with Intel runs fine, these times are between 6 and 10 
> seconds, whereas with openmpi they are between 28 and 76 seconds, even if I 
> increase the number of cores.
>
> I also attached my arch file used for openmpi.
>
> Best regards,
>
> - Fernan
>
> On Wed, Mar 21, 2018 at 10:57 AM, Alfio Lazzaro <alfi... at gmail.com> wrote:
>
>> Dear Fernan,
>> we are working on making a list of supported compilers for CP2K (see 
>> https://www.cp2k.org/dev:compiler_support ).
>> Now, I'm confused about your findings:
>>
>> 1) "CRAY compilers on the UK's ARCHER supercomputer" ==> Do they really 
>> compile CP2K with the Cray CCE compiler? This is somewhat surprising to me 
>> since we found that CCE is broken...
>>
>> 2) "I have compiled cp2k with gcc and openmpi modules, but my code is 
>> much slower than that built with Intel Compilers" ==> This is strange too. 
>> I see that you are not using libxsmm, i.e.
>>
>> DBCSR| Multiplication driver                                               BLAS
>>
>> Libxsmm will give much better performance. Where is BLAS coming from? 
>> Is it MKL for both the GCC and Intel builds? Could you elaborate a bit more 
>> on this comparison?
>>
>> 3) Before any production run, it would be a good idea to test the CP2K 
>> installation. Have you executed the regtests? (
>> https://www.cp2k.org/dev:regtesting )
>>
>> Best regards,
>>
>> Alfio
>>
>>
>> On Wednesday, 21 March 2018 at 03:31:29 UTC+1, Wei Lai wrote:
>>>
>>> Were you using Intel compilers with OpenMPI or MVAPICH?  I had the same 
>>> issue.  Switching to Intel MPI solved the problem.
>>>
>>> Wei
>>>
>>> On Monday, March 19, 2018 at 2:35:35 PM UTC-4, Fernan Saiz wrote:
>>>>
>>>> Dear All,
>>>> I have been experiencing a problem with cp2k for large systems of 
>>>> around 1,000 atoms, in which the code prints no data at the beginning of 
>>>> the SCF cycle, as shown in the attached file log.out. I am using a cp2k 
>>>> version compiled with Intel compilers 2017 on my institution's HPC systems. 
>>>> It is strange to me that cp2k sometimes runs fine with a different set of 
>>>> nuclei positions while the rest of the parameters are left untouched. 
>>>> However, I need to restart my run (see ape-water.inp) to continue with the 
>>>> NVT simulation for several ps. It also strikes me that I have never faced 
>>>> this problem when using a version built with CRAY compilers on the UK's 
>>>> ARCHER supercomputer. As an alternative, I have compiled cp2k with gcc and 
>>>> openmpi modules (I received your help in this forum a few days ago for this 
>>>> compilation), but my code is much slower than that built with Intel 
>>>> Compilers, which makes it unsuitable for my purposes. I have also played 
>>>> with the OpenMP vs MPI load in the PBS file, but had no luck. Therefore, 
>>>> I would really appreciate some advice on how to correct this problem, if 
>>>> possible.
>>>>
>>>> Best regards,
>>>>  - Fernan Saiz, PhD
>>>> Department of Chemistry
>>>> Imperial College London
>>>>