Multi-threading not actually working on another machine

William Tao ywta... at gmail.com
Mon Mar 6 16:36:03 UTC 2017


Thank you, Alfio, for your suggestion. I will try it the way you advised.

On Saturday, February 25, 2017 at 12:26:40 AM UTC-6, Alfio Lazzaro wrote:
>
> The CP2K output is fine.
> If you rely on the top command to see how many threads are running,
> the number is somewhat underestimated, because top shows the average CPU
> utilization. You can use the options in the second answer of this post
> http://stackoverflow.com/questions/15933801/openmp-creates-many-threads-but-seems-to-use-only-one-core
> to list all the running threads individually.
> Another test is to set a smaller number of threads (for example
> OMP_NUM_THREADS=4) and see how it goes.
> In any case, as I mentioned in my previous email, for such a large number of
> cores you should consider running the PSMP version, i.e. MPI+OpenMP (your 40
> cores are spread over different NUMA domains, so you should set at least one
> MPI rank per NUMA domain).
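
A minimal sketch of one way to get the per-thread view mentioned above, assuming Linux with /proc mounted and the procps versions of ps, pgrep and top installed. The function names count_threads and show_threads are made up for illustration and are not part of CP2K. The commented-out pgrep lookup is only an example.

```shell
#!/bin/sh
# Count the threads of a process by listing /proc/<pid>/task (Linux only).
count_threads() {
    ls "/proc/$1/task" | wc -l | tr -d ' '
}

# Print one line per thread: thread ID (lwp), the CPU it last ran on (psr),
# its CPU usage (pcpu) and the command name. Requires procps "ps -L".
show_threads() {
    ps -L -o lwp,psr,pcpu,comm -p "$1"
}

# Example (assumes a cp2k.ssmp process is already running):
#   pid=$(pgrep -o cp2k.ssmp)
#   count_threads "$pid"
#   show_threads "$pid"
# Interactive alternative with one row per thread:
#   top -H -p "$pid"
```

Note that the pcpu column of ps is an average over the lifetime of the thread, so top -H is the better choice for watching the load change over time.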
>
> Alfio
>
> On Friday, 24 February 2017 at 17:06:49 UTC+1, William Tao wrote:
>>
>> Dear Alfio,
>>
>> I did set OMP_NUM_THREADS to 40 for my calculation. Here is the
>> beginning of my output:
>>
>>  GLOBAL| Force Environment number                                              1
>>  GLOBAL| Basis set file name                                           HFX_BASIS
>>  GLOBAL| Potential file name                                      GTH_POTENTIALS
>>  GLOBAL| MM Potential file name                                     MM_POTENTIAL
>>  GLOBAL| Coordinate file name                                         atm309.xyz
>>  GLOBAL| Method name                                                        CP2K
>>  GLOBAL| Project name                                                   ATOM-309
>>  GLOBAL| Preferred FFT library                                             FFTW3
>>  GLOBAL| Preferred diagonalization lib.                                       SL
>>  GLOBAL| Run type                                                             MD
>>  GLOBAL| All-to-all communication in single precision                          F
>>  GLOBAL| FFTs using library dependent lengths                                  F
>>  GLOBAL| Global print level                                                  LOW
>>  GLOBAL| Total number of message passing processes                             1
>>  GLOBAL| Number of threads for this process                                   40
>>  GLOBAL| This output is from process                                           0
>>  GLOBAL| CPU model name :  Intel(R) Xeon(R) CPU E7-8891 v3 @ 2.80GHz
>>
>>
>> However, at run time it only uses up to 8 threads.
>>
>> Thank you.
>>
>> William
>>
>>
>>
>> On Friday, February 24, 2017 at 1:47:37 AM UTC-6, Alfio Lazzaro wrote:
>>>
>>> Dear William,
>>> CP2K uses OpenMP for threading. The number of threads is set at runtime
>>> through the environment variable OMP_NUM_THREADS, so it has no relation
>>> to the compilation. By default, if you don't set the variable, OpenMP
>>> uses the maximum number of threads available on the system (8 and 40 on
>>> your two machines). That's why you see 40 threads. However, you should
>>> consider that it is very hard to get good scalability with such a large
>>> number of threads (it actually depends on your workload). That's why, on
>>> average (which is what you see from the top command), you are using the
>>> equivalent of 8 fully loaded threads.
>>>
>>> I suggest experimenting with different numbers of threads.
>>> Just use:
>>>
>>> export OMP_NUM_THREADS=<any number between 1 and 40>
>>>
>>> before running CP2K.
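
A rough sketch of how such an experiment could be scripted, assuming a POSIX shell. The helper scan_threads, the log names scan_<n>.log and the input file input.inp are hypothetical, chosen only for illustration. The timing is wall-clock seconds from date, so it is only meaningful for runs that last at least several seconds.

```shell
#!/bin/sh
# Run a command once per thread count and report the wall-clock time of each run.
# Usage: scan_threads "<space-separated thread counts>" <command> [args...]
scan_threads() {
    counts=$1
    shift
    for n in $counts; do
        start=$(date +%s)
        # OMP_NUM_THREADS is set for this one run only; the command's
        # output goes to scan_<n>.log in the current directory.
        OMP_NUM_THREADS=$n "$@" > "scan_${n}.log" 2>&1
        end=$(date +%s)
        echo "threads=$n seconds=$((end - start))"
    done
}

# Example (input.inp is a placeholder for your own input file):
#   scan_threads "1 2 4 8 10 20 40" cp2k.ssmp -i input.inp
```

Comparing the reported times shows where adding threads stops paying off for a given workload.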
>>>
>>> A better solution would be to use MPI together with OpenMP (the psmp
>>> version), with, for example, 4 MPI ranks and 10 threads each. It would
>>> likely give you better performance.
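
A small sketch of the arithmetic behind the 4 ranks x 10 threads example, assuming a POSIX shell. The helpers threads_per_rank and hybrid_command and the input file input.inp are hypothetical. The launch line is only printed, not executed, because the launcher name and its options (mpirun -np here) depend on the MPI installation, as does whether the environment variable reaches every rank.

```shell
#!/bin/sh
# Threads per MPI rank such that ranks x threads equals the core count.
# Fails if the cores cannot be divided evenly among the ranks.
threads_per_rank() {
    cores=$1
    ranks=$2
    if [ "$ranks" -le 0 ] || [ $((cores % ranks)) -ne 0 ]; then
        echo "cannot split $cores cores evenly over $ranks ranks" >&2
        return 1
    fi
    echo $((cores / ranks))
}

# Print (do not run) a hybrid MPI+OpenMP launch line for cp2k.psmp.
# input.inp is a placeholder; mpirun -np is an assumption about the launcher.
hybrid_command() {
    threads=$(threads_per_rank "$1" "$2") || return 1
    echo "OMP_NUM_THREADS=$threads mpirun -np $2 cp2k.psmp -i input.inp"
}

# Example: 40 cores split over 4 ranks.
#   hybrid_command 40 4
# The number of NUMA domains can be checked with:
#   lscpu | grep -i numa
```

Choosing the number of ranks as a multiple of the number of NUMA domains keeps each rank's threads within one domain, provided the launcher pins the ranks accordingly.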
>>>
>>> Alfio
>>>
>>>
>>>
>>> On Thursday, 23 February 2017 at 22:48:01 UTC+1, William Tao wrote:
>>>>
>>>> Dear friends,
>>>>
>>>> I compiled CP2K version 4.1 on a machine with an 8-core CPU.
>>>> When I run the binary cp2k.ssmp on another machine with a 40-core CPU,
>>>> the output reports that the number of threads is 40.
>>>> However, when I check with the "top" command, the process uses at most
>>>> 800% CPU.
>>>>
>>>> Does anybody know what is going on?
>>>>
>>>>
>>>> William 
>>>>
>>>>