Multi-threading not actually working on another machine
William Tao
ywta... at gmail.com
Mon Mar 6 16:36:03 UTC 2017
Thank you, Alfio, for your suggestion. I will try it the way you advised.
On Saturday, February 25, 2017 at 12:26:40 AM UTC-6, Alfio Lazzaro wrote:
>
> The CP2K output is fine.
> Note that if you rely on the top command to see how many threads you are
> running, the number will be underestimated, because top shows the average
> CPU utilization of the process. You can use the options in the second
> answer of this post
> http://stackoverflow.com/questions/15933801/openmp-creates-many-threads-but-seems-to-use-only-one-core
> to list the running threads individually.
> Another test is to set a smaller number of threads (for example
> OMP_NUM_THREADS=4) and see how it goes.
> In any case, as I mentioned in my previous email, for such a large number of
> cores you should consider running the PSMP version, i.e. MPI+OpenMP (your 40
> cores span several NUMA domains, so you should use at least one MPI
> rank per NUMA domain).
>
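A minimal sketch of how the individual threads could be inspected instead of the single aggregated process line in top. It assumes a Linux machine with /proc and the procps versions of `ps` and `top`; the helper names `count_threads` and `show_threads` are illustrative only, and the PID must be that of the running cp2k.ssmp process.

```shell
#!/bin/sh
# count_threads PID: print how many threads the process currently has.
# Reads the "Threads:" field that the Linux kernel exposes in /proc/PID/status.
count_threads() {
    awk '/^Threads:/ { print $2 }' "/proc/$1/status"
}

# show_threads PID: print one line per thread with its thread ID, the core it
# last ran on (psr) and its CPU usage (assumes procps ps).
show_threads() {
    ps -L -o tid,psr,pcpu,comm -p "$1"
}

# Example (assumes one cp2k.ssmp process is running on this machine):
#   pid=$(pgrep -n cp2k.ssmp)
#   count_threads "$pid"
#   show_threads "$pid"
#   top -H -p "$pid"     # interactive view, one row per thread
```

If `count_threads` reports 40 while only a few rows in `show_threads` show high CPU usage, the threads exist but most of them are idle for much of the run.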
> Alfio
>
> On Friday, February 24, 2017 at 17:06:49 UTC+1, William Tao wrote:
>>
>> Dear Alfio,
>>
>> I did set OMP_NUM_THREADS to 40 for my calculation. Here is the
>> beginning of my output.
>>
>> GLOBAL| Force Environment number                      1
>> GLOBAL| Basis set file name                           HFX_BASIS
>> GLOBAL| Potential file name                           GTH_POTENTIALS
>> GLOBAL| MM Potential file name                        MM_POTENTIAL
>> GLOBAL| Coordinate file name                          atm309.xyz
>> GLOBAL| Method name                                   CP2K
>> GLOBAL| Project name                                  ATOM-309
>> GLOBAL| Preferred FFT library                         FFTW3
>> GLOBAL| Preferred diagonalization lib.                SL
>> GLOBAL| Run type                                      MD
>> GLOBAL| All-to-all communication in single precision  F
>> GLOBAL| FFTs using library dependent lengths          F
>> GLOBAL| Global print level                            LOW
>> GLOBAL| Total number of message passing processes     1
>> *GLOBAL| Number of threads for this process           40*
>> GLOBAL| This output is from process                   0
>> GLOBAL| CPU model name : Intel(R) Xeon(R) CPU E7-8891 v3 @ 2.80GHz
>>
>>
>> However, while running, it only uses up to 8 threads.
>>
>> Thank you.
>>
>> William
>>
>>
>>
>> On Friday, February 24, 2017 at 1:47:37 AM UTC-6, Alfio Lazzaro wrote:
>>>
>>> Dear William,
>>> CP2K uses OpenMP for threading. The number of threads is set with the
>>> environment variable OMP_NUM_THREADS, which is read at runtime, so it is
>>> unrelated to the machine the binary was compiled on. By default, if you
>>> don't set the variable, OpenMP uses the maximum number of threads available
>>> on the system (8 and 40 on your two machines). That's why you see 40
>>> threads. However, it is very hard to get good scalability with such a large
>>> number of threads (it actually depends on your workload), which is why on
>>> "average" (what the top command shows) you are using the equivalent of 8
>>> fully loaded threads.
>>>
>>> I suggest experimenting with different numbers of threads.
>>> Just use:
>>>
>>> export OMP_NUM_THREADS=<any number between 1 and 40>
>>>
>>> before running CP2K.
>>>
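The experiment suggested above can be scripted roughly as follows. This is only a sketch: the function name `run_sweep`, the variable `CP2K_BIN` and the file name `input.inp` are hypothetical, and it assumes the cp2k.ssmp executable is on the PATH and accepts the usual `-i` (input) and `-o` (output) options.

```shell
#!/bin/sh
# run_sweep INPUT THREADS...: run the same input once for each thread count.
# CP2K_BIN is a hypothetical variable naming the executable (default: cp2k.ssmp).
run_sweep() {
    input=$1
    shift
    for n in "$@"; do
        echo "== OMP_NUM_THREADS=$n =="
        # Setting the variable on the command line limits it to this one run.
        OMP_NUM_THREADS=$n "${CP2K_BIN:-cp2k.ssmp}" -i "$input" -o "out_${n}.log"
    done
}

# Example:
#   run_sweep input.inp 1 2 4 8 10 20 40
```

Comparing the timing reports at the end of each output file should show where adding more threads stops helping for a given workload.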
>>> Another, better solution would be to use MPI together with OpenMP (the
>>> psmp version), for example with 4 MPI ranks and 10 threads each. It
>>> would likely give you better performance.
>>>
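A rough sketch of the 4 ranks x 10 threads layout mentioned above. It assumes a cp2k.psmp binary has been built, that the MPI installation provides an `mpirun` launcher accepting `-np`, and that everything runs on a single node so the exported variable reaches every rank; the function names and `input.inp` are hypothetical. Pinning ranks to NUMA domains is launcher-specific and is not shown.

```shell
#!/bin/sh
# threads_per_rank CORES RANKS: print how many OpenMP threads each MPI rank
# gets when CORES cores are divided evenly among RANKS ranks.
threads_per_rank() {
    echo $(( $1 / $2 ))
}

# run_hybrid INPUT CORES RANKS: start cp2k.psmp with RANKS MPI ranks and
# CORES/RANKS OpenMP threads per rank.
run_hybrid() {
    input=$1
    cores=$2
    ranks=$3
    OMP_NUM_THREADS=$(threads_per_rank "$cores" "$ranks")
    export OMP_NUM_THREADS
    mpirun -np "$ranks" cp2k.psmp -i "$input" -o "out_${ranks}x${OMP_NUM_THREADS}.log"
}

# Example:
#   lscpu | grep -i numa        # shows how many NUMA domains the machine has
#   run_hybrid input.inp 40 4   # 4 MPI ranks with 10 threads each
```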
>>> Alfio
>>>
>>>
>>>
>>> On Thursday, February 23, 2017 at 22:48:01 UTC+1, William Tao wrote:
>>>>
>>>> Dear friends,
>>>>
>>>> I compiled CP2K version 4.1 on a machine with an 8-core CPU.
>>>> When I run the binary cp2k.ssmp on another machine with a 40-core CPU,
>>>> the output reports that the number of threads is 40.
>>>> However, when I check with the "top" command, the process only uses up
>>>> to 800% CPU.
>>>>
>>>> Does anybody know what is going on?
>>>>
>>>>
>>>> William
>>>>
>>>>