Multi-threading not actually working on another machine
Alfio Lazzaro
alfio.... at gmail.com
Sat Feb 25 06:26:40 UTC 2017
The CP2K output is fine.
Now, if you rely on the top command to see how many threads are running,
the number will be underestimated: top reports the average CPU utilization
of the whole process, so many partly idle threads can look like a few fully
loaded ones. You can use the options in the second answer of this post
http://stackoverflow.com/questions/15933801/openmp-creates-many-threads-but-seems-to-use-only-one-core
to list all the running threads individually.
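As a rough illustration of looking at threads individually, here is a minimal sketch for Linux. It is not necessarily what the linked answer recommends. The helper names count_threads and list_threads are made up for this example; the sketch assumes a mounted /proc filesystem and the procps versions of ps and top.

```shell
# Count the threads of a process by listing its task directory (Linux only).
count_threads() {
    ls "/proc/$1/task" | wc -l
}

# One line per thread: thread id, the core it last ran on, and its CPU usage.
list_threads() {
    ps -L -o pid,tid,psr,pcpu,comm -p "$1"
}

# Example use (assumes a cp2k.ssmp process is already running):
#   pid=$(pgrep -n cp2k.ssmp)
#   count_threads "$pid"
#   list_threads "$pid"
#   top -H -p "$pid"     # interactive view with one row per thread
```

If count_threads reports 40 while the per-thread CPU figures are mostly well below 100%, the threads exist but are not kept busy.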
Another test is to set a smaller number of threads (for example
OMP_NUM_THREADS=4) and see how it goes.
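Such a test can be repeated for several thread counts with a small loop. The sketch below is only an illustration: scan_threads, the scan_N.log file names, and input.inp are hypothetical, and the timing is coarse (whole seconds).

```shell
# Run a command once per thread count and print "<threads> <seconds>".
# The command's output goes to scan_<threads>.log in the current directory.
# Usage: scan_threads "1 2 4 8" command [args...]
scan_threads() {
    counts=$1
    shift
    for n in $counts; do
        start=$(date +%s)
        # Set OMP_NUM_THREADS for this run only.
        env OMP_NUM_THREADS="$n" "$@" > "scan_${n}.log" 2>&1
        end=$(date +%s)
        echo "$n $((end - start))"
    done
}

# Example use (input.inp is a placeholder for your own input file):
#   scan_threads "1 2 4 8 16 40" cp2k.ssmp -i input.inp
```

If the elapsed time stops improving beyond some thread count, that count is a reasonable upper limit for this workload.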
In any case, as I mentioned in my previous email, for such a large number of
cores you should consider running the PSMP version, i.e. MPI+OpenMP (your 40
cores span several NUMA domains, so you should use at least one MPI
rank per NUMA domain).
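To make the arithmetic concrete, here is a small sketch that only prints a candidate launch line and runs nothing. The helper name hybrid_launch_line and the input file input.inp are hypothetical, and the exact mpirun options (in particular for binding ranks to NUMA domains) depend on your MPI implementation, so treat the printed line as a starting point.

```shell
# On Linux, "lscpu | grep -i numa" shows how many NUMA nodes the machine has.

# Print a hybrid MPI+OpenMP launch line for a given core count and rank count.
# Nothing is executed; review the line and run it by hand.
hybrid_launch_line() {
    cores=$1
    ranks=$2
    threads=$((cores / ranks))   # integer division: threads per MPI rank
    echo "OMP_NUM_THREADS=$threads mpirun -np $ranks cp2k.psmp -i input.inp"
}

# Example: 40 cores split over 4 ranks
#   hybrid_launch_line 40 4
```

For 40 cores and 4 ranks this gives 10 threads per rank, the same split suggested earlier in the thread.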
Alfio
On Friday, 24 February 2017 at 17:06:49 UTC+1, William Tao wrote:
>
> Dear Alfio,
>
> I did set OMP_NUM_THREADS to 40 for my calculation. Here is the
> beginning of my output.
>
> GLOBAL| Force Environment number                      1
> GLOBAL| Basis set file name                           HFX_BASIS
> GLOBAL| Potential file name                           GTH_POTENTIALS
> GLOBAL| MM Potential file name                        MM_POTENTIAL
> GLOBAL| Coordinate file name                          atm309.xyz
> GLOBAL| Method name                                   CP2K
> GLOBAL| Project name                                  ATOM-309
> GLOBAL| Preferred FFT library                         FFTW3
> GLOBAL| Preferred diagonalization lib.                SL
> GLOBAL| Run type                                      MD
> GLOBAL| All-to-all communication in single precision  F
> GLOBAL| FFTs using library dependent lengths          F
> GLOBAL| Global print level                            LOW
> GLOBAL| Total number of message passing processes     1
> *GLOBAL| Number of threads for this process           40*
> GLOBAL| This output is from process                   0
> GLOBAL| CPU model name : Intel(R) Xeon(R) CPU E7-8891 v3 @ 2.80GHz
>
>
> However, while running, it only uses up to 8 threads.
>
> Thank you.
>
> William
>
>
>
> On Friday, 24 February 2017 at 1:47:37 AM UTC-6, Alfio Lazzaro wrote:
>>
>> Dear William,
>> CP2K uses OpenMP for threading. The number of threads is set with the
>> environment variable OMP_NUM_THREADS, which is read at runtime, so it has
>> no relation to the compilation. By default, if you don't set the variable,
>> OpenMP assumes the maximum number of threads available on the system (8
>> and 40 on your two machines). That's why you see 40 threads. However, you
>> should consider that it is very hard to get good scalability with such a
>> large number of threads (it actually depends on your workload). That's why
>> on average (which is what the top command shows) your usage corresponds to
>> 8 fully loaded threads.
>>
>> I suggest experimenting with different numbers of threads.
>> Just use:
>>
>> export OMP_NUM_THREADS=<any number between 1 and 40>
>>
>> before running CP2K.
>>
>> A better solution would be to use MPI together with OpenMP (the psmp
>> version), for example with 4 MPI ranks and 10 threads each. It would
>> likely give you better performance...
>>
>> Alfio
>>
>>
>>
>> On Thursday, 23 February 2017 at 22:48:01 UTC+1, William Tao wrote:
>>>
>>> Dear friends,
>>>
>>> I compiled CP2K version 4.1 on a machine with an 8-core CPU.
>>> When I run the binary cp2k.ssmp on another machine with a 40-core CPU,
>>> the output prints that the number of threads is 40.
>>> However, when I check with the "top" command, the process only uses up
>>> to 800% CPU.
>>>
>>> Does anybody know what is going on?
>>>
>>>
>>> William
>>>
>>>