[CP2K:6642] Re: Compilations with Intel (XE 2013) for CP2K-trunk (2.7dev) & regtests errors

Alfio Lazzaro alfio.... at gmail.com
Mon Jun 15 19:22:56 UTC 2015


Hi Frank,
you can set an affinity mask for a single job (with KMP_AFFINITY if you are 
using the Intel compiler), but that does not work across multiple jobs.
In any case, pinning matters little for the results of the tiny phase. It 
does matter for the small phase, where you really want each job to have a 
core to itself.
Is there any specific reason why you cannot use parallel compilation on the 
cluster?
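
For several independent jobs on one machine, one option outside the Intel
runtime is to pin each job to its own core with taskset (util-linux, Linux
only). The following is only a sketch under that assumption: core_for_job,
plan_pinned_jobs and ./run_job are hypothetical names, not part of the libsmm
scripts, and the loop prints the commands instead of running them.

```shell
# Hypothetical helper: map a job index to a core, round-robin.
core_for_job() {
    # $1 = job index, $2 = number of cores
    echo $(( $1 % $2 ))
}

# Dry run: print one pinned command per job instead of launching it.
# "./run_job" is a placeholder, not a script from the libsmm directory.
plan_pinned_jobs() {
    # $1 = number of jobs, $2 = number of cores
    job=0
    while [ "$job" -lt "$1" ]; do
        echo "taskset -c $(core_for_job "$job" "$2") ./run_job $job"
        job=$(( job + 1 ))
    done
}

plan_pinned_jobs 4 2
```

With more jobs than cores the mapping wraps around and jobs share a core
again, which is the situation to avoid for the small phase.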

Alfio

On Monday, 15 June 2015 at 14:23:13 UTC+2, Frank Uhlig wrote:
>
> Hi Alfio, 
>
> I agree that it is not the best choice. I am running it anyhow right now 
> and will compare it to 'something more reasonable'.
> What I don't see, and maybe I am not familiar enough with the details, is 
> how running it in the background on a sufficient number of processors would 
> give different results from the hypothetical 
>
> ./generate -c ... -j 0 -t 40 tiny1
>
> on, let's say, a local machine with 40 cores (if that worked).
>
> Best,
>
> Frank
>
> On Mon, Jun 15, 2015 at 2:06 PM, Alfio Lazzaro <alfi... at gmail.com> wrote:
>
>> Hi Frank,
>> running the jobs in the background is not a good idea: they will all run 
>> on the same node, and you will clearly get wrong performance results from 
>> the tiny phase (multiple jobs on the same core).
>>
>> I'm glad that my solution works (I can add it to the SVN repository).
>>
>> Best regards,
>>
>> Alfio
>>
>>
>>
>>
>> On Monday, 15 June 2015 at 10:19:47 UTC+2, Frank Uhlig wrote:
>>>
>>> Hi Rolf and Alfio,
>>>
>>> it's a pity that the ulimit version did not work. Then it might be some 
>>> other limit, but I am not sure right now.
>>>
>>> I like Alfio's idea of the local submission. It runs sequentially, and I 
>>> think the reason is the following. The for loop around line 127 in the 
>>> generate.bash file always waits until the following command in the 
>>> run_make function (line 136)
>>>
>>> ${run_cmd} make -j ${ntasks} -f ../${make_file} ${target} SI=${element_start} EI=${element_end}
>>>
>>> has finished, so nothing actually runs in parallel. This is no problem 
>>> with queuing systems, because the submit command exits immediately while 
>>> the job runs in the queuing system. This is not the case with the current 
>>> local version.
>>>
>>> So you could change the command to run in the background, which seems to 
>>> work for me right now, like this:
>>>
>>> batch_cmd() {
>>>     "$@" &
>>> }
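
A backgrounded batch_cmd starts every job at once and nothing waits for them
to finish. A minimal sketch of a throttled variant follows; max_jobs and
running are hypothetical names, and "sleep 0" merely stands in for the real
make command. It launches jobs in batches and uses the shell's wait builtin
to collect them.

```shell
max_jobs=4      # assumed limit, e.g. the number of free cores
running=0       # jobs started in the current batch

batch_cmd() {
    "$@" &                          # start the command in the background
    running=$(( running + 1 ))
    if [ "$running" -ge "$max_jobs" ]; then
        wait                        # block until the current batch has finished
        running=0
    fi
}

for i in 1 2 3 4 5 6; do
    batch_cmd sleep 0               # placeholder for the real make command
done
wait                                # collect the jobs of the last, partial batch
```

Waiting for a whole batch is cruder than refilling a slot as soon as one job
ends, but it only needs constructs that every POSIX shell provides.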
>>>
>>> Best,
>>>
>>> Frank
>>>
>>>
>>>
>>>
>>> On Sat, Jun 13, 2015 at 3:06 PM Alfio Lazzaro <alfi... at gmail.com> 
>>> wrote:
>>>
>>>> Hi Rolf,
>>>> the problem is that the list of arguments is too long (there are 13824 
>>>> entries, each with at least 10 characters).
>>>> Could you run the test in parallel? I mean, you can use "-j #". In that 
>>>> case you also have to specify the wlm with the "-w" flag (pbs/slurm, or a 
>>>> new one for your system). Please see the README for the parallel execution 
>>>> steps.
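
A rough check of these numbers: make hands each recipe to /bin/sh -c as one
string, and on Linux a single argument string is limited to 32 pages (131072
bytes with 4 KiB pages), independently of the total ARG_MAX. Whether this is
the limit that was actually hit here is an assumption; the sketch below only
does the arithmetic.

```shell
entries=13824          # number of entries quoted above
min_chars=10           # lower bound per entry, also from above
total=$(( entries * min_chars ))

# Assumption: Linux with 4 KiB pages, where one argument string may hold
# at most 32 pages. Other systems have different limits.
single_arg_limit=$(( 32 * 4096 ))

echo "list length:      at least ${total} bytes"
echo "single-arg limit: ${single_arg_limit} bytes (assumed)"
echo "ARG_MAX here:     $(getconf ARG_MAX 2>/dev/null || echo unknown)"

if [ "$total" -gt "$single_arg_limit" ]; then
    echo "the list does not fit into one argument string"
fi
```

Splitting the range SI..EI into smaller chunks, as the parallel mode does,
keeps each command well below such a limit.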
>>>> If you don't want to run on the cluster, you can still use the parallel 
>>>> execution on the login node, but then you have to declare an "empty" wlm 
>>>> file under the config directory:
>>>>
>>>> > cat no.wlm
>>>> batch_cmd() {
>>>>     "$@"
>>>> }
>>>>   
>>>> Then you can use "-j 100 -w no".
>>>> Note that I have never tried such a case, so I'm not sure it will work 
>>>> out of the box. Let me know how it goes.
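
A small sketch of creating and trying such a pass-through wlm file. The
temporary directory stands in for the real config directory, and loading the
file with "." is an assumption about how the generate script picks up wlm
files. The function body quotes "$@" so that arguments containing spaces are
passed on unchanged.

```shell
config_dir=$(mktemp -d)     # stand-in for the real config directory

# Write the "empty" workload-manager file: run the command directly.
cat > "${config_dir}/no.wlm" <<'EOF'
batch_cmd() {
    "$@"
}
EOF

. "${config_dir}/no.wlm"    # load the function into the current shell

# Each quoted argument arrives as one word, so this prints two lines.
batch_cmd printf '%s\n' "first arg" "second arg"

rm -r "${config_dir}"
```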
>>>>
>>>> Cheers,
>>>>
>>>> Alfio
>>>>   
>>>>
>>>>
>>>> On Friday, 12 June 2015 at 20:26:47 UTC+2, Rolf David wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> The advice worked well. I'm now error free.
>>>>>
>>>>> I've moved on to compiling libsmm and libgrid.
>>>>>
>>>>> Now I've run into another problem, with the generation of libsmm.
>>>>>
>>>>> The tiny part:
>>>>>
>>>>> ./generate -c config/linux.intel -j 0 -t 16 tiny1
>>>>>
>>>>>
>>>>> Generate master file output_linux.intel/tiny_find_1_1_1__24_24_24.f90 
>>>>>
>>>>> make: execvp: /bin/sh: Argument list too long 
>>>>>
>>>>> make: *** [output_linux.intel/tiny_find_1_1_1__24_24_24.f90] Error 127
>>>>>
>>>>>
>>>>>
>>>>> Running in verbose mode shows that the problem comes from this command:
>>>>>
>>>>>
>>>>> make -j 16 -f ../Makefile.tiny_dnn_linux.intel all SI=1 EI=13824
>>>>>
>>>>>
>>>>> and ifort died.
>>>>>
>>>>>
>>>>> If anybody has any idea of what I'm doing wrong, I'll be glad (and 
>>>>> relieved!)
>>>>>
>>>>>
>>>>> Kind regards
>>>>>
>>>>>
>>>>> Rolf
>>>>>
>>>>  -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "cp2k" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to cp2k+... at googlegroups.com.
>>>> To post to this group, send email to cp... at googlegroups.com.
>>>> Visit this group at http://groups.google.com/group/cp2k.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>
>