[CP2K-user] [CP2K:11011] ASE CP2K interface

Geng Sun sungen... at gmail.com
Wed Dec 5 00:29:58 UTC 2018


Hello Ole,

I made another change to cp2k_shell.F: if the code reads the positions atom 
by atom instead of all at once (as in the code below), the program hangs at 
a point where only part of the positions have been read in. 
(I have 331 atoms, and normally only about 250 to 270 of them are read.) 
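           ! Whole-array reads, commented out:
           !READ (*,*,iostat=iostat) pos
           !READ (*,*,iostat=iostat) ((pos(i,j),j=1,3),i=1,n_atom)
           !READ (*,*,iostat=iostat) (pos(i),i=1,n_atom2)
           ! Read one atom (three coordinates) per READ and echo it to stderr
           DO i=1,n_atom
              m=(i-1)*3+1   ! index of the x coordinate of atom i
              n=(i-1)*3+3   ! index of the z coordinate of atom i
              READ(*,*) (pos(j),j=m,n)
              WRITE(0,*) "atom=", i, pos(m),pos(m+1),pos(n)
              CALL m_flush(0)
           END DO
           WRITE(0,*) "LABEL-1"
           CALL m_flush(0)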
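
For context, here is a minimal sketch of what the Python side sends for this 
loop to read: one line per atom with three coordinates, followed by *END 
(both are visible in the debug output). The helper names format_positions 
and payload_bytes and the repeated sample coordinate are hypothetical, made 
up for illustration; only the '%.18e %.18e %.18e' and '%.8e %.8e %.8e' 
format strings come from what I actually tried. It also adds up how many 
bytes the whole block occupies for each format.

```python
def format_positions(positions, fmt="%.18e %.18e %.18e"):
    """Return one text line per atom, followed by the *END terminator."""
    lines = [fmt % tuple(pos) for pos in positions]
    lines.append("*END")
    return lines


def payload_bytes(lines):
    """Total size of the block, counting one newline per line."""
    return sum(len(line) + 1 for line in lines)


# Hypothetical test data: the same coordinate repeated for 331 atoms.
positions = [(15.76752302496963409, 5.503432604574527431, 5.202333832524260515)] * 331

long_lines = format_positions(positions)
short_lines = format_positions(positions, fmt="%.8e %.8e %.8e")

print(len(long_lines), "lines")
print(payload_bytes(long_lines), "bytes with %.18e")
print(payload_bytes(short_lines), "bytes with %.8e")
```

The shorter format reduces the size of the block but not the number of 
lines, so the Fortran loop above still has to complete 331 separate READs.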


This is the tail of the standard error output, where only 252 atoms are 
printed.
Sending: 1.576752302496963409e+01 5.503432604574527431e+00 
5.202333832524260515e+00
 atom=         243   16.662443015797379        1.3742961283890502        
3.8763624242357171
Sending: 1.517292733648857705e+01 4.201499193055327375e+00 
7.381004145064225419e+00
 atom=         244   16.662391280884478        1.3742459215953395       
0.65239139651875000
Sending: 1.347251652964376234e+01 6.982201222299000420e+00 
7.885207051660287902e-01
Sending: 1.270684522088338753e+01 5.387934046501004381e+00 
7.380386770425433340e+00
 atom=         245   16.705918168029598        1.3963594659850651        
6.5096981880927025
Sending: 1.121215876405216250e+01 5.455761976608218156e+00 
7.885247030917259536e-01
Sending: 1.264981900140291593e+01 5.588193026760487570e+00 
2.971586044180265063e+00
 atom=         246   14.298521645540390        2.7461084273618193        
7.4746197668438166
Sending: 1.340077937297150790e+01 4.053634960891928429e+00 
7.885301196183434058e-01
Sending: 1.020485628515282706e+01 6.913382900918708884e+00 
7.383907976352638514e+00
 atom=         247   14.282063640676757        2.7485874649238795        
1.6804517810327422
Sending: 1.041632697503036731e+01 6.865228814815203862e+00 
2.971592361475913435e+00
 atom=         248   14.282106984202624        2.7485910114075263        
4.2975637697642934
Sending: 9.521439090500260605e+00 8.245776638331120623e+00 
6.218688347483974255e+00
Sending: 9.521360909499742675e+00 8.245771920855188952e+00 
1.955237846516025613e+00
 atom=         249   15.019369986694146        4.0380913886689607        
2.9715839946517644
Sending: 1.190174301579737914e+01 9.620070407982204586e+00 
3.876362424235717086e+00
Sending: 1.190169128088448147e+01 9.620020201188493658e+00 
6.523913965187499997e-01
 atom=         250   15.914380998597084        2.6575812528326686        
5.2023401498199107
Sending: 1.189999109458993409e+01 9.622398083668830537e+00 
6.496868766581632926e+00
 atom=         251   15.925180013305853        8.4795751127617489E-002   
5.2023421993484060
Sending: 9.516001797608472756e+00 1.099443356384864323e+01 
7.482068722341200129e+00
 atom=         252   18.147873024969634        1.3805454647779496        
5.2023338325242605
Sending: 9.521363640676758777e+00 1.099436174451703430e+01 
1.680451781032742176e+00
Sending: 9.521406984202624102e+00 1.099436529100068149e+01 
4.297563769764293440e+00
Sending: 1.025866998669414798e+01 1.228386566826211634e+01 
2.971583994651764371e+00
Sending: 1.115368099859708551e+01 1.090335553242582378e+01 
5.202340149819910664e+00
Sending: 1.116448001330585527e+01 8.330570030720771513e+00 
5.202342199348406027e+00
Sending: 1.338717302496963768e+01 9.626319744371103937e+00 
5.202333832524260515e+00
Sending: 1.278402737896498920e+01 8.314253003861532321e+00 
7.379054099830940849e+00
Sending: 1.109216652964376415e+01 1.110508836209557870e+01 
7.885207051660287902e-01
Sending: 1.032654594512574420e+01 9.507815609387757050e+00 
7.379945316851912018e+00
Sending: 8.831808764052164307e+00 9.578649116404795549e+00 
7.885247030917259536e-01
Sending: 1.026946900140291774e+01 9.711080166557064075e+00 
2.971586044180265063e+00
Sending: 1.102042937297150971e+01 8.176522100688506711e+00 
7.885301196183434058e-01
Sending: 7.823467798454666777e+00 1.103241626738852688e+01 
7.382163504927960140e+00
Sending: 8.035976975030369118e+00 1.098811595461178214e+01 
2.971592361475913435e+00
Sending: 7.141089090500260639e+00 1.236866377812769713e+01 
6.218688347483974255e+00
Sending: 7.141010909499744486e+00 1.236865906065176546e+01 
1.955237846516025613e+00
Sending: 9.521393015797379178e+00 1.374295754777878109e+01 
3.876362424235717086e+00
Sending: 9.521341280884481506e+00 1.374290734098507016e+01 
6.523913965187499997e-01
Sending: 9.520851532832354636e+00 1.374528395598669661e+01 
6.498168429193992068e+00
Sending: 7.135752205372321910e+00 1.512061403636860391e+01 
7.467937815653423961e+00
Sending: 7.141013640676759699e+00 1.511724888431361080e+01 
1.680451781032742176e+00
Sending: 7.141056984202625912e+00 1.511725243079725800e+01 
4.297563769764293440e+00
Sending: 7.878319986694148902e+00 1.640675280805869107e+01 
2.971583994651764371e+00
Sending: 8.773330998597087316e+00 1.502624267222240029e+01 
5.202340149819910664e+00
Sending: 8.784130013305857076e+00 1.245345717051734802e+01 
5.202342199348406027e+00
Sending: 1.100682302496963771e+01 1.374920688416768044e+01 
5.202333832524260515e+00
Sending: 1.040246635779289264e+01 1.243375815304947274e+01 
7.378511014138936730e+00
Sending: 8.711816529643764184e+00 1.522797550189215521e+01 
7.885207051660287902e-01
Sending: 7.944196067620286072e+00 1.363073238988811653e+01 
7.376840927591756802e+00
Sending: 6.451458764052164341e+00 1.370153625620137205e+01 
7.885247030917259536e-01
Sending: 7.889119001402919551e+00 1.383396730635364058e+01 
2.971586044180265063e+00
Sending: 8.640079372971511518e+00 1.229940924048508322e+01 
7.885301196183434058e-01
Sending: 5.442845882770557253e+00 1.515594640055591036e+01 
7.382285524632366425e+00
Sending: 5.655626975030369152e+00 1.511100309440835865e+01 
2.971592361475913435e+00
Sending: 2.438731539757682254e-01 5.399653964569083087e+00 
9.216051601394250170e+00
Sending: 4.943799047278442771e+00 6.359320929803981670e+00 
1.078113524315581451e+01
Sending: 3.138843358305583919e+00 3.495089033159221259e+00 
9.827428649731281496e+00
Sending: 5.783276526942270124e+00 4.240744460487060330e+00 
9.597502013642966290e+00
Sending: 3.616494458058998607e+00 8.325856048791498765e+00 
9.585425118565524372e+00
Sending: 3.357219536482951128e+00 7.904624946018040887e+00 
1.205591374980337882e+01
Sending: 6.323261420949900513e-01 3.175684482035930234e+00 
1.034222361512076382e+01
Sending: 2.466428571964844885e+00 5.952425260037331967e+00 
1.043216353271719576e+01
Sending: -2.305837893097886504e-01 6.188097724220211759e+00 
1.046543155157912608e+01
Sending: 3.399261249766690085e+00 3.516835622123011262e+00 
1.133940075852044416e+01
Sending: 5.923947421334220920e+00 4.113983946240092671e+00 
1.115507609006755807e+01
Sending: *END
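
To make the mismatch easier to see, the sent and echoed lines in such a log 
can be counted with a small script. This is only a sketch: 
count_sent_and_read is a hypothetical helper, and it assumes the two 
prefixes "Sending:" (from the Python debug output) and "atom=" (from my 
WRITE statement in cp2k_shell.F) exactly as they appear above. It counts 
every "Sending:" line except *END, so on a full log it would also count 
commands that are not positions.

```python
def count_sent_and_read(log_text):
    """Count lines sent by Python and atoms echoed by the Fortran side.

    Returns (sent, read, last_atom), where last_atom is the index printed
    on the last "atom=" line, or None if no such line was found.
    """
    sent = 0
    read = 0
    last_atom = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("Sending:"):
            if line != "Sending: *END":
                sent += 1
        elif line.startswith("atom="):
            read += 1
            # The atom index is the first field after the "atom=" label.
            last_atom = int(line.split()[1])
    return sent, read, last_atom


# A few lines in the same shape as the log above (wrapped lines included).
sample = """\
Sending: 1.517292733648857705e+01 4.201499193055327375e+00
7.381004145064225419e+00
 atom=         244   16.662391280884478        1.3742459215953395
0.65239139651875000
Sending: 1.347251652964376234e+01 6.982201222299000420e+00
7.885207051660287902e-01
Sending: *END
"""

print(count_sent_and_read(sample))
```

Wrapped continuation lines start with a number, not with either prefix, so 
they are ignored and do not affect the counts.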


Geng


On Tuesday, December 4, 2018 at 1:20:19 PM UTC-8, Geng Sun wrote:
>
> Hello Ole,
>
> Yes, I am using MPI. I am running CP2K on Cori, which is a Cray system.
> I am not completely sure, but the MPI seems to be MPICH, because the 
> cray-mpich module is loaded when I run CP2K.
>
> I have 331 atoms in the calculation. I tried to shorten the position string from
> self._shell.send('%.18e %.18e %.18e' % tuple(pos))
> to 
> self._shell.send('%.8e %.8e %.8e' % tuple(pos))
>
> I assumed this might avoid a possible deadlock of the pipe, but in the end 
> it did not help.
>
> Best
>
> Geng
>
>
> On Tuesday, December 4, 2018 at 12:59:20 PM UTC-8, Ole Schütt wrote:
>>
>> Hi Geng, 
>>
>> are you using MPI? Then this is probably where the buffering happens. 
>> Depending on which MPI implementation you are using there might be a way 
>> to tweak its stdin/out forwarding. 
>>
>> Out of curiosity, how many atoms does your system have? Maybe the 
>> Fortran side simply tries to read too many values? 
>>
>> -Ole 
>>
>>
>> On 2018-12-04 21:27, Geng Sun wrote: 
>> > Hello Ole, 
>> > 
>> > Thank you very much for your reply: 
>> > I changed the code as you suggested (below), but the problem is still 
>> > present in the test. 
>> > 
>> > Best 
>> > 
>> > Geng 
>> > 
>> >     def send(self, line): 
>> >         """Send a line to the cp2k_shell""" 
>> >         assert self._child.poll() is None  # child process still alive? 
>> >         if self._debug: 
>> >             #print('Sending: ' + line) 
>> >             sys.stderr.write("Sending: {}\n".format(line)) 
>> >             sys.stderr.flush() 
>> > 
>> >         if self.version < 2.1 and len(line) >= 80: 
>> >             raise Exception('Buffer overflow, upgrade CP2K to r16779 or later') 
>> >         assert(len(line) < 800)  # new input buffer size 
>> >         self.isready = False 
>> >         self._child.stdin.write(line + '\n') 
>> >         self._child.stdin.flush() 
>> > 
>> > On Tuesday, December 4, 2018 at 11:05:43 AM UTC-8, Ole Schütt wrote: 
>> > 
>> >> Hi Geng, 
>> >> 
>> >> this does indeed sound like a buffering issue. Could you try adding a 
>> >> flush() on the Python side? 
>> >> 
>> >> Basically add the following line after cp2k.py:498. 
>> >> 
>> >> self._child.stdin.write(line + '\n') 
>> >> +       self._child.stdin.flush() 
>> >> 
>> >> This is probably quite inefficient. So, if it works, I'll add some logic 
>> >> to flush only when recv() follows a send(). 
>> >> 
>> >> -Ole 
>> >> 
>> >> On 2018-12-04 18:08, Geng Sun wrote: 
>> >>> Dear CP2K users, 
>> >>> 
>> >>> In the past several weeks, I have frequently run into a problem when 
>> >>> using the CP2K-ASE interface. 
>> >>> 
>> >>> The calculations frequently get stuck partway through. 
>> >>> 
>> >>> 1) First I switched on the debug=True option in the ASE-CP2K 
>> >>> calculator and found that the calculation always gets stuck at a 
>> >>> line with *END after sending the positions to cp2k_shell.popt 
>> >>> (I printed the information to standard error, so it is not 
>> >>> buffered). 
>> >>> 
>> >>> 2) Then I modified cp2k_shell.F to print a lot of "labels" and found 
>> >>> that the code seems to get stuck at the line 
>> >>> "READ (*,*,iostat=iostat) pos", as shown below. I always get 
>> >>> "begin to read pos" in the standard error, but never reach "LABEL-1". 
>> >>> 
>> >>> WRITE(0,*) "begin to read pos" 
>> >>> CALL m_flush(0) 
>> >>> READ (*,*,iostat=iostat) pos 
>> >>> WRITE(0,*) "LABEL-1" 
>> >>> CALL m_flush(0) 
>> >>> 
>> >>> I attached the modified cp2k_shell.F and the standard output/error 
>> >>> generated by the Slurm batch system to this post. You can see that 
>> >>> the program gets stuck at the second optimization step, just after 
>> >>> sending the positions. 
>> >>> 
>> >>> I would greatly appreciate any suggestions for fixing this confusing 
>> >>> bug. 
>> >>> 
>> >>> Thanks in advance. 
>> >>> 
>> >>> Geng 
>> >>> 
>> >>> -- 
>> >>> You received this message because you are subscribed to the Google 
>> >> 
>> >>> Groups "cp2k" group. 
>> >>> To unsubscribe from this group and stop receiving emails from it, 
>> >> send 
>> >>> an email to cp2k+... at googlegroups.com. 
>> >>> To post to this group, send email to cp... at googlegroups.com. 
>> >>> Visit this group at https://groups.google.com/group/cp2k [1]. 
>> >>> For more options, visit https://groups.google.com/d/optout [2]. 
>> > 
>> > 
>> > 
>> > Links: 
>> > ------ 
>> > [1] https://groups.google.com/group/cp2k 
>> > [2] https://groups.google.com/d/optout 
>>
>>