[CP2K-user] [CP2K:11011] ASE CP2K interface
Geng Sun
sungen... at gmail.com
Wed Dec 5 00:29:58 UTC 2018
Hello Ole,
I made another change to cp2k_shell.F: the code now reads the positions one
atom at a time instead of all at once (see the code below). The program still
hangs, at a point where only part of the positions have been read in.
(I have 331 atoms, and the read normally stops after about 250 to 270 of them.)
    !READ (*,*,iostat=iostat) pos
    !READ (*,*,iostat=iostat) ((pos(i,j),j=1,3),i=1,n_atom)
    !READ (*,*,iostat=iostat) (pos(i),i=1,n_atom2)
    DO i=1,n_atom
       m=(i-1)*3+1
       n=(i-1)*3+3
       READ(*,*) (pos(j),j=m,n)
       WRITE(0,*) "atom=", i, pos(m),pos(m+1),pos(n)
       CALL m_flush(0)
    END DO
    WRITE(0,*) "LABEL-1"
    CALL m_flush(0)
This is the tail of the standard error output, where only 252 atoms are
printed.
Sending: 1.576752302496963409e+01 5.503432604574527431e+00
5.202333832524260515e+00
atom= 243 16.662443015797379 1.3742961283890502
3.8763624242357171
Sending: 1.517292733648857705e+01 4.201499193055327375e+00
7.381004145064225419e+00
atom= 244 16.662391280884478 1.3742459215953395
0.65239139651875000
Sending: 1.347251652964376234e+01 6.982201222299000420e+00
7.885207051660287902e-01
Sending: 1.270684522088338753e+01 5.387934046501004381e+00
7.380386770425433340e+00
atom= 245 16.705918168029598 1.3963594659850651
6.5096981880927025
Sending: 1.121215876405216250e+01 5.455761976608218156e+00
7.885247030917259536e-01
Sending: 1.264981900140291593e+01 5.588193026760487570e+00
2.971586044180265063e+00
atom= 246 14.298521645540390 2.7461084273618193
7.4746197668438166
Sending: 1.340077937297150790e+01 4.053634960891928429e+00
7.885301196183434058e-01
Sending: 1.020485628515282706e+01 6.913382900918708884e+00
7.383907976352638514e+00
atom= 247 14.282063640676757 2.7485874649238795
1.6804517810327422
Sending: 1.041632697503036731e+01 6.865228814815203862e+00
2.971592361475913435e+00
atom= 248 14.282106984202624 2.7485910114075263
4.2975637697642934
Sending: 9.521439090500260605e+00 8.245776638331120623e+00
6.218688347483974255e+00
Sending: 9.521360909499742675e+00 8.245771920855188952e+00
1.955237846516025613e+00
atom= 249 15.019369986694146 4.0380913886689607
2.9715839946517644
Sending: 1.190174301579737914e+01 9.620070407982204586e+00
3.876362424235717086e+00
Sending: 1.190169128088448147e+01 9.620020201188493658e+00
6.523913965187499997e-01
atom= 250 15.914380998597084 2.6575812528326686
5.2023401498199107
Sending: 1.189999109458993409e+01 9.622398083668830537e+00
6.496868766581632926e+00
atom= 251 15.925180013305853 8.4795751127617489E-002
5.2023421993484060
Sending: 9.516001797608472756e+00 1.099443356384864323e+01
7.482068722341200129e+00
atom= 252 18.147873024969634 1.3805454647779496
5.2023338325242605
Sending: 9.521363640676758777e+00 1.099436174451703430e+01
1.680451781032742176e+00
Sending: 9.521406984202624102e+00 1.099436529100068149e+01
4.297563769764293440e+00
Sending: 1.025866998669414798e+01 1.228386566826211634e+01
2.971583994651764371e+00
Sending: 1.115368099859708551e+01 1.090335553242582378e+01
5.202340149819910664e+00
Sending: 1.116448001330585527e+01 8.330570030720771513e+00
5.202342199348406027e+00
Sending: 1.338717302496963768e+01 9.626319744371103937e+00
5.202333832524260515e+00
Sending: 1.278402737896498920e+01 8.314253003861532321e+00
7.379054099830940849e+00
Sending: 1.109216652964376415e+01 1.110508836209557870e+01
7.885207051660287902e-01
Sending: 1.032654594512574420e+01 9.507815609387757050e+00
7.379945316851912018e+00
Sending: 8.831808764052164307e+00 9.578649116404795549e+00
7.885247030917259536e-01
Sending: 1.026946900140291774e+01 9.711080166557064075e+00
2.971586044180265063e+00
Sending: 1.102042937297150971e+01 8.176522100688506711e+00
7.885301196183434058e-01
Sending: 7.823467798454666777e+00 1.103241626738852688e+01
7.382163504927960140e+00
Sending: 8.035976975030369118e+00 1.098811595461178214e+01
2.971592361475913435e+00
Sending: 7.141089090500260639e+00 1.236866377812769713e+01
6.218688347483974255e+00
Sending: 7.141010909499744486e+00 1.236865906065176546e+01
1.955237846516025613e+00
Sending: 9.521393015797379178e+00 1.374295754777878109e+01
3.876362424235717086e+00
Sending: 9.521341280884481506e+00 1.374290734098507016e+01
6.523913965187499997e-01
Sending: 9.520851532832354636e+00 1.374528395598669661e+01
6.498168429193992068e+00
Sending: 7.135752205372321910e+00 1.512061403636860391e+01
7.467937815653423961e+00
Sending: 7.141013640676759699e+00 1.511724888431361080e+01
1.680451781032742176e+00
Sending: 7.141056984202625912e+00 1.511725243079725800e+01
4.297563769764293440e+00
Sending: 7.878319986694148902e+00 1.640675280805869107e+01
2.971583994651764371e+00
Sending: 8.773330998597087316e+00 1.502624267222240029e+01
5.202340149819910664e+00
Sending: 8.784130013305857076e+00 1.245345717051734802e+01
5.202342199348406027e+00
Sending: 1.100682302496963771e+01 1.374920688416768044e+01
5.202333832524260515e+00
Sending: 1.040246635779289264e+01 1.243375815304947274e+01
7.378511014138936730e+00
Sending: 8.711816529643764184e+00 1.522797550189215521e+01
7.885207051660287902e-01
Sending: 7.944196067620286072e+00 1.363073238988811653e+01
7.376840927591756802e+00
Sending: 6.451458764052164341e+00 1.370153625620137205e+01
7.885247030917259536e-01
Sending: 7.889119001402919551e+00 1.383396730635364058e+01
2.971586044180265063e+00
Sending: 8.640079372971511518e+00 1.229940924048508322e+01
7.885301196183434058e-01
Sending: 5.442845882770557253e+00 1.515594640055591036e+01
7.382285524632366425e+00
Sending: 5.655626975030369152e+00 1.511100309440835865e+01
2.971592361475913435e+00
Sending: 2.438731539757682254e-01 5.399653964569083087e+00
9.216051601394250170e+00
Sending: 4.943799047278442771e+00 6.359320929803981670e+00
1.078113524315581451e+01
Sending: 3.138843358305583919e+00 3.495089033159221259e+00
9.827428649731281496e+00
Sending: 5.783276526942270124e+00 4.240744460487060330e+00
9.597502013642966290e+00
Sending: 3.616494458058998607e+00 8.325856048791498765e+00
9.585425118565524372e+00
Sending: 3.357219536482951128e+00 7.904624946018040887e+00
1.205591374980337882e+01
Sending: 6.323261420949900513e-01 3.175684482035930234e+00
1.034222361512076382e+01
Sending: 2.466428571964844885e+00 5.952425260037331967e+00
1.043216353271719576e+01
Sending: -2.305837893097886504e-01 6.188097724220211759e+00
1.046543155157912608e+01
Sending: 3.399261249766690085e+00 3.516835622123011262e+00
1.133940075852044416e+01
Sending: 5.923947421334220920e+00 4.113983946240092671e+00
1.115507609006755807e+01
Sending: *END
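A log like the one above is easier to compare across runs when it is reduced
to a few numbers. What follows is only a minimal sketch, assuming the debug
format shown here ("Sending: ..." lines from the Python side and "atom= N ..."
lines from the modified cp2k_shell.F). The function name summarize_debug_log
is hypothetical, and treating any "Sending:" line with two or more
floating-point tokens as a position line is an assumption.

```python
import re

# Matches tokens such as 1.5767e+01 or -2.30e-01 (the Python-side format).
_FLOAT = re.compile(r"^[+-]?\d+\.\d+(?:[eE][+-]?\d+)?$")


def summarize_debug_log(text):
    """Return (positions_sent, last_atom_read, end_marker_sent)."""
    positions_sent = 0
    last_atom_read = None
    end_marker_sent = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Sending:"):
            payload = line[len("Sending:"):].split()
            if payload == ["*END"]:
                end_marker_sent = True
            # Assumption: a position line carries at least two floats,
            # which also holds when the third one wrapped to the next line.
            elif len(payload) >= 2 and all(_FLOAT.match(t) for t in payload):
                positions_sent += 1
        else:
            # Lines written by the Fortran loop: "atom= <index> x y z"
            match = re.match(r"atom=\s*(\d+)", line)
            if match:
                last_atom_read = int(match.group(1))
    return positions_sent, last_atom_read, end_marker_sent
```

For the excerpt above this should report 252 as the last atom read and the
end marker as sent. The number of positions sent depends on the part of the
log that is not shown here.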
Geng
On Tuesday, December 4, 2018 at 1:20:19 PM UTC-8, Geng Sun wrote:
>
> Hello Ole,
>
> Yes, I am using MPI. I am running CP2K on Cori, which is a Cray system.
> I am not entirely sure, but the MPI appears to be MPICH, because the
> cray-mpich module is loaded when I run CP2K.
>
> I have 331 atoms in the calculation. I tried shortening the position string
> from
>     self._shell.send('%.18e %.18e %.18e' % tuple(pos))
> to
>     self._shell.send('%.8e %.8e %.8e' % tuple(pos))
>
> I assumed this might avoid a possible deadlock of the pipe, but it did not
> help in the end.
>
> Best
>
> Geng
>
>
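For a sense of scale, the two format strings quoted above can be compared by
the number of bytes they put on the pipe per SET_POS. This is only a rough
sketch: the helper name payload_bytes is hypothetical, the sample coordinates
are taken from the debug log, and one trailing newline per line is assumed
(as in the send() method quoted further down). Negative coordinates add one
byte each.

```python
def payload_bytes(n_atoms, fmt, sample=(15.76752302496963409,
                                        5.503432604574527431,
                                        5.202333832524260515)):
    """Bytes written for n_atoms position lines formatted with fmt."""
    line = fmt % sample + "\n"  # one line per atom, newline-terminated
    return n_atoms * len(line.encode("ascii"))


long_form = payload_bytes(331, "%.18e %.18e %.18e")   # 75 bytes per line
short_form = payload_bytes(331, "%.8e %.8e %.8e")     # 45 bytes per line
print(long_form, short_form)
```

Whether either size matters depends on the pipe capacity and on how the MPI
launcher forwards stdin, neither of which is established in this thread.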
> On Tuesday, December 4, 2018 at 12:59:20 PM UTC-8, Ole Schütt wrote:
>>
>> Hi Geng,
>>
>> are you using MPI? Then this is probably where the buffering happens.
>> Depending on which MPI implementation you are using there might be a way
>> to tweak its stdin/out forwarding.
>>
>> Out of curiosity, how many atoms does your system have? Maybe the
>> Fortran side simply tries to read too many values?
>>
>> -Ole
>>
>>
>> On 2018-12-04 21:27, Geng Sun wrote:
>> > Hello Ole,
>> >
>> > Thank you very much for your reply:
>> > I changed the code as you suggested (below), but the problem is still
>> > present in the test.
>> >
>> > Best
>> >
>> > Geng
>> >
>> >     def send(self, line):
>> >         """Send a line to the cp2k_shell"""
>> >         assert self._child.poll() is None  # child process still alive?
>> >         if self._debug:
>> >             #print('Sending: ' + line)
>> >             sys.stderr.write("Sending: {}\n".format(line))
>> >             sys.stderr.flush()
>> >
>> >         if self.version < 2.1 and len(line) >= 80:
>> >             raise Exception('Buffer overflow, upgrade CP2K to r16779 or later')
>> >         assert(len(line) < 800)  # new input buffer size
>> >         self.isready = False
>> >         self._child.stdin.write(line + '\n')
>> >         self._child.stdin.flush()
>> >
>> > On Tuesday, December 4, 2018 at 11:05:43 AM UTC-8, Ole Schütt wrote:
>> >
>> >> Hi Geng,
>> >>
>> >> this sounds indeed like a buffering issue. Could you try adding a
>> >> flush() on the Python side?
>> >>
>> >> Basically add the following line after cp2k.py:498.
>> >>
>> >> self._child.stdin.write(line + '\n')
>> >> + self._child.stdin.flush()
>> >>
>> >> This is probably quite inefficient. So, if it works I'll add some
>> >> logic to flush only when recv() follows a send().
>> >>
>> >> -Ole
>> >>
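The "flush only when recv() follows a send()" idea can be sketched as
follows. This is not the ASE implementation: the class name LazyFlushShell
and the echoing child process are hypothetical stand-ins, used only so that
the example runs without CP2K. It also says nothing about buffering inside
an MPI launcher.

```python
import subprocess
import sys

# Hypothetical stand-in for cp2k_shell: echoes every stdin line, flushed.
ECHO_CHILD = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write('ECHO ' + line)\n"
    "    sys.stdout.flush()\n"
)


class LazyFlushShell:
    """Write lines without flushing; flush once before the next read."""

    def __init__(self, argv):
        self._child = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            universal_newlines=True)
        self._dirty = False  # True while unflushed data may be pending

    def send(self, line):
        assert self._child.poll() is None  # child process still alive?
        self._child.stdin.write(line + "\n")
        self._dirty = True

    def recv(self):
        if self._dirty:  # flush only when a recv() follows a send()
            self._child.stdin.flush()
            self._dirty = False
        return self._child.stdout.readline().rstrip("\n")

    def close(self):
        self._child.stdin.close()
        self._child.wait()
```

A short usage example: several send() calls are followed by a single flush,
which happens inside the first recv().

```python
shell = LazyFlushShell([sys.executable, "-c", ECHO_CHILD])
shell.send("1.0 2.0 3.0")
shell.send("*END")
print(shell.recv())
print(shell.recv())
shell.close()
```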
>> >> On 2018-12-04 18:08, Geng Sun wrote:
>> >>> Dear CP2K users,
>> >>>
>> >>> Over the past several weeks, I have frequently run into a problem
>> >>> when using the CP2K-ASE interface: the calculations often get stuck
>> >>> partway through.
>> >>>
>> >>> 1) First, I switched on the debug=True option in the ASE-CP2K
>> >>> calculator and found that the calculation always gets stuck at a
>> >>> line with *END, after sending the positions to cp2k_shell.popt. (I
>> >>> printed the information to standard error, so it is not buffered.)
>> >>>
>> >>> 2) Then I modified cp2k_shell.F to print a lot of "labels" and found
>> >>> that the code gets stuck at the line "READ (*,*,iostat=iostat) pos",
>> >>> as shown below. I always get "begin to read pos" in the standard
>> >>> error, but I never reach "LABEL-1".
>> >>>
>> >>> WRITE(0,*) "begin to read pos"
>> >>> CALL m_flush(0)
>> >>> READ (*,*,iostat=iostat) pos
>> >>> WRITE(0,*) "LABEL-1"
>> >>> CALL m_flush(0)
>> >>>
>> >>> I attached the modified cp2k_shell.F and the standard output/error
>> >>> generated by the Slurm batch system to this post. You can see that
>> >>> the program gets stuck at the second optimization step, just after
>> >>> sending the positions.
>> >>>
>> >>> I would greatly appreciate any suggestions for fixing this confusing
>> >>> bug.
>> >>>
>> >>> Thanks in advance.
>> >>>
>> >>> Geng
>> >>>
>> >>> --
>> >>> You received this message because you are subscribed to the Google
>> >>
>> >>> Groups "cp2k" group.
>> >>> To unsubscribe from this group and stop receiving emails from it,
>> >> send
>> >>> an email to cp2k+... at googlegroups.com.
>> >>> To post to this group, send email to cp... at googlegroups.com.
>> >>> Visit this group at https://groups.google.com/group/cp2k [1].
>> >>> For more options, visit https://groups.google.com/d/optout [2].
>> >
>> > Links:
>> > ------
>> > [1] https://groups.google.com/group/cp2k
>> > [2] https://groups.google.com/d/optout
>>
>>