<div dir="ltr">Hi Jörg,<div><br></div><div>to me this error message</div><div><br></div><div>Trying to move ./WFN_restart.wfn.bak-1 to ./WFN_restart.wfn.bak-2. <br> rename returned status:           -1 <br></div><div><br></div><div>looks suspicious. I would expect the wavefunction restart files to carry a prefix indicating which replica they belong to. Maybe several MPI processes are trying to get a file lock on the same file?</div><div><br></div><div>Have you changed the names of any restart or output files in your input file?</div><div><br></div><div>Matt<br><br>On Monday, February 27, 2017 at 10:23:09 PM UTC, sassy wrote:<blockquote class="gmail_quote" style="margin: 0;margin-left: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">Dear all,
<br>
<br>I am trying to do a CI-NEB calculation, but after the first step the calculation 
<br>crashed with this error message:
<br>
<br> NEB| Building initial set of coordinates. END
<br>
<br> *****************************<wbr>******************************<wbr>********************
<br> BAND TYPE                     =                                          CI-NEB
<br> BAND TYPE OPTIMIZATION        =                                              SD
<br> STEP NUMBER                   =                                               0
<br> RMSD DISTANCE DEFINITION      =                                               T
<br> NUMBER OF NEB REPLICA         =                                               5
<br> DISTANCES REP =        9.750661        9.750661        9.750661        9.750661
<br> ENERGIES [au] =     -648.476382     -647.620195     -646.701277     -647.623017
<br>                     -648.424927
<br> BAND TOTAL ENERGY [au]        =                            -3238.84579812863058
<br> *****************************<wbr>******************************<wbr>********************
<br> Trying to move ./WFN_restart.wfn.bak-1 to ./WFN_restart.wfn.bak-2.
<br> rename returned status:           -1
<br> Problem moving file
<br>------------------------------<wbr>-------------------------
<br>Primary job  terminated normally, but 1 process returned
<br>a non-zero exit code.. Per user-direction, the job has been aborted.
<br>------------------------------<wbr>-------------------------
<br>------------------------------<wbr>------------------------------<wbr>--------------
<br>mpirun detected that one or more processes exited with non-zero status, thus causing
<br>the job to be terminated. The first process to do so was:
<br>
<br>  Process name: [[44988,1],192]
<br>  Exit code:    1
<br>
<br>
<br>The SGE error file contains this:
<br>
<br>cp2k-4.1-avx2.popt:3555 terminated with signal 6 at PC=2ad95d2c35f7 SP=7ffe9c0dbcf8.
<br>(I have omitted the backtrace)
<br>
<br>I am using 256 cores and this is the relevant part of my input file:
<br>
<br>@SET BAND_TYPE NEB
<br>&MOTION
<br>  &PRINT
<br>    &VELOCITIES OFF
<br>    &END
<br>  &END
<br>  &BAND
<br>    NPROC_REP 32 
<br>@IF ( ${BAND_TYPE} == NEB )
<br>    BAND_TYPE CI-NEB
<br>    K_SPRING 0.2
<br>    ROTATE_FRAMES T
<br>    &CI_NEB
<br>       NSTEPS_IT  5
<br>    &END
<br>@ENDIF
<br>@ENDIF
<br>    NUMBER_OF_REPLICA 5 
<br>    &CONVERGENCE_CONTROL
<br>      MAX_FORCE 0.001
<br>      RMS_FORCE 0.0005
<br>    &END
<br>    &OPTIMIZE_BAND
<br>      OPTIMIZE_END_POINTS F
<br>      OPT_TYPE DIIS
<br>      &DIIS
<br>       MAX_STEPS 200
<br>       N_DIIS 7
<br>       NO_LS
<br>       STEPSIZE 0.5
<br>       MAX_STEPSIZE 1.0
<br>      &END
<br>    &END
<br>    &REPLICA
<br>      COORD_FILE_NAME files/start-A.xyz 
<br>    &END
<br>    &REPLICA
<br>      COORD_FILE_NAME files/final-C.xyz  
<br>    &END
<br>    &PROGRAM_RUN_INFO 
<br>    &END
<br>    &CONVERGENCE_INFO
<br>    &END
<br>  &END BAND
<br>&END MOTION
<br>
<br>
<br>Could anybody point me in the right direction here? I have been trying to get these 
<br>calculations done for some time now and I am still stuck. I have checked the 
<br>cluster with a different input file which I know works, so I am fairly 
<br>confident it is not a cluster problem.
<br>Does anybody have any ideas?
<br>
<br>Please let me know if you need more information. 
<br>
<br>All the best from London
<br>
<br>Jörg
<br>
<br>
<br>-- 
<br>******************************<wbr>******************************<wbr>*
<br>Dr. Jörg Saßmannshausen, MRSC
<br>University College London
<br>Department of Chemistry
<br>20 Gordon Street
<br>London
<br>WC1H 0AJ 
<br>
<br>email: j.sas...@ucl.ac.uk
<br>web: <a href="http://sassy.formativ.net" target="_blank" rel="nofollow">http://sassy.formativ.net</a>
<br>
<br>Please avoid sending me Word or PowerPoint attachments.
<br>See <a href="http://www.gnu.org/philosophy/no-word-attachments.html" target="_blank" rel="nofollow">http://www.gnu.org/philosophy/no-word-attachments.html</a>
<br></blockquote></div></div>
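<div>To illustrate the suspicion raised in the reply above, here is a minimal Python sketch of how rotating backup files can fail when two replicas share one restart file name. It is not CP2K code: the helper names (<code>shift_backup</code>, <code>simulate</code>) and the <code>rep1-</code>/<code>rep2-</code> prefixes are made up for illustration, and the two "replicas" are run one after the other to make the collision reproducible. Each replica assumes its <code>.bak-1</code> file exists and tries to move it to <code>.bak-2</code>; with a shared name the second move finds the file already gone and reports -1, much like the "rename returned status: -1" line in the log.</div>

```python
import os
import tempfile


def backup_name(base, i):
    """Name of the i-th backup copy, e.g. WFN_restart.wfn.bak-1."""
    return f"{base}.bak-{i}"


def shift_backup(base, i):
    """Move base.bak-i to base.bak-(i+1); return 0 on success, -1 on failure."""
    try:
        os.rename(backup_name(base, i), backup_name(base, i + 1))
        return 0
    except OSError:
        return -1


def simulate(names):
    """Each entry in names is the restart file name used by one replica.

    Every distinct name gets one existing .bak-1 file; then each replica
    in turn tries to shift its .bak-1 to .bak-2.
    """
    with tempfile.TemporaryDirectory() as workdir:
        bases = [os.path.join(workdir, name) for name in names]
        for base in set(bases):
            open(backup_name(base, 1), "w").close()
        return [shift_backup(base, 1) for base in bases]


# Two replicas writing to the same name (hypothetical scenario).
shared = simulate(["WFN_restart.wfn", "WFN_restart.wfn"])
# Two replicas with distinct, made-up prefixes.
per_replica = simulate(["rep1-WFN_restart.wfn", "rep2-WFN_restart.wfn"])

print("shared name:     ", shared)
print("per-replica name:", per_replica)
```

<div>If the restart file name in the input is fixed to a single name for all replicas, checking whether it can be left at its default (or made replica-specific) would be a reasonable first test.</div>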