CI-NEB calculation: crashes

Matt W mattwa... at gmail.com
Tue Feb 28 11:17:50 UTC 2017


Hi Jörg,

to me this error message

Trying to move ./WFN_restart.wfn.bak-1 to ./WFN_restart.wfn.bak-2. 
 rename returned status:           -1 

looks suspicious. I would expect the wavefunction files to have a 
prefix indicating which replica they belong to. Maybe several MPI processes 
are trying to get a file lock on the same file?
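
To illustrate what I mean, here is a minimal Python sketch (not CP2K code) of 
a backup rotation like the one in the log. It runs the "replicas" one after 
another rather than as real MPI processes, and the file names and the rep1-, 
rep2-, ... prefixes are made up for illustration. If every replica uses the 
same unprefixed name, the first rotation moves the file away and all later 
ones fail. With one name per replica, none fail. A failing rename can of 
course have other causes too (permissions, file system problems), so treat 
this as one possible explanation only.

```python
import os
import tempfile


def rotate_backup(base):
    """Move base.bak-1 to base.bak-2, as in the failing log line."""
    os.rename(base + ".bak-1", base + ".bak-2")


def run(names):
    """Create one .bak-1 file per distinct name, then let every 'replica'
    rotate its backup in turn. Returns how many rotations failed."""
    failures = 0
    with tempfile.TemporaryDirectory() as tmp:
        bases = [os.path.join(tmp, n) for n in names]
        for base in set(bases):
            with open(base + ".bak-1", "w") as fh:
                fh.write("wfn")
        for base in bases:
            try:
                rotate_backup(base)
            except FileNotFoundError:
                # Another replica already moved the shared file away.
                failures += 1
    return failures


# Five replicas, all using the same unprefixed name (hypothetical).
shared = run(["WFN_restart.wfn"] * 5)
# Five replicas, each with its own hypothetical prefix.
prefixed = run(["rep%d-WFN_restart.wfn" % i for i in range(1, 6)])
print(shared, prefixed)
```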

Have you changed the names of any restart files, output files etc. in 
your input file?
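
A quick way to check is to list every line of the input that sets a file or 
project name explicitly. The Python sketch below does that. The keyword list 
is my assumption and certainly not exhaustive, and the sample text is a 
made-up fragment, not your input and not a complete CP2K input.

```python
import re

# Keywords that can pin an output or restart file to a fixed name.
# This list is an assumption, not an exhaustive set of CP2K keywords.
KEYWORDS = ("PROJECT_NAME", "PROJECT", "WFN_RESTART_FILE_NAME", "FILENAME")
PATTERN = re.compile(r"^\s*(%s)\s+(\S+)" % "|".join(KEYWORDS), re.IGNORECASE)


def find_file_name_settings(text):
    """Return (line_number, keyword, value) for each matching line."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = PATTERN.match(line)
        if match:
            hits.append((number, match.group(1).upper(), match.group(2)))
    return hits


# Hypothetical fragment, for illustration only.
sample = """\
&GLOBAL
  PROJECT neb_test
&END GLOBAL
&SCF
  &PRINT
    &RESTART
      FILENAME ./WFN_restart
    &END
  &END
&END SCF
"""
for hit in find_file_name_settings(sample):
    print(hit)
```

Any hit with a fixed path is worth a second look, since every replica would 
then be writing to the same file.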

Matt

On Monday, February 27, 2017 at 10:23:09 PM UTC, sassy wrote:
>
> Dear all, 
>
> I am trying to do a CI-NEB calculation, but after the first step the 
> calculation crashed with this error message: 
>
>  NEB| Building initial set of coordinates. END 
>
>  ******************************************************************************* 
>
>  BAND TYPE                     =                                         CI-NEB
>  BAND TYPE OPTIMIZATION        =                                             SD
>  STEP NUMBER                   =                                              0
>  RMSD DISTANCE DEFINITION      =                                              T
>  NUMBER OF NEB REPLICA         =                                              5
>  DISTANCES REP =        9.750661        9.750661        9.750661        9.750661
>  ENERGIES [au] =     -648.476382     -647.620195     -646.701277     -647.623017
>                      -648.424927
>  BAND TOTAL ENERGY [au]        =                           -3238.84579812863058
>  ******************************************************************************* 
>
>  Trying to move ./WFN_restart.wfn.bak-1 to ./WFN_restart.wfn.bak-2. 
>  rename returned status:           -1 
>  Problem moving file 
> ------------------------------------------------------- 
> Primary job  terminated normally, but 1 process returned 
> a non-zero exit code.. Per user-direction, the job has been aborted. 
> ------------------------------------------------------- 
> -------------------------------------------------------------------------- 
> mpirun detected that one or more processes exited with non-zero status, thus causing
> the job to be terminated. The first process to do so was:
>
>   Process name: [[44988,1],192] 
>   Exit code:    1 
>
>
> The SGE error file contains this: 
>
> cp2k-4.1-avx2.popt:3555 terminated with signal 6 at PC=2ad95d2c35f7 SP=7ffe9c0dbcf8.
> (I have omitted the backtrace) 
>
> I am using 256 cores and this is the relevant part of my input file: 
>
> @SET BAND_TYPE NEB 
> &MOTION 
>   &PRINT 
>     &VELOCITIES OFF 
>     &END 
>   &END 
>   &BAND 
>     NPROC_REP 32 
> @IF ( ${BAND_TYPE} == NEB ) 
>     BAND_TYPE CI-NEB 
>     K_SPRING 0.2 
>     ROTATE_FRAMES T 
>     &CI_NEB 
>        NSTEPS_IT  5 
>     &END 
> @ENDIF 
> @ENDIF 
>     NUMBER_OF_REPLICA 5 
>     &CONVERGENCE_CONTROL 
>       MAX_FORCE 0.001 
>       RMS_FORCE 0.0005 
>     &END 
>     &OPTIMIZE_BAND 
>       OPTIMIZE_END_POINTS F 
>       OPT_TYPE DIIS 
>       &DIIS 
>        MAX_STEPS 200 
>        N_DIIS 7 
>        NO_LS 
>        STEPSIZE 0.5 
>        MAX_STEPSIZE 1.0 
>       &END 
>     &END 
>     &REPLICA 
>       COORD_FILE_NAME files/start-A.xyz 
>     &END 
>     &REPLICA 
>       COORD_FILE_NAME files/final-C.xyz   
>     &END 
>     &PROGRAM_RUN_INFO 
>     &END 
>     &CONVERGENCE_INFO 
>     &END 
>   &END BAND 
> &END MOTION 
>
>
> Could anybody point me in the right direction here? I have been trying to get 
> these calculations done for some time now and I am still stuck. I have checked 
> the cluster with a different input file which I know works, so I am fairly 
> confident it is not a cluster problem. 
> Anybody any ideas? 
>
> Please let me know if you need more information. 
>
> All the best from London 
>
> Jörg 
>
>
> -- 
> ************************************************************* 
> Dr. Jörg Saßmannshausen, MRSC 
> University College London 
> Department of Chemistry 
> 20 Gordon Street 
> London 
> WC1H 0AJ 
>
> email: j.sas... at ucl.ac.uk 
> web: http://sassy.formativ.net 
>
> Please avoid sending me Word or PowerPoint attachments. 
> See http://www.gnu.org/philosophy/no-word-attachments.html 
>