CI-NEB calculation: crashes
Matt W
mattwa... at gmail.com
Tue Feb 28 11:17:50 UTC 2017
Hi Jörg,
to me this error message
Trying to move ./WFN_restart.wfn.bak-1 to ./WFN_restart.wfn.bak-2.
rename returned status: -1
looks suspicious. I would expect the wavefunction files to have some
prefixes indicating which replica etc. Maybe several MPI processes are
trying to get a file lock on the same file?
Have you changed the names of any restart files / output file names etc. in
your input file?
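
To illustrate the kind of collision I suspect, here is a minimal Python sketch. It is not CP2K code and is not taken from your run: it only mimics several replicas rotating the backup of a restart file one after another. The helper names (rotate_backup, simulate) and the "BANDnn-" prefixes are hypothetical, chosen for illustration only. When every replica uses the same file name, the first rename succeeds and the later ones fail because the source file has already been moved; with a distinct name per replica they all succeed.

```python
import os
import tempfile


def rotate_backup(directory, base):
    """Move <base>.bak-1 to <base>.bak-2; return 0 on success, -1 on failure."""
    src = os.path.join(directory, base + ".bak-1")
    dst = os.path.join(directory, base + ".bak-2")
    try:
        os.rename(src, dst)
        return 0
    except OSError:
        # For example, the source was already moved by another replica.
        return -1


def simulate(names):
    """Each replica rotates the backup for its own file name, in turn."""
    with tempfile.TemporaryDirectory() as d:
        # Create one .bak-1 file per distinct name.
        for base in set(names):
            open(os.path.join(d, base + ".bak-1"), "w").close()
        return [rotate_backup(d, base) for base in names]


# Three replicas sharing one file name (the suspected situation).
shared = simulate(["WFN_restart.wfn"] * 3)
# Three replicas with hypothetical per-replica prefixes.
distinct = simulate([f"BAND{i:02d}-WFN_restart.wfn" for i in range(1, 4)])

print(shared)    # [0, -1, -1]
print(distinct)  # [0, 0, 0]
```

If something like this is what happens in your job, giving each replica its own restart file name should make the rename error go away.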
Matt
On Monday, February 27, 2017 at 10:23:09 PM UTC, sassy wrote:
>
> Dear all,
>
> I am trying to do a CI-NEB calculation, but after the first step the
> calculation crashed with this error message:
>
> NEB| Building initial set of coordinates. END
>
> *******************************************************************************
>
> BAND TYPE                     = CI-NEB
> BAND TYPE OPTIMIZATION        = SD
> STEP NUMBER                   = 0
> RMSD DISTANCE DEFINITION      = T
> NUMBER OF NEB REPLICA         = 5
> DISTANCES REP                 = 9.750661 9.750661 9.750661 9.750661
> ENERGIES [au]                 = -648.476382 -647.620195 -646.701277
>                                 -647.623017 -648.424927
> BAND TOTAL ENERGY [au]        = -3238.84579812863058
> *******************************************************************************
>
> Trying to move ./WFN_restart.wfn.bak-1 to ./WFN_restart.wfn.bak-2.
> rename returned status: -1
> Problem moving file
> -------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code.. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun detected that one or more processes exited with non-zero status,
> thus causing the job to be terminated. The first process to do so was:
>
> Process name: [[44988,1],192]
> Exit code: 1
>
>
> The SGE error file contains this:
>
> cp2k-4.1-avx2.popt:3555 terminated with signal 6 at PC=2ad95d2c35f7 SP=7ffe9c0dbcf8.
> (I have omitted the backtrace)
>
> I am using 256 cores and this is the relevant part of my input file:
>
> @SET BAND_TYPE NEB
> &MOTION
> &PRINT
> &VELOCITIES OFF
> &END
> &END
> &BAND
> NPROC_REP 32
> @IF ( ${BAND_TYPE} == NEB )
> BAND_TYPE CI-NEB
> K_SPRING 0.2
> ROTATE_FRAMES T
> &CI_NEB
> NSTEPS_IT 5
> &END
> @ENDIF
> @ENDIF
> NUMBER_OF_REPLICA 5
> &CONVERGENCE_CONTROL
> MAX_FORCE 0.001
> RMS_FORCE 0.0005
> &END
> &OPTIMIZE_BAND
> OPTIMIZE_END_POINTS F
> OPT_TYPE DIIS
> &DIIS
> MAX_STEPS 200
> N_DIIS 7
> NO_LS
> STEPSIZE 0.5
> MAX_STEPSIZE 1.0
> &END
> &END
> &REPLICA
> COORD_FILE_NAME files/start-A.xyz
> &END
> &REPLICA
> COORD_FILE_NAME files/final-C.xyz
> &END
> &PROGRAM_RUN_INFO
> &END
> &CONVERGENCE_INFO
> &END
> &END BAND
> &END MOTION
>
>
> Could anybody point me in the right direction here? I have been trying to
> get these calculations done for some time now and I am still stuck. I have
> checked the cluster with a different input file which I know works, so I
> have some confidence that it is not a cluster problem.
> Does anybody have any ideas?
>
> Please let me know if you need more information.
>
> All the best from London
>
> Jörg
>
>
> --
> *************************************************************
> Dr. Jörg Saßmannshausen, MRSC
> University College London
> Department of Chemistry
> 20 Gordon Street
> London
> WC1H 0AJ
>
> email: j.sas... at ucl.ac.uk
> web: http://sassy.formativ.net
>
> Please avoid sending me Word or PowerPoint attachments.
> See http://www.gnu.org/philosophy/no-word-attachments.html
>