[CP2K:8750] Re: CI-NEB calculation: crashes
Jörg Saßmannshausen
j.sassma... at ucl.ac.uk
Tue Feb 28 12:09:17 UTC 2017
Hi Matt,
thanks for the feedback.
I think that error message is a bit of a red herring. I have been running
ordinary geometry and Hessian calculations for some time now, and my
wavefunction file is always called WFN_restart.wfn in the input file.
Originally I suspected a problem with the cluster, but since I could reproduce
the crash with that calculation and not with a different one, I do not think
the cluster is at fault.
It is running now. All I did was remove the duplicated line
@ENDIF
in my input file. To be honest I don't really know why I had it twice, and it
does not make much sense to me that it caused no problems for SM-type band
calculations but does for a CI-NEB calculation. I would have thought that if
there is a problem with the input file, the program would crash right at the
beginning and not after the first step.
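In case it helps anyone who hits the same thing, here is a minimal sketch
(plain Python, not part of CP2K) of how one could scan an input file for
unmatched @IF/@ENDIF lines before submitting. The function name
check_if_balance is made up for illustration. It assumes each directive is
the first token on its line and is matched case-insensitively; it does not
evaluate the conditions or follow @INCLUDE files.

```python
def check_if_balance(text):
    """Return a list of (line_number, message) for unmatched @IF/@ENDIF.

    Assumption: a directive is the first whitespace-separated token on
    its line. Conditions are not evaluated and @INCLUDE is not followed.
    """
    problems = []
    open_ifs = []  # line numbers of @IF directives that are still open
    for lineno, raw in enumerate(text.splitlines(), start=1):
        tokens = raw.strip().split()
        if not tokens:
            continue  # skip blank lines
        word = tokens[0].upper()
        if word == "@IF":
            open_ifs.append(lineno)
        elif word == "@ENDIF":
            if open_ifs:
                open_ifs.pop()  # closes the most recent open @IF
            else:
                problems.append((lineno, "@ENDIF without matching @IF"))
    # Anything still open at the end of the file was never closed.
    for lineno in open_ifs:
        problems.append((lineno, "@IF without matching @ENDIF"))
    return problems


# Cut-down fragment with the same duplicated @ENDIF as in my input.
sample = """\
@SET BAND_TYPE NEB
&BAND
  @IF ( ${BAND_TYPE} == NEB )
    BAND_TYPE CI-NEB
  @ENDIF
  @ENDIF
&END BAND
"""

print(check_if_balance(sample))
```

For a real input file one would read the file contents into a string and
pass them in; an empty list means every @IF has exactly one @ENDIF.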
So for now I think we can close this; problem sorted.
Thanks for the feedback though.
All the best from a sunny London
Jörg
On Tuesday 28 Feb 2017 03:17:50 Matt W wrote:
> Hi Jörg,
>
> to me this error message
>
> Trying to move ./WFN_restart.wfn.bak-1 to ./WFN_restart.wfn.bak-2.
> rename returned status: -1
>
> looks suspicious. I would expect the wavefunction files to have some
> prefixes indicating which replica etc. Maybe several MPI processes are
> trying to get a file lock on the same file?
>
> Have you changed the names of any restart files / output file names etc. in
> your input file?
>
> Matt
>
> On Monday, February 27, 2017 at 10:23:09 PM UTC, sassy wrote:
> > Dear all,
> >
> > I am trying to do a CI-NEB calculation but after the first step the
> > calculation crashed with this error message:
> >
> > NEB| Building initial set of coordinates. END
> >
> > *******************************************************************************
> > BAND TYPE                     =                                          CI-NEB
> > BAND TYPE OPTIMIZATION        =                                              SD
> > STEP NUMBER                   =                                               0
> > RMSD DISTANCE DEFINITION      =                                               T
> > NUMBER OF NEB REPLICA         =                                               5
> > DISTANCES REP =        9.750661        9.750661        9.750661        9.750661
> > ENERGIES [au] =     -648.476382     -647.620195     -646.701277     -647.623017
> >                     -648.424927
> > BAND TOTAL ENERGY [au]        =                            -3238.84579812863058
> > *******************************************************************************
> >
> > Trying to move ./WFN_restart.wfn.bak-1 to ./WFN_restart.wfn.bak-2.
> > rename returned status: -1
> > Problem moving file
> >
> > -------------------------------------------------------
> > Primary job terminated normally, but 1 process returned
> > a non-zero exit code.. Per user-direction, the job has been aborted.
> > -------------------------------------------------------
> > --------------------------------------------------------------------------
> > mpirun detected that one or more processes exited with non-zero status,
> > thus causing the job to be terminated. The first process to do so was:
> >
> > Process name: [[44988,1],192]
> > Exit code: 1
> >
> > The SGE error file contains this:
> >
> > cp2k-4.1-avx2.popt:3555 terminated with signal 6 at PC=2ad95d2c35f7 SP=7ffe9c0dbcf8.
> > (I have omitted the backtrace)
> >
> > I am using 256 cores and this is the relevant part of my input file:
> >
> > @SET BAND_TYPE NEB
> > &MOTION
> >   &PRINT
> >     &VELOCITIES OFF
> >     &END
> >   &END
> >   &BAND
> >     NPROC_REP 32
> >     @IF ( ${BAND_TYPE} == NEB )
> >       BAND_TYPE CI-NEB
> >       K_SPRING 0.2
> >       ROTATE_FRAMES T
> >       &CI_NEB
> >         NSTEPS_IT 5
> >       &END
> >     @ENDIF
> >     @ENDIF
> >     NUMBER_OF_REPLICA 5
> >     &CONVERGENCE_CONTROL
> >       MAX_FORCE 0.001
> >       RMS_FORCE 0.0005
> >     &END
> >     &OPTIMIZE_BAND
> >       OPTIMIZE_END_POINTS F
> >       OPT_TYPE DIIS
> >       &DIIS
> >         MAX_STEPS 200
> >         N_DIIS 7
> >         NO_LS
> >         STEPSIZE 0.5
> >         MAX_STEPSIZE 1.0
> >       &END
> >     &END
> >     &REPLICA
> >       COORD_FILE_NAME files/start-A.xyz
> >     &END
> >     &REPLICA
> >       COORD_FILE_NAME files/final-C.xyz
> >     &END
> >     &PROGRAM_RUN_INFO
> >     &END
> >     &CONVERGENCE_INFO
> >     &END
> >   &END BAND
> > &END MOTION
> >
> >
> > Could anybody point me in the right direction here? I have been trying to
> > get these calculations done for some time now and I am still stuck. I have
> > checked the cluster with a different input file which I know works, so I
> > have some confidence it is not a cluster problem.
> > Does anybody have any ideas?
> >
> > Please let me know if you need more information.
> >
> > All the best from London
> >
> > Jörg
> >
> >
> >
> > email: j.sas... at ucl.ac.uk
> > web: http://sassy.formativ.net
> >
> > Please avoid sending me Word or PowerPoint attachments.
> > See http://www.gnu.org/philosophy/no-word-attachments.html
--
*************************************************************
Dr. Jörg Saßmannshausen, MRSC
University College London
Department of Chemistry
20 Gordon Street
London
WC1H 0AJ
email: j.sassma... at ucl.ac.uk
web: http://sassy.formativ.net
Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html