[CP2K:2213] Re: determinism of CP2K runs

Laino Teodoro teodor... at gmail.com
Fri Aug 7 20:46:40 UTC 2009


Hi Noam,

I perfectly understand your point (being deterministic is a feature
that is helpful in many situations).
It could be that something is left uninitialized in the parallel code.
If the machine and libraries have been tested and give the same numbers,
up to machine precision, with other codes independently of the computing
cores, then very probably it is a bug in the parallel code (or in some
library routine, most likely ScaLAPACK, that those other codes do not use).

Anyway, as you said, let's proceed step by step: first try to see
whether you manage to reproduce the same problem in serial.

A presto,
Teo

On 7 Aug 2009, at 22:36, Noam Bernstein wrote:

>
> Hi Teo - I thought of all the things you mentioned, but I doubt that
> they are the cause.  I'll explain why briefly now (and also why
> I need it to be deterministic, unfortunately :), and I'll have a more
> complete explanation and hopefully a better (smaller, maybe usable
> in serial) test case.
>
> First the reason I need it to be deterministic:  I'm running MD, and
> it's chaotic (in the technical sense), so unless I have deterministic
> runs I can't reproduce a trajectory (for example with different output
> options), even from the same input file.
>
> As to why I doubt that it's differences in machines, I've run many
> electronic structure codes on this cluster, and I've never seen
> non-determinism (for a fixed _number_ of processes) except for uses
> of random variables or uninitialized variables (i.e. inadvertently
> random variables).  To me this means that the machines are
> reasonably reliable, and that the MPI implementation is deterministic
> (although since floating-point math isn't associative, changing
> the _number_ of processes does change the answer at the level
> of roundoff).  I also don't think it's a memory error/cosmic ray
> for the same reason - CP2K is never the same twice (for my test
> input), while other (MPI+ScaLAPACK electronic structure) codes
> always are.
>
> Anyway, I'll work on reproducing the issue either in serial or at
> least in parallel on the same set of nodes, so feel free to ignore
> me until then.
>
>      thanks,
>      Noam
>
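
To see why a chaotic system makes even roundoff-level non-determinism
fatal for reproducing a trajectory: a perturbation of one part in 10^15
is amplified exponentially, so two runs that differ only in the last bit
soon follow completely different trajectories. A minimal sketch in plain
C (the logistic map standing in for MD; an illustration only, not CP2K
code):

  #include <stdio.h>

  /* Two copies of a chaotic map (logistic map, r = 4) started from
   * initial conditions that differ only at the level of roundoff.
   * The separation roughly doubles each step, so after a few dozen
   * iterations the two "trajectories" are completely decorrelated. */
  int main(void)
  {
      double x = 0.3;
      double y = 0.3 + 1e-15;   /* roundoff-sized perturbation */

      for (int n = 0; n <= 60; n += 10) {
          printf("step %2d  |x - y| = %e\n", n, x > y ? x - y : y - x);
          for (int k = 0; k < 10; ++k) {
              x = 4.0 * x * (1.0 - x);
              y = 4.0 * y * (1.0 - y);
          }
      }
      return 0;
  }

After roughly 50 iterations the difference has grown from 1e-15 to order
one, which is why a trajectory cannot be rerun (e.g. with different
output options) unless the code is bitwise deterministic.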
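Similarly, the remark that floating-point addition is not associative
(so that changing the number of MPI ranks changes the reduction order
and hence the result in the last bits) can be seen with a few lines of
plain C; the numbers are made up for illustration and nothing here comes
from CP2K itself:

  #include <stdio.h>

  int main(void)
  {
      /* Associativity already fails for three numbers ... */
      double a = 1.0e16, b = -1.0e16, c = 1.0;
      printf("(a + b) + c = %.1f\n", (a + b) + c);   /* prints 1.0 */
      printf("a + (b + c) = %.1f\n", a + (b + c));   /* prints 0.0 */

      /* ... and summing the same data in a different order (as a
       * reduction over a different number of ranks would) gives
       * answers that differ in the last bits. */
      double x[4] = { 1.0e16, 1.0, -1.0e16, 1.0 };
      double fwd = 0.0, rev = 0.0;
      for (int i = 0; i < 4; ++i)  fwd += x[i];
      for (int i = 3; i >= 0; --i) rev += x[i];
      printf("forward sum = %.1f, reverse sum = %.1f\n", fwd, rev);
      return 0;
  }

This is the level of discrepancy one expects when the number of
processes changes; run-to-run differences on a fixed number of
processes, as Noam describes, point to something else.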



