[CP2K:249] accuracy/reproducibility of regression tests
Teodoro Laino
teodor... at gmail.com
Tue Sep 11 05:30:25 UTC 2007
Hi Axel,
the purpose of the regtests is not to validate the code across
different kinds of platforms.
They are used by the people developing the code and are always run on
the same machine, to check that nothing has been broken by new changes
or by the introduction of new pieces of code.
Used in this way, even an error of 1.0E-10 is meaningful (and I've
seen such errors many times), since any modification should leave the
numerics exactly the same.
At the moment there is no real validation test. You may make a sort
of "leap of faith" and assume that cp2k has been compiled properly if
the regtests are OK or fail with a relative error of less than 1.E-10.
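
To make the "leap of faith" criterion concrete, here is a minimal
Python sketch of such a comparison. It is not the actual regtest
script: the function names, the three labels and the strict tolerance
of 1.0E-14 are assumptions made for illustration; only the 1.E-10
threshold comes from the discussion above.

```python
def relative_error(reference: float, value: float) -> float:
    """Relative deviation of value from reference.

    Falls back to the absolute deviation when the reference is zero.
    """
    if reference == 0.0:
        return abs(value)
    return abs(value - reference) / abs(reference)


def classify(reference: float, value: float,
             strict_tol: float = 1.0e-14,   # assumed "same machine" tolerance
             loose_tol: float = 1.0e-10) -> str:
    """Label a recomputed value against a stored reference value.

    OK        : numerics unchanged (what a developer expects on one machine)
    PLAUSIBLE : differs, but by less than loose_tol (the "leap of faith")
    WRONG     : differs by more than loose_tol
    """
    err = relative_error(reference, value)
    if err <= strict_tol:
        return "OK"
    if err <= loose_tol:
        return "PLAUSIBLE"
    return "WRONG"


# Made-up numbers, for illustration only.
print(classify(1.0, 1.0))           # identical value
print(classify(1.0, 1.0 + 1e-12))   # tiny platform-level difference
print(classify(1.0, 1.0 + 1e-8))    # difference too large to ignore
```

A value labelled PLAUSIBLE here would still need a human to decide
whether the build is really correct.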
Again, this guarantees nothing, and surely the best thing would be to
go for a small number of numerically stable tests (i.e. SCF
convergence, etc.), aimed at computing some properties, but even in
that case I'm not sure that numerical problems would not be an issue.
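
A small Python sketch of why such numerical differences are hard to
avoid: floating point addition is not associative, so anything that
changes the order of operations can change the last digits of a
result. Compilers, optimization levels, BLAS/LAPACK implementations
and parallel reductions can all reorder operations in this way. The
helper name naive_sum is only an illustration.

```python
def naive_sum(values):
    """Add the values one by one, left to right, in double precision."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [0.1, 0.2, 0.3]

forward = naive_sum(values)             # ((0.0 + 0.1) + 0.2) + 0.3
backward = naive_sum(reversed(values))  # ((0.0 + 0.3) + 0.2) + 0.1

# Same numbers, different order, different last bit.
print(forward == backward)
print(abs(forward - backward) / abs(backward))
```

With only three terms the relative difference is already of the order
of the machine precision; a real calculation performs vastly more
operations, so such differences can accumulate.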
As far as I know, validating code across platforms is a tricky issue.
A similar problem exists on the GRID platform. I may try to ask those
guys how they validate codes when new machines are added to the grid
infrastructure. I'm sure there's some level of "human intervention" to
decide whether a code has been compiled properly on a new
architecture.
Teo
On 11 Sep 2007, at 04:12, Axel wrote:
>
> hi everybody!
>
> i'm currently in the process of trying to automate
> building cp2k on a number of platforms so that i
> have a way to supply people with up-to-date executables
> and a list of tested features.
>
> however it seems that quite a number of the test results
> can only be reproduced on the exact same machine
> with the exact same compiler, library, parallel etc. settings.
>
> or, to ask more precisely: why do i get a 'WRONG' result with
> a relative error of less than 1.e-10 when you have only 14 digits
> of precision in a real*8 floating point number to begin with?
> especially when the SCF convergence of an input is not set very
> tightly. just moving to a different platform, using a different
> compiler, a different optimization level, a different BLAS/LAPACK
> or running a serial instead of a parallel executable can induce
> changes of that magnitude while still being accurate within the
> boundaries of floating point arithmetic, considering how many
> FLOPS are involved in computing the properties the regression
> tester is comparing.
>
> what about tests of properties that simply cannot be computed
> to that high an accuracy at all?
>
> do i have to make a 'leap of faith' and say that machine X is
> executing cp2k correctly when all regtests are flagged ok on
> berlios and then use that output as my internal standard for
> that machine?
>
> cheers,
> axel.
>