accuracy/reproducibility of regression tests

Axel akoh... at gmail.com
Tue Sep 11 16:29:16 UTC 2007


hi teo,

thanks for the detailed answer.

in this case i would suggest changing the text
on the berlios page accordingly. right now it
gives the impression that i need to reproduce
the values from the downloadable regression test
library to ensure my compilation is correct.

cheers,
   axel.

On Sep 11, 1:30 am, Teodoro Laino <teodor... at gmail.com> wrote:
> Hi Axel,
>
> the meaning of the regtests is not to validate the code across
> different kinds of platforms.
> They are used by the people developing the code and are always run
> on the same machine,
> to check that nothing has been broken by new changes or by newly
> introduced code.
> Used in this way, even an error threshold of 1.0E-10 makes sense
> (and I've seen such errors many times), since
> any modification should leave the numerics exactly the same.
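
[A concrete picture of the check Teo describes: it boils down to a
relative error against a stored reference value. Below is a minimal
Python sketch using the 1.0E-10 tolerance mentioned above; the
function names and example energies are purely illustrative.]

    # Regtest-style check: a freshly computed value is compared
    # against a reference stored from an earlier run on the SAME
    # machine.  The 1.0E-10 tolerance is the one from this thread;
    # all names and numbers here are illustrative.
    TOLERANCE = 1.0e-10

    def relative_error(new, ref):
        """Relative deviation of a new result from the reference."""
        if ref == 0.0:
            return abs(new)
        return abs(new - ref) / abs(ref)

    def check(new, ref):
        err = relative_error(new, ref)
        return "OK" if err <= TOLERANCE else "WRONG (%.2e)" % err

    # e.g. total energies from a reference run and a re-run:
    print(check(-17.14385192565490, -17.14385192565491))  # OK
    print(check(-17.14385190000000, -17.14385192565491))  # WRONG (1.50e-09)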
>
> At the moment there's no real validation test. You may make a sort
> of "leap of faith" and assume
> that cp2k has been compiled properly if the regtests are OK or fail
> with a relative error of less than 1.E-10.
> Again, this guarantees nothing; the best thing would surely be a
> small number of numerically stable tests (e.g. with tightly
> converged SCF) aimed at computing some properties, but even in that
> case I'm not sure that numerical problems would not be an issue.
>
> As far as I know, validating a code across platforms is a tricky
> issue.
> A similar problem also exists on the GRID platform. I may try to
> ask those guys
> how they validate codes when new machines are added to the grid
> infrastructure. I'm sure there's some
> level of "human intervention" to decide whether a code has been
> compiled properly on a new architecture.
>
> Teo
>
> On 11 Sep 2007, at 04:12, Axel wrote:
>
>
>
> > hi everybody!
>
> > i'm currently in the process of trying to automate
> > building cp2k on a number of platforms so that i
> > have a way to supply people with up-to-date executables
> > and a list of tested features.
>
> > however, it seems that quite a number of the test results
> > can only be reproduced on the exact same machine
> > with the exact same compiler, library, parallel, etc. settings.
>
> > or, asked more precisely: why do i get a 'WRONG' result with
> > a relative error of less than 1.e-10 when a real*8 floating
> > point number holds only about 15 significant digits to begin
> > with? especially when the SCF convergence of an input is not set
> > very tightly. just moving to a different platform, using a
> > different compiler, a different optimization level, a different
> > BLAS/LAPACK, or running a serial instead of a parallel executable
> > can induce changes of that magnitude while still being accurate
> > within the boundaries of floating point arithmetic, considering
> > how many FLOPs are involved in computing the properties the
> > regression tester is comparing.
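
[The platform sensitivity described above can be reproduced with
nothing more than a change of summation order, which is exactly what
a different BLAS, optimization level, or parallel decomposition
amounts to. A small illustrative Python sketch:]

    # Floating-point addition is not associative: summing the same
    # numbers in a different order changes the last bits of the
    # result, and over many operations the deviations accumulate.
    import random

    random.seed(42)
    values = [random.uniform(-1.0, 1.0) for _ in range(1000000)]

    forward = sum(values)             # one summation order
    backward = sum(reversed(values))  # same data, opposite order

    rel_diff = abs(forward - backward) / abs(forward)
    print("forward  =", repr(forward))
    print("backward =", repr(backward))
    print("relative difference = %.2e" % rel_diff)  # tiny but (typically) nonzero

[Both sums are equally "correct"; neither order is privileged, which
is why a bit-for-bit match across platforms is not a reasonable
expectation.]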
>
> > what about tests of properties that simply cannot be computed
> > to such high accuracy at all?
>
> > do i have to make a 'leap of faith' and say that machine X is
> > executing cp2k correctly when all regtests are flagged ok on
> > berlios, and then use that output as my internal standard for
> > that machine?
>
> > cheers,
> >    axel.
