[CP2K:256] Re: accuracy/reproducability of regression tests

Teodoro Laino teodor... at gmail.com
Tue Sep 11 19:04:52 CEST 2007

hi Axel,
yep.. I will add a warning sentence to the webpage..

Anyway, I talked with a friend of mine at CERN and asked him how they
handle GRID computing and their own analysis systems, both at CERN and
at Fermilab, when they have to validate/port a code across different
platforms.

The answer was simply: "we avoid validating code on different
platforms by buying the same hardware and installing the same
operating system and libraries" ;-)..

Regarding the GRID, I have just learned that when an application is
submitted to the GRID it brings all the necessary libraries with it.
This avoids the hard work of validating codes when new nodes are added
to the GRID (personally I find this approach totally crazy ;-), but
anyway that's it..)...
The only work needed is the validation of the processor, but this is
done by the vendors (Intel/AMD/IBM/etc..), who guarantee that their
processors are under quality control.

He also told me that in the very few cases in which they have to port
codes to different platforms, they do it by hand, with people who have
a very good knowledge of both the platform and the code to be ported.
In our case we won't easily know whether an error of 1E-9% is a
miscompilation or a numerical issue associated with libraries,
different compilers, etc..
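
To make the "numerical issue" side of this concrete, here is a minimal sketch (plain Python, nothing to do with the cp2k sources) of how simply changing the order of additions moves the last digits of a double precision result. The data set and the helper names naive_sum and relative_difference are made up for illustration; a different compiler, optimization level or BLAS can reorder operations in a similar way.

```python
import math
import random

def naive_sum(values):
    """Left-to-right accumulation, as a plain loop would do it."""
    total = 0.0
    for v in values:
        total += v
    return total

def relative_difference(a, b):
    """Relative difference between two numbers (0.0 if both are zero)."""
    scale = max(abs(a), abs(b))
    return 0.0 if scale == 0.0 else abs(a - b) / scale

# Floating-point addition is not associative: grouping changes the result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print("grouping changes the result:", left != right)

# Hypothetical data: positive numbers spread over many orders of magnitude.
rng = random.Random(0)
values = [rng.uniform(0.0, 1.0) * 10.0 ** rng.randint(-6, 6)
          for _ in range(100_000)]

forward = naive_sum(values)             # one summation order
backward = naive_sum(reversed(values))  # same numbers, reversed order
reference = math.fsum(values)           # correctly rounded sum

print("forward  vs reference:", relative_difference(forward, reference))
print("backward vs reference:", relative_difference(backward, reference))
print("forward  vs backward :", relative_difference(forward, backward))
```

Both orderings are "correct" within floating-point arithmetic, yet they need not agree to the last digit, so a tiny relative difference alone cannot tell a miscompilation from ordinary rounding.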

I still believe that you can use the regtests to check whether you get
larger errors, ~1E-3 to 1E-8 (and then be sure of a compilation/library
problem), or even segmentation faults.
If not, there's a good chance that cp2k has been properly built,
but only real applications and analysis of the results can tell whether
you have a good or a bad executable. The rule is always the same and
should be valid for any code: keep your eyes open! (and I know
you do! ;-). )
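
A rough sketch of how this rule of thumb could be applied when comparing one's own output against a reference value. This is not the actual regtest script: the function names and the three labels are hypothetical, and the two thresholds are just my reading of the numbers in this thread (below 1E-10: take the leap of faith; 1E-8 and above: suspect the build; in between: look closer by hand).

```python
import math

def relative_error(reference, computed):
    """Relative error w.r.t. the reference (absolute error if reference is 0)."""
    if reference == 0.0:
        return abs(computed)
    return abs(computed - reference) / abs(reference)

def classify(reference, computed, noise_tol=1e-10, problem_tol=1e-8):
    """Sort a comparison into 'ok', 'inconclusive' or 'suspect'.

    noise_tol and problem_tol are assumptions taken from the discussion,
    not values defined by cp2k.
    """
    if math.isnan(computed) or math.isinf(computed):
        return "suspect"        # no usable number at all
    err = relative_error(reference, computed)
    if err <= noise_tol:
        return "ok"             # compatible with rounding differences
    if err < problem_tol:
        return "inconclusive"   # needs a human to look at it
    return "suspect"            # likely a compilation/library problem

# Made-up energies, for illustration only.
print(classify(-17.123456789012, -17.123456789013))  # tiny difference
print(classify(-17.123456789012, -17.123456800000))  # larger difference
print(classify(-17.123456789012, -17.2))             # clearly off
```

Even an "ok" here only means that nothing obviously wrong was found; it does not validate the executable.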


On 11 Sep 2007, at 18:29, Axel wrote:

> hi teo,
> thanks for the detailed answer.
> in this case i would suggest changing the text
> on the berlios page accordingly. right now it is
> giving the impression as if i need to reproduce
> the values from the downloadable regression test
> library to ensure my compilation is correct.
> cheers,
>    axel.
> On Sep 11, 1:30 am, Teodoro Laino <teodor... at gmail.com> wrote:
>> Hi Axel,
>> the purpose of the regtests is not to validate code across different
>> kinds of platforms.
>> They are used by the people developing the code and are always run
>> on the same machine,
>> to check that nothing has been broken by new changes or by
>> introducing new pieces of code.
>> Used in this way, even an error of 1.0E-10 is meaningful (and I've
>> seen them many times), since
>> any such modification should leave the numerics exactly the same.
>> At the moment there is no real validation test. You may make a sort
>> of "leap of faith" and assume
>> that cp2k has been compiled properly if the regtests are OK or fail
>> with a relative error of less than 1.E-10..
>> Again, this guarantees nothing, and for sure the best thing would be
>> to go for a small number of tests,
>> numerically stable (i.e. SCF convergence, etc..), aimed at computing
>> some properties, but even in that
>> case I'm not sure that numerical problems would not be an issue.
>> As far as I know, validating code across platforms is a tricky issue.
>> A similar problem also exists on the GRID platform.. I may try to
>> ask these guys
>> how they validate codes when new machines are added to the grid
>> infrastructure. I'm sure there's some
>> level of "human intervention" in deciding whether a code has been
>> compiled properly on a new architecture.
>> Teo
>> On 11 Sep 2007, at 04:12, Axel wrote:
>>> hi everybody!
>>> i'm currently in the process of trying to automate
>>> building cp2k on a number of platforms so that i
>>> have a way to supply people with up-to-date executables
>>> and a list of tested features.
>>> however it seems that quite a number of the test results
>>> could only be reproduced on the exact same machine
>>> with the exact same compiler/library parallel etc. settings.
>>> or more precisely asked: why do i get a 'WRONG' result with
>>> a relative error of less than 1.e-10 when you have only 14 digits
>>> absolute precision in a real*8 floating point number to begin with?
>>> especially, when the SCF convergence of an input is not set very
>>> tightly. just moving to a different platform, using a different
>>> compiler,
>>> a different optimization level, a different BLAS/LAPACK or running
>>> a serial instead of parallel executable can induce changes of that
>>> magnitude while still being accurate within the boundaries of
>>> floating
>>> point arithmetic, considering how many FLOPS are involved in
>>> computing the properties the regression tester is comparing.
>>> what about tests of properties, that simply cannot be computed
>>> to that high accuracy at all?
>>> do i have to make a 'leap of faith' and say that machine X is
>>> executing
>>> cp2k correctly when all regtests are flagged ok on berlios and then
>>> use that output as my internal standard for that machine?
>>> cheers,
>>>    axel.
