[CP2K:258] Re: accuracy/reproducability of regression tests

Shawn T. Brown shawn... at gmail.com
Tue Sep 11 18:19:33 UTC 2007
Previous message (by thread): [CP2K:256] Re: accuracy/reproducability of regression tests
Next message (by thread): LJ to semi-empirical
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
This was a problem that we had to deal with quite a lot at Q-Chem, since we
had to validate across various architectures.  The analytic QM techniques
could be made to agree to almost machine precision between different Unix
platforms, but the numerical techniques in our DFT algorithms did not
exhibit this behavior.  In fact, results from the standard algorithms
differed as early as the 6th or 7th decimal place.  This was a consequence of
the inherent precision of the numerical grids employed; "jacking" the grids
up to something really high would converge the differences down, but take
forever.  We learned to live with it, and the validation scripts that we used
took arguments describing the precision we needed.
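
As an illustration of a validation script that takes the required precision as an argument, here is a minimal sketch in Python. It is not the Q-Chem script; the names `agrees` and `main` and the `--digits` option are assumptions made up for this example.

```python
import argparse
import math


def agrees(reference: float, computed: float, digits: int) -> bool:
    """True if the values match to a relative tolerance of 10**(-digits).

    Hypothetical helper, not taken from any real validation suite.
    """
    rel_tol = 10.0 ** (-digits)
    # abs_tol=0.0 keeps the check purely relative.
    return math.isclose(reference, computed, rel_tol=rel_tol, abs_tol=0.0)


def main(argv=None) -> int:
    """Compare two numbers given on the command line; return 0 on agreement."""
    parser = argparse.ArgumentParser(description="Tolerance-based result check")
    parser.add_argument("reference", type=float, help="trusted reference value")
    parser.add_argument("computed", type=float, help="value from the build under test")
    parser.add_argument("--digits", type=int, default=10,
                        help="required agreement as a power of ten (default: 10)")
    args = parser.parse_args(argv)
    ok = agrees(args.reference, args.computed, args.digits)
    print("OK" if ok else "WRONG")
    return 0 if ok else 1
```

With something like this, a grid-based DFT energy could be checked with `--digits 6` while an analytic result is held to `--digits 10`, so one script serves both kinds of test.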

Just thought I would add the experience to the mix.

Cheers,
Shawn

Axel,
> yep.. I will add a warning sentence in the webpage..
>
> Anyway I talked with a friend of mine at CERN and asked him how they
> handle GRID computing and their own analysis systems,
> both at CERN and FERMILAB, when they have to validate/port a code
> across different platforms.
>
> The answer was simply: "we avoid validating code on different
> platforms by buying the same hardware and installing the same operating
> system and libraries" ;-)..
>
> Regarding the GRID, I have now discovered that when an application is
> submitted to the GRID it brings along all the necessary libraries.
> This avoids the hard work of validating codes when new nodes are added
> to the GRID (personally I find this way totally crazy ;-), but anyway
> that's it..)...
> The only work needed is the validation of the processor, but this is
> done by the vendors (Intel/AMD/IBM/etc..), who guarantee that their
> processors are under quality control.
>
> He also told me that in the very few cases in which they have to
> port codes to different platforms, they do it by hand,
> with people who have a very good knowledge of the platform and a very
> good knowledge of the code to be ported.
> In our case we won't easily know if an error of 1E-9% is due to a
> miscompilation or to a numerical issue associated with libraries,
> different compilers, etc..
>
> I still believe that you can use the regtests to check whether you get
> larger errors ~ 1E-3 - 1E-8 (and then be sure of a compilation/library
> problem) or even
> segmentation faults.
> If not, there's a good chance that cp2k has been properly built,
> but only real applications and analyses of the results can tell if you
> have a good or a bad executable. The rule is always the same and
> should be valid for any code: Keep your eyes open! (and I know
> you do! ;-). )
>
> teo
>
> On 11 Sep 2007, at 18:29, Axel wrote:
>
> >
> > hi teo,
> >
> > thanks for the detailed answer.
> >
> > in this case i would suggest changing the text
> > on the berlios page accordingly. right now it
> > gives the impression that i need to reproduce
> > the values from the downloadable regression test
> > library to ensure my compilation is correct.
> >
> > cheers,
> >    axel.
> >
> > On Sep 11, 1:30 am, Teodoro Laino <teodor... at gmail.com> wrote:
> >> Hi Axel,
> >>
> >> the purpose of regtests is not to validate code across different
> >> kinds of platforms.
> >> They are used by people developing the code and are always run on the
> >> same machine,
> >> to check that nothing has been broken by new changes or
> >> by introducing new pieces of code.
> >> Used in this way, even an error of 1.0E-10 is meaningful (and I've
> >> seen them many times), since
> >> any such modification should leave the numerics exactly the same.
> >>
> >> At the moment there is no real validation test. You may make a sort
> >> of "leap of faith" and assume
> >> that cp2k has been compiled properly if regtests are OK or fail with
> >> a relative error less than 1.E-10..
> >> Again this guarantees nothing, and for sure the best thing would be to
> >> go for a small number of tests,
> >> numerically stable (i.e. converged SCF, etc..), aimed at computing
> >> some properties, but even in that
> >> case I'm not sure that numerical problems would not be an issue.
> >>
> >> As far as I know, validating code across platforms is a tricky issue.
> >> A similar problem also exists on the GRID platform.. I may try to
> >> ask these guys
> >> how they validate codes when new machines are added to the grid
> >> infrastructure. I'm sure there's some
> >> level of "human intervention" to decide if a code has been compiled
> >> properly on a new architecture.
> >>
> >> Teo
> >>
> >> On 11 Sep 2007, at 04:12, Axel wrote:
> >>
> >>
> >>
> >>> hi everybody!
> >>
> >>> i'm currently in the process of trying to automate
> >>> building cp2k on a number of platforms so that i
> >>> have a way to supply people with up-to-date executables
> >>> and a list of tested features.
> >>
> >>> however it seems that quite a number of the test results
> >>> can only be reproduced on the exact same machine
> >>> with the exact same compiler/library/parallel etc. settings.
> >>
> >>> or more precisely asked: why do i get a 'WRONG' result with
> >>> a relative error of less than 1.e-10 when you have only 14 digits
> >>> absolute precision in a real*8 floating point number to begin with?
> >>> especially, when the SCF convergence of an input is not set very
> >>> tightly. just moving to a different platform, using a different
> >>> compiler,
> >>> a different optimization level, a different BLAS/LAPACK or running
> >>> a serial instead of a parallel executable can induce changes of that
> >>> magnitude while still being accurate within the boundaries of
> >>> floating
> >>> point arithmetic, considering how many FLOPS are involved in
> >>> computing the properties the regression tester is comparing.
> >>
> >>> what about tests of properties, that simply cannot be computed
> >>> to that high accuracy at all?
> >>
> >>> do i have to make a 'leap of faith' and say that machine X is
> >>> executing
> >>> cp2k correctly when all regtests are flagged ok on berlios and then
> >>> use that output as my internal standard for that machine?
> >>
> >>> cheers,
> >>>    axel.
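
The point raised above, that a different compiler, optimization level, BLAS/LAPACK or a parallel run can shift results in the last digits without anything being wrong, comes down to floating-point addition not being associative. The following is a minimal sketch in Python, not cp2k code; the helper names `naive_sum` and `relative_difference` are made up for this illustration. It adds the same three double-precision numbers in two different orders, much as a reordered loop or a parallel reduction might.

```python
def naive_sum(values):
    """Plain left-to-right accumulation in double precision."""
    total = 0.0
    for v in values:
        total += v
    return total


def relative_difference(a: float, b: float) -> float:
    """Relative difference of two numbers, scaled by the larger magnitude."""
    scale = max(abs(a), abs(b))
    return 0.0 if scale == 0.0 else abs(a - b) / scale


values = [0.1, 0.2, 0.3]

forward = naive_sum(values)                    # (0.1 + 0.2) + 0.3
backward = naive_sum(list(reversed(values)))   # (0.3 + 0.2) + 0.1

print(repr(forward), repr(backward))
print("bitwise identical:", forward == backward)
print("relative difference:", relative_difference(forward, backward))
```

The two sums are not bitwise identical, yet they differ only at the level of rounding error. Over the very large number of operations in a real calculation such differences can accumulate, so a comparison that demands exact agreement across platforms will flag results that are numerically fine.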