FIST bug?

Matt W mattwa... at gmail.com
Wed Apr 20 11:37:30 UTC 2011


OK, change lines 164 and 167 of pme.F to

    IF ( rden % desc % parallel .AND. rden % desc % distributed ) THEN

and

    IF ( PRESENT(shell_particle_set) .AND. rden % desc % parallel .AND. rden % desc % distributed ) THEN

respectively. It was using an outdated and risky check to see if the
RS grids were distributed.
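
One way to check the change is to compare the total FIST energy across runs with different processor counts, as in the quoted message below. The following is only a minimal sketch, not part of the CP2K tree: the helper names (last_energy, energies_agree, compare_outputs) and the 1e-10 a.u. tolerance are assumptions made for illustration, and the pattern is based solely on the ENERGY| line format shown in this thread.

```python
import re

# Matches the total-energy line printed for a FIST run, e.g.
#   ENERGY| Total FORCE_EVAL ( FIST ) energy (a.u.):   -0.000148559882818
# (format taken from the output lines quoted in this thread).
ENERGY_RE = re.compile(
    r"ENERGY\|\s+Total FORCE_EVAL \( FIST \) energy\s+\(a\.u\.\):\s+"
    r"(-?\d+\.\d+(?:[eE][-+]?\d+)?)"
)


def last_energy(text):
    """Return the last total FIST energy (a.u.) found in text, or None."""
    matches = ENERGY_RE.findall(text)
    return float(matches[-1]) if matches else None


def energies_agree(energies, tol=1e-10):
    """True if all energies lie within tol (a.u.) of each other.

    The default tolerance is an arbitrary illustrative choice.
    """
    values = list(energies)
    if len(values) < 2:
        return True
    return max(values) - min(values) <= tol


def compare_outputs(paths, tol=1e-10):
    """Read each output file and report whether their energies agree."""
    energies = {}
    for path in paths:
        with open(path) as handle:
            energy = last_energy(handle.read())
        if energy is None:
            raise ValueError("no FIST energy line found in " + path)
        energies[path] = energy
    return energies, energies_agree(energies.values(), tol)
```

For example, compare_outputs(["out.2", "out.16"]) would return the parsed energies together with a flag saying whether they match. The discrepancy originally reported was 8 mRy (0.004 a.u.), which any reasonable tolerance would catch.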

Cheers,

Matt

On Apr 20, 12:01 pm, Matt W <mattwa... at gmail.com> wrote:
> Hi Guys,
>
> I've just had a very quick look and I think it is a PME bug, not an
> SPME or QS one. In particular, the QS RS grids are pretty well tested.
>
> If I change the EWALD section to
>
>       &EWALD
>         EWALD_TYPE spme
>         ALPHA .44
>         NS_MAX 25
>         GMAX 64 64 64
>       &END EWALD
>
> then there is no problem with 8, 16 or 32 procs. All give the same energy:
>
> out_spme_8.out:  ENERGY| Total FORCE_EVAL ( FIST ) energy (a.u.):       -0.000148559882818
> out_spme_16.out: ENERGY| Total FORCE_EVAL ( FIST ) energy (a.u.):       -0.000148559882818
> out_spme_32.out: ENERGY| Total FORCE_EVAL ( FIST ) energy (a.u.):       -0.000148559882818
>
> I believe that the PME uses a second smaller grid for some purposes -
> I would guess that there is some inconsistency in the treatment of the
> larger and smaller grids.  I'll try and have a proper look but people
> who know the PME code might spot it more quickly.
>
> In summary: there is a bug with PME and at least some distributed
> grids. But no evidence for problems with other methods.
>
> Cheers,
>
> Matt
>
> (previous mail to the group got bounced so reposting via google gui)
>
> On Apr 20, 6:00 am, Teodoro Laino <teodor... at gmail.com> wrote:
>
> > Hi Noam,
>
> > I can confirm your bug report. The issue is related to the relatively new RS_GRID distribution (a couple of years old).
> > This means that the same issue could be present also in QS jobs (they share the same infrastructure).
>
> > Matt who worked on this stuff is in CC. The issue can be reproduced with this simple input file:
>
> > &FORCE_EVAL
> >   METHOD FIST
> >   &MM
> >     &FORCEFIELD
> >       parm_file_name water.pot  ! (just use the one in tests/Fist/sample_pot/)
> >       parmtype CHM
> >       &CHARGE
> >         ATOM OT
> >         CHARGE -0.8476
> >       &END CHARGE
> >       &CHARGE
> >         ATOM HT
> >         CHARGE 0.4238
> >       &END CHARGE
> >     &END FORCEFIELD
> >     &POISSON
> >       &EWALD
> >         EWALD_TYPE pme
> >         ALPHA .44
> >         NS_MAX 25
> >       &END EWALD
> >     &END POISSON
> >   &END MM
> >   &SUBSYS
> >     &CELL
> >       ABC 24.955 24.955 24.955
> >     &END CELL
> >     &COORD
> > OT   -0.757  -5.616  -7.101    MOL1
> > HT   -1.206  -5.714  -6.262    MOL1
> > HT    0.024  -5.102  -6.896    MOL1
> > OT  -11.317  -2.629  -9.689    MOL2
> > HT  -11.021  -3.080 -10.480    MOL2
> > HT  -10.511  -2.355  -9.252    MOL2
> >     &END
> >   &END SUBSYS
> > &END FORCE_EVAL
> > &GLOBAL
> >   PROJECT water_3_dist
> >   RUN_TYPE ENERGY_FORCE
> > &END GLOBAL
>
> > If the poisson section is substituted with this one:
>
> >     &POISSON
> >       &EWALD
> >         EWALD_TYPE pme
> >         ALPHA .44
> >         NS_MAX 25
> >         &RS_GRID
> >          DISTRIBUTION_TYPE REPLICATED
> >         &END
> >       &END EWALD
> >     &END POISSON
>
> > where the distribution_type is set to replicated, the bug disappears. So it is triggered by a certain combination of grid points and by the distributed type (which is activated automatically for 16 procs; 2..8 procs use the replicated type instead).
> > All modules that use RS_GRID (FIST is just one) may be affected by this bug, including QS.
>
> > Regards,
> > Teo
>
> > On Apr 19, 2011, at 3:37 PM, Noam Bernstein wrote:
>
> > > We were playing around with the regtests, specifically
> > >   cp2k/tests/QMMM/QS/regtest-3/water_3_dist.inp
> > > On the latest version (cvs update today 19 April), no patches, I get
> > > different results running on n_proc=2..8 and n_proc=16.  The
> > > difference seems to happen when RS_GRID goes from fully replicated
> > > to distributed.  I've attached the input files and two sample output
> > > files (I changed from QMMM to FIST).  The first output quantity that
> > > differs is on the line
> > >    ENERGY| Total FORCE_EVAL ( FIST ) energy (a.u.):
> > > where the energies are off by 8 mRy.  Can someone replicate this issue (I suppose
> > > I could have compiler or MPI issues, although we think we see this error
> > > on several somewhat different Linux platforms)?  If so, any ideas
> > > as to the source of the problem?
>
> > >     thanks,
> > >     Noam
>
> > > <out.2><out.16><water_3_dist.inp>

