FIST bug?
Matt W
mattwa... at gmail.com
Wed Apr 20 11:37:30 UTC 2011
OK, change lines 164 and 167 of pme.F to
IF ( rden % desc % parallel .AND. rden % desc % distributed ) THEN
and
IF ( PRESENT(shell_particle_set) .AND. rden % desc % parallel .AND. rden % desc % distributed ) THEN
respectively. The old code used an outdated and risky check to see whether
the RS grids were distributed.
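
If you want to confirm the change on your side, one option is to rerun the
same input on several processor counts and compare the total energies. Below
is a minimal Python sketch of that comparison; it is not part of CP2K. It
assumes each output contains an "ENERGY| Total FORCE_EVAL ( FIST ) energy
(a.u.):" line like the ones quoted in this thread. The helper names and the
1e-10 a.u. tolerance are placeholders of mine, not anything defined by CP2K.

```python
import re

# Matches the summary line quoted in this thread, e.g.
#   ENERGY| Total FORCE_EVAL ( FIST ) energy (a.u.):  -0.000148559882818
# \s* also spans a line break, in case the value was wrapped onto the next line.
ENERGY_RE = re.compile(
    r"ENERGY\|\s+Total\s+FORCE_EVAL\s*\(\s*\w+\s*\)\s*energy\s*\(a\.u\.\):"
    r"\s*(-?\d+\.\d+(?:[eE][-+]?\d+)?)"
)


def last_energy(text):
    """Return the last total energy (a.u.) found in the output text, or None."""
    matches = ENERGY_RE.findall(text)
    return float(matches[-1]) if matches else None


def energies_agree(energies, tol=1e-10):
    """Return True if all energies lie within tol (a.u.) of each other.

    tol is an assumed placeholder; pick a value suited to your own runs.
    """
    values = list(energies)
    if not values:
        raise ValueError("no energies to compare")
    return max(values) - min(values) <= tol
```

To use it, read each output file (for example out_pme_8.out and
out_pme_16.out, hypothetical names), pass the text to last_energy, and hand
the resulting values to energies_agree. A discrepancy of 8 mRy (0.004 a.u.),
as reported in this thread, would fail the check by a wide margin.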
Cheers,
Matt
On Apr 20, 12:01 pm, Matt W <mattwa... at gmail.com> wrote:
> Hi Guys,
>
> I've just had a very quick look and I think it is a PME bug. Not SPME,
> not QS. In particular, the QS RS grids are pretty well tested.
>
> If I change the EWALD section to
>
> &EWALD
> EWALD_TYPE spme
> ALPHA .44
> NS_MAX 25
> GMAX 64 64 64
> &END EWALD
>
> then there is no problem with 8, 16, or 32 procs. All give the same energy:
>
> out_spme_8.out:  ENERGY| Total FORCE_EVAL ( FIST ) energy (a.u.): -0.000148559882818
> out_spme_16.out: ENERGY| Total FORCE_EVAL ( FIST ) energy (a.u.): -0.000148559882818
> out_spme_32.out: ENERGY| Total FORCE_EVAL ( FIST ) energy (a.u.): -0.000148559882818
>
> I believe that the PME uses a second smaller grid for some purposes -
> I would guess that there is some inconsistency in the treatment of the
> larger and smaller grids. I'll try and have a proper look but people
> who know the PME code might spot it more quickly.
>
> In summary: there is a bug with PME and at least some distributed
> grids. But no evidence for problems with other methods.
>
> Cheers,
>
> Matt
>
> (previous mail to the group got bounced so reposting via google gui)
>
> On Apr 20, 6:00 am, Teodoro Laino <teodor... at gmail.com> wrote:
>
> > Hi Noam,
>
> > I can confirm your bug report. The issue is related to the relatively new RS_GRID distribution (a couple of years old).
> > This means that the same issue could also be present in QS jobs (they share the same infrastructure).
>
> > Matt, who worked on this, is in CC. The issue can be reproduced with this simple input file:
>
> > &FORCE_EVAL
> > METHOD FIST
> > &MM
> > &FORCEFIELD
> > parm_file_name water.pot ! (just use the one in tests/Fist/sample_pot/)
> > parmtype CHM
> > &CHARGE
> > ATOM OT
> > CHARGE -0.8476
> > &END CHARGE
> > &CHARGE
> > ATOM HT
> > CHARGE 0.4238
> > &END CHARGE
> > &END FORCEFIELD
> > &POISSON
> > &EWALD
> > EWALD_TYPE pme
> > ALPHA .44
> > NS_MAX 25
> > &END EWALD
> > &END POISSON
> > &END MM
> > &SUBSYS
> > &CELL
> > ABC 24.955 24.955 24.955
> > &END CELL
> > &COORD
> > OT -0.757 -5.616 -7.101 MOL1
> > HT -1.206 -5.714 -6.262 MOL1
> > HT 0.024 -5.102 -6.896 MOL1
> > OT -11.317 -2.629 -9.689 MOL2
> > HT -11.021 -3.080 -10.480 MOL2
> > HT -10.511 -2.355 -9.252 MOL2
> > &END
> > &END SUBSYS
> > &END FORCE_EVAL
> > &GLOBAL
> > PROJECT water_3_dist
> > RUN_TYPE ENERGY_FORCE
> > &END GLOBAL
>
> > If the POISSON section is replaced with this one:
>
> > &POISSON
> > &EWALD
> > EWALD_TYPE pme
> > ALPHA .44
> > NS_MAX 25
> > &RS_GRID
> > DISTRIBUTION_TYPE REPLICATED
> > &END
> > &END EWALD
> > &END POISSON
>
> > where DISTRIBUTION_TYPE is set to REPLICATED, the bug disappears. So it is triggered by a certain combination of grid points and by the distributed type (which is activated automatically for 16 procs; 2..8 procs use the replicated type instead).
> > All modules that use RS_GRID (FIST is just one) may be affected by this bug, including QS.
>
> > Regards,
> > Teo
>
> > On Apr 19, 2011, at 3:37 PM, Noam Bernstein wrote:
>
> > > We were playing around with the regtests, specifically
> > > cp2k/tests/QMMM/QS/regtest-3/water_3_dist.inp
> > > On the latest version (cvs update today 19 April), no patches, I get
> > > different results running on n_proc=2..8 and n_proc=16. The
> > > difference seems to happen when RS_GRID goes from fully replicated
> > > to distributed. I've attached the input files and two sample output
> > > files (I changed from QMMM to FIST). The first output quantity that
> > > differs is on the line
> > > ENERGY| Total FORCE_EVAL ( FIST ) energy (a.u.):
> > > where the energies are off by 8 mRy. Can someone replicate this issue (I suppose
> > > I could have compiler or MPI issues, although we think we see this error
> > > on several somewhat different Linux platforms)? If so, any ideas
> > > as to the source of the problem?
>
> > > thanks,
> > > Noam
>
>
> > > <out.2><out.16><water_3_dist.inp>