[CP2K:2353] Re: NaN after cvs update !!!!

Juerg Hutter hut... at pci.uzh.ch
Fri Oct 23 11:50:53 UTC 2009


Hi

there is a known bug in the new sparse matrix code that
is triggered by certain combinations of processor count and
system size. Your problem could be caused by this bug.
We are working on a fix.

regards

Juerg

----------------------------------------------------------
Juerg Hutter                   Phone : ++41 44 635 4491
Physical Chemistry Institute   FAX   : ++41 44 635 6838
University of Zurich           E-mail: hut... at pci.uzh.ch
Winterthurerstrasse 190
CH-8057 Zurich, Switzerland
----------------------------------------------------------


On Fri, 23 Oct 2009, salah wrote:

>
> Teo,
> thanks for the answer. I've seen Axel's comment on the problem but I
> was hoping that somebody worked something out during the last two
> weeks. The reason why I'm struggling with the Intel compiler on my quad
> is another, more complicated problem on Altix.
> The good news is that cp2k compiles and runs well on a PC farm of
> AMD Opterons with Intel compilers.
>
> Thanks again,
>
> Salah
>
> On 23 oct, 12:54, Teodoro Laino <teodor... at gmail.com> wrote:
>> Salah,
>>
>> there is a serious problem with the Intel compiler (as far as I know, no
>> version compiles cp2k cleanly since the introduction of the new sparse
>> matrix type). The problem occurs on several architectures (including
>> quad-core Opteron).
>> I'm aware of a couple of guys that are trying to see if it is possible
>> to find a workaround for this issue but no news at the moment.
>> The hope is that either Intel will come out with a bug fix or somebody
>> will patch the problem.
>>
>> In the meantime, I would recommend following what Axel suggested
>> (see his posts from a couple of weeks ago).
>> Teo
>>
>> salah wrote:
>>> Hi everybody,
>>> I keep a copy of cp2k on a local machine with a quad proc. for small
>>> jobs and testing.
>>> This week, I cvs-updated cp2k and recompiled it, and suddenly the
>>> code gives NaNs. The arch file and the environment are the same though.
>>> 1-2 months ago, both compilation and testing were fine.
>>> Updating the Intel compiler to the current version (11.1.056) is
>>> messier.
>>> Thanks in advance for any comments or help.
>>
>>> Salah
>>
>>> (Details: Intel quad Q9550 / Intel compiler 10.1.022 / acml 4.2.0 /
>>> blacs and scalapack are self-compiled)
>>
>>> here comes the arch file:
>>> #################
>>> INC      = /usr/local/include
>>> MyLIBS   = /home/salah/calc/libs/LIBS
>>> ACML     = /opt/acml4.2.0/ifort64/lib
>>> CC       = cc
>>> CPP      =
>>> FC       = mpif90
>>> LD       = mpif90
>>> AR       = ar -r
>>> DFLAGS   = -D__INTEL -D__FFTSG -D__parallel -D__BLACS -D__SCALAPACK -D__FFTW3
>>> CPPFLAGS =
>>> FCFLAGS  = $(DFLAGS) -I$(INC) -O3 -xT -heap-arrays 64 -funroll-loops -fpp -free
>>> FCFLAGS2 = $(DFLAGS) -I$(INC) -O1 -xT -heap-arrays 64 -fpp -free
>>> LDFLAGS  = $(FCFLAGS) -I$(INC)
>>> LIBS     = $(MyLIBS)/libscalapack.a \
>>>            $(MyLIBS)/libblacs.a \
>>>            $(MyLIBS)/libblacsCinit.a \
>>>            $(MyLIBS)/libblacsF77init.a \
>>>            $(MyLIBS)/libblacs.a \
>>>            $(ACML)/libacml.a\
>>>            $(ACML)/libacml_mv.a \
>>>            -lfftw3
>>
>>> OBJECTS_ARCHITECTURE = machine_intel.o
>>
>>> graphcon.o: graphcon.F
>>>         $(FC) -c $(FCFLAGS2) $<
>>
>>> and the error (simple test with cp2k/tests/DFTB/regtest-nonscc/MoS.inp):
>>> ###################
>>> SCF WAVEFUNCTION OPTIMIZATION
>>
>>>   Step     Update method      Time    Convergence   Total energy    Change
>>
>>> ------------------------------------------------------------------------------
>>>      1 P_Mix/Diag. 0.10E+01    0.0            NaN   1.1387065762  1.14E+00
>>>      2 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>      3 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>      4 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>      5 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>      6 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>      7 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>      8 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>      9 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     10 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     11 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     12 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     13 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     14 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     15 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     16 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     17 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     18 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     19 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     20 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     21 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     22 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     23 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     24 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     25 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     26 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     27 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     28 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     29 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     30 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     31 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     32 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     33 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     34 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     35 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     36 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     37 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     38 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     39 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     40 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     41 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     42 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     43 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     44 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     45 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     46 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     47 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     48 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     49 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>>     50 P_Mix/Diag. 0.10E+01    0.0            NaN            NaN       NaN
>>
>>>   *** SCF run NOT converged ***
>>
>>>   Core Hamiltonian energy:                                  NaN
>>>   Repulsive potential energy:                  1.12349657382487
>>>   Electronic energy:                           0.00000000000000
>>>   Dispersion energy:                           0.01521000233886
>>
>>>   Total energy:                                             NaN
>>
>>>  ENERGY| Total FORCE_EVAL ( QS ) energy (a.u.):             NaN
> >
>
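
A run like the one quoted above can be checked for NaNs without reading the whole SCF table by eye. The following is only a minimal sketch, not part of cp2k: the function name first_nan_step and the row pattern are assumptions based on the output pasted in this thread, where each SCF row starts with a step number followed by a method such as "P_Mix/Diag.". It reports the first SCF step whose row contains a NaN.

```python
import re
from typing import Iterable, Optional

# Assumed row shape, taken from the output quoted in this thread:
#   "     1 P_Mix/Diag. 0.10E+01    0.0    NaN   1.1387065762  1.14E+00"
# i.e. an integer step number followed by a "method/solver" token.
SCF_ROW = re.compile(r"^\s*(\d+)\s+\S+/\S+")


def first_nan_step(lines: Iterable[str]) -> Optional[int]:
    """Return the first SCF step whose row has a NaN field, else None."""
    for line in lines:
        match = SCF_ROW.match(line)
        # Compare whole whitespace-separated fields so that only a
        # literal "NaN" column counts, not a substring of another token.
        if match and "NaN" in line.split():
            return int(match.group(1))
    return None
```

To use it on a saved output file (the name MoS.out is hypothetical), pass the open file object: `with open("MoS.out") as fh: print(first_nan_step(fh))`. For the run quoted above, the first row already has NaN in the Convergence column.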