<br><font size=2 face="sans-serif">As already stated in previous e-mails
on the list, the mixed MPI/OpenMP scheme is useful if: </font>
<br><font size=2 face="sans-serif"> a) you want to run on many processors
(say, more processors than the number of real-space planes along
x); </font>
<br><font size=2 face="sans-serif"> b) you want to run on clustered
SMP servers, going beyond a single SMP server, where the interconnect latency
and bandwidth are much worse than those of the memory; </font>
<br><font size=2 face="sans-serif">c) you have a system with more than
2000 states.</font>
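<br>
<br><font size=2 face="sans-serif">As a rough illustration only, the three criteria above can be written as a small shell check. The function name, its arguments, and the "more than one node" test used for criterion b) are assumptions made for this sketch and are not part of the code being discussed:</font>

```shell
# Hypothetical helper: suggest a parallelization mode from the three criteria.
# All names here are illustrative assumptions, not part of any real tool.
suggest_mode() {
    nprocs=$1      # number of processors requested
    nplanes_x=$2   # number of real-space planes along x
    nnodes=$3      # number of SMP servers the job spans (simplified criterion b)
    nstates=$4     # number of electronic states

    # Any one of the three conditions is enough to make the mixed scheme worthwhile.
    if [ "$nprocs" -gt "$nplanes_x" ] || [ "$nnodes" -gt 1 ] || [ "$nstates" -gt 2000 ]; then
        echo "mixed"
    else
        echo "mpi-only"
    fi
}

# Example call with made-up numbers: 32 processors, 128 planes, 1 node, 400 states.
suggest_mode 32 128 1 400
```

<br><font size=2 face="sans-serif">With these made-up numbers none of the conditions holds, so the sketch suggests MPI-only mode.</font>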
<br>
<br><font size=2 face="sans-serif">Your case does not fall into any of these categories,
so, as Axel explained, it is much better to run in
MPI-only mode.</font>
<br>
<br><font size=2 face="sans-serif">If you have compiled the code for an IBM
POWER machine in the mixed version, you should specify:</font>
<br>
<br><font size=2 face="sans-serif">export XLSMPOPTS=parthds=1
---> to specify the number of SMP threads per MPI task</font>
<br><font size=2 face="sans-serif">export MP_PROCS=32 --->
to specify the number of MPI tasks</font>
<br><font size=2 face="sans-serif">export MP_SHARED_MEMORY=YES</font>
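<br>
<br><font size=2 face="sans-serif">With parthds=1 the mixed binary runs one thread per task, i.e. effectively in MPI-only mode. For comparison, a genuinely mixed run on the same 32 processors might be set up along the following lines. The split into 4 threads per task is an assumed value chosen only for illustration, and the helper variable names are hypothetical:</font>

```shell
# Sketch: split 32 processors into MPI tasks x OpenMP threads.
TOTAL_PROCS=32        # total processors, as in the settings above
THREADS_PER_TASK=4    # assumed value, for illustration only

# Number of MPI tasks = total processors / threads per task
MPI_TASKS=$((TOTAL_PROCS / THREADS_PER_TASK))

export XLSMPOPTS="parthds=${THREADS_PER_TASK}"   # SMP threads per MPI task
export MP_PROCS="${MPI_TASKS}"                   # number of MPI tasks
export MP_SHARED_MEMORY=YES                      # shared-memory MPI within a node

echo "MPI tasks: ${MP_PROCS}, threads per task: ${THREADS_PER_TASK}"
```

<br><font size=2 face="sans-serif">The product of MP_PROCS and parthds should match the number of processors you actually request.</font>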
<br>
<br>
<br><font size=2 face="sans-serif">In any case, I still suggest
using norm-conserving pseudopotentials for your calculation and testing
k-point convergence.</font>
<br>
<br>
<br><font size=2 face="sans-serif">Best Regards,</font>
<br><font size=2 face="sans-serif">Alessandro</font>
<br>
<br>
<br>
<br><font size=2 face="sans-serif">Alessandro CURIONI, PhD<br>
Research Staff Member<br>
Computational Biochemistry and Material Science group<br>
IBM Research Division - Zurich Research Laboratory<br>
Saumerstrasse 4<br>
8003 Rueschlikon - Switzerland<br>
e-mail: cur@zurich.ibm.com<br>
www: www.zurich.ibm.com<br>
Tel: +41-1-7248633<br>
Fax: +41-1-7248958<br>
</font>