Commit bc4eaade authored by giannozz

Minor additions to the developers guide


git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/[email protected] c92efa57-630b-4861-b058-cf58834340f0
parent f0473bfe
......@@ -677,17 +677,53 @@ convenient in some cases that the two sets are not the same.
In particular, it is often convenient to have \texttt{nrx1}=\texttt{nr1}+1
to reduce memory conflicts.
\section{ Parallelization}
In parallel execution (MPI only), PW starts N independent processes (do not start more than one per processor!) that communicate via calls to MPI libraries. Each process has its own set of variables and knows nothing about other processes' variables. Variables that take little memory are replicated, those that take a lot of memory (wavefunctions, G-vectors, R-space grid) are distributed.
\section{Parallelization}
In parallel execution (MPI only), PW starts N independent processes
(do not start more than one per processor!) that communicate via calls
to MPI libraries. Each process has its own set of variables and knows
nothing about other processes' variables. Variables that take little memory
are replicated, those that take a lot of memory (wavefunctions, G-vectors,
R-space grid) are distributed.
Beware: replicated calculations may either be performed independently on each processor, or performed on one processor and broadcast to all
others. The first approach requires less programming, but it is unsafe: in principle all processors should yield exactly the same results, if they work on the same data, but sometimes they don't (depending on the machine, compiler, and libraries). Even a tiny difference in the last significant digit can eventually cause serious trouble if allowed to build up, especially when a replicated check is performed (in which
case the code may ''hang'' if the check yields different results on different processors). Never assume that the value of a variable produced by replicated calculations is exactly the same on all processors: when in doubt, broadcast the value calculated on a specific processor (the ''root'' processor) to all others.
\subsection{Tricks and pitfalls}
\begin{itemize}
\item
Replicated calculations may either be performed independently on
each processor, or performed on one processor and broadcast to all
others. The first approach requires less programming, but it is unsafe:
in principle all processors should yield exactly the same results, if
they work on the same data, but sometimes they don't (depending on the
machine, compiler, and libraries). Even a tiny difference in the last
significant digit can eventually cause serious trouble if allowed to
build up, especially when a replicated check is performed (in which
case the code may ''hang'' if the check yields different results on
different processors). Never assume that the value of a variable produced
by replicated calculations is exactly the same on all processors: when in
doubt, broadcast the value calculated on a specific processor (the ''root''
processor) to all others.
\item
Routine \texttt{errore} should be called in parallel by all processors,
or else it will hang.
\item
I/O operations: file opening, closing, and so on, are as a rule performed
only on processor \texttt{ionode}. The correct way to check for errors is
the following:
\begin{verbatim}
IF ( ionode ) THEN
OPEN ( ..., IOSTAT=ierr )
...
END IF
CALL mp_bcast( ierr, ... )
CALL errore( 'routine','error', ierr )
\end{verbatim}
The same applies to all operations performed on a single processor,
or a subgroup of processors: any error code must be broadcast before
the check.
\end{itemize}
\subsection{Paradigms}
\subsection{Implementation}
\subsubsection{ Data distribution}
\subsection{ Data distribution}
Quantum ESPRESSO employs arrays whose memory requirements fall
into three categories.
......@@ -1165,7 +1201,7 @@ cases: you may easily exceed the stack size if the arrays are large.
pointers may hinder optimization, allocatable arrays should be used instead.
\item If you use pointers, nullify them before performing tests on their
status.
\item Do not pass unallocated arrays ar arguments, even in those cases where
\item Do not pass unallocated arrays as arguments, even in those cases where
they are not actually used inside the subroutine.
\item Do not use any construct that is susceptible to be flagged as
out-of-bounds error, even if no actual out-of-bound error takes place.
......
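
The I/O error check added in the "Tricks and pitfalls" hunk above can be sketched as a standalone program. This is only an illustration under stated assumptions: it uses plain MPI calls (`MPI_Bcast`, `MPI_Comm_rank`) in place of the guide's `mp_bcast` and `errore` wrappers, whose argument lists are elided in the diff; rank 0 stands in for `ionode`; and the file name `input.dat` is hypothetical.

```fortran
program bcast_iostat
  ! Sketch only: plain MPI stands in for the mp_bcast/errore wrappers
  ! named in the guide (their full argument lists are not shown there).
  use mpi
  implicit none
  integer, parameter :: root = 0   ! assumption: rank 0 plays the role of ionode
  integer :: rank, ierr, mpierr, iunit

  call MPI_Init(mpierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, mpierr)

  ! Only the root process touches the file; every other process
  ! keeps ierr = 0 until the broadcast overwrites it.
  ierr = 0
  if (rank == root) then
     ! 'input.dat' is a hypothetical file name used for illustration.
     open(newunit=iunit, file='input.dat', status='old', &
          action='read', iostat=ierr)
  end if

  ! Broadcast the error code BEFORE testing it, so that all processes
  ! see the same value and take the same branch.
  call MPI_Bcast(ierr, 1, MPI_INTEGER, root, MPI_COMM_WORLD, mpierr)

  if (ierr /= 0) then
     ! Every process reaches this point together, so none is left
     ! waiting in a later collective call.
     if (rank == root) write(*,'(a,i0)') 'open failed, iostat = ', ierr
     call MPI_Finalize(mpierr)
     stop 1
  end if

  if (rank == root) close(iunit)
  call MPI_Finalize(mpierr)
end program bcast_iostat
```

If the test on `ierr` were done inside the `if (rank == root)` block instead, only the root process would leave the normal control flow on failure, and the remaining processes would block at their next collective call. That is the hang the diff warns about.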