URL: [http://www.nwchem-sw.org/](http://www.nwchem-sw.org/)

<!-- The following line controls how the package lists are generated -->
Categories: [application](/categories/application), [open-source](/categories/open-source)

NWChem: Open Source High-Performance Computational Chemistry. NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters.

Chemistry; NWChem is an ab initio computational chemistry software package which also includes quantum chemical and molecular dynamics functionality.
<!-- Label: CompilesARMCompiler=Yes -->
<!-- Label: CompilesGCC=Yes -->
<!-- Label: BuildMaturity=Upstream -->

# Versions

[NWChem 6.8 ARM](#build-details-for-version-68-arm) -
[NWChem 6.8 GCC](#build-details-for-version-68-gcc)

# Build Details For Version 6.8 Arm

From a talk in the KNL workshop at ISC2017, the following characteristics of NWChem were highlighted:

* Heavy on FFT and DGEMM (tall and skinny matrices)
* Strong scaling is key
* Difficult 3D FFT that scales to 2K nodes
* 3D FFT: 1D FFT + rotate cube + 1D FFT + rotate cube + 1D FFT
* Asynchronous MPI (overlapping computation and communication) plus threading for the 1D FFTs
* MKL DGEMM performance on tall and skinny matrices is poor, so the developers do the threading themselves and call single-threaded MKL dgemm
* One MPI rank per node, then OpenMP within the node

## Configuration

1. NWChem 6.8
2. Arm compiler version 18.0
3. Arm Performance Libraries 18.0
4. Open MPI version 2.1.2
5. Tested on TX2 running Ubuntu 16.04
6. Last updated 31/01/18

## Build instructions

### Download and unpack the packages
```
wget https://github.com/nwchemgit/nwchem/releases/download/v6.8-release/nwchem-6.8-release.revision-v6.8-47-gdf6c956-src.2017-12-14.tar.bz2

# Unpack tar file of src
tar jxf nwchem-6.8-release.revision-v6.8-47-gdf6c956-src.2017-12-14.tar.bz2

# set NWCHEM_TOP to <your path>/nwchem-6.8
export NWCHEM_TOP=`pwd`/nwchem-6.8
```
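
The archive name is long, so it is easy to mistype it between the `wget` and `tar` steps. The following is a minimal sketch, assuming a POSIX shell, that derives the file name from the URL so that both steps use the same string. The names `NWCHEM_URL`, `NWCHEM_TARBALL` and `fetch_and_unpack` are illustrative only and are not used by the NWChem build system.

```shell
# Release archive URL, as given in the download step.
NWCHEM_URL="https://github.com/nwchemgit/nwchem/releases/download/v6.8-release/nwchem-6.8-release.revision-v6.8-47-gdf6c956-src.2017-12-14.tar.bz2"

# Strip everything up to the last '/' to obtain the archive file name.
NWCHEM_TARBALL="${NWCHEM_URL##*/}"
echo "archive: ${NWCHEM_TARBALL}"

# Hypothetical helper: download only if the archive is not already
# present, then unpack it in the current directory.
fetch_and_unpack() {
    [ -f "${NWCHEM_TARBALL}" ] || wget "${NWCHEM_URL}" || return 1
    tar jxf "${NWCHEM_TARBALL}"
}
```

Call `fetch_and_unpack` from the directory that should contain the source tree, then set `NWCHEM_TOP` as shown in the download step.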

### Compiler configuration
```
export CC=armclang
export CXX=armclang++
export FC=armflang
```
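
Before continuing, it can help to confirm that the selected compilers are actually on `PATH`. The sketch below assumes a POSIX shell; `check_tools` is a hypothetical helper, not part of NWChem or the Arm toolchain.

```shell
# Report every named tool that cannot be found on PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found on PATH: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_tools "$CC" "$CXX" "$FC" || echo "fix PATH before building"` prints a warning if any of the three compilers cannot be resolved.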

### Modify src
Replace the makefile.h in $NWCHEM_TOP/src/config with [this] or apply the following as a patch (from the $NWCHEM_TOP directory).
```
cd $NWCHEM_TOP
patch -p0 <<EOF
--- src/config/makefile.h 2018-02-02 10:30:28.399555033 +0000
+++ src/config/makefile.h.arm 2018-02-02 10:29:47.485239331 +0000
@@ -1592,6 +1592,9 @@
ifeq (\$(CC),icc)
_CC=icc
endif
+ ifeq (\$(CC),armclang)
+ _CC=armclang
+ endif
ifeq (\$(FC),pgf90)
_FC=pgf90
endif
@@ -1613,6 +1616,9 @@
ifeq (\$(FC),xlf)
_FC=xlf
endif
+ ifeq (\$(FC),armflang)
+ _FC=armflang
+ endif
ifndef _FC
FC=gfortran
_FC=gfortran
@@ -2141,6 +2147,92 @@
endif
endif

+ ifeq (\$(_CPU),\$(findstring \$(_CPU),aarch64))
+
+ ifdef USE_DEBUG
+ FOPTIONS += -g
+ COPTIONS += -g
+ LDOPTIONS += -g
+ LDFLAGS += -g
+ endif
+
+ ifeq (\$(_CC),gcc)
+ COPTIONS += -O3 -funroll-loops -ffast-math
+ ifdef USE_OPENMP
+ COPTIONS += -fopenmp
+ endif
+ endif
+
+ ifeq (\$(_FC),gfortran)
+ #gcc version 4.1.0 20050525 (experimental)
+ ifdef USE_GPROF
+ FOPTIONS += -pg
+ COPTIONS += -pg
+ LDOPTIONS += -pg
+ LDFLAGS += -pg
+ endif
+ LINK.f = \$(FC) \$(LDFLAGS)
+ FOPTIMIZE += -O3
+ ifeq (\$(GNU_GE_6),true)
+ FOPTIMIZE += -fno-tree-dominator-opts # solvation/hnd_cosmo_lib breaks
+ FOPTIONS += -fno-tree-dominator-opts # solvation/hnd_cosmo_lib breaks
+ FDEBUG += -fno-tree-dominator-opts # solvation/hnd_cosmo_lib breaks
+ endif
+
+ FOPTIMIZE += -fprefetch-loop-arrays #-ftree-loop-linear
+ ifeq (\$(GNU_GE_4_8),true)
+ FOPTIMIZE += -ftree-vectorize -fopt-info-vec
+ endif
+
+ FDEBUG += -g -O
+ ifdef USE_F2C
+ #possible segv with use of zdotc (e.g. with GOTO BLAS)
+ #http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20178
+ FOPTIONS += -ff2c -fno-second-underscore
+ endif
+ ifeq (\$(GNU_GE_4_6),true)
+ FOPTIMIZE += -mtune=native
+ FOPTIONS += -finline-functions
+ endif
+ ifndef USE_FPE
+ FOPTIMIZE += -ffast-math #2nd time
+ endif
+ ifdef USE_FPE
+ FOPTIONS += -ffpe-trap=invalid,zero,overflow -fbacktrace
+ endif
+ endif # end of gfortran
+
+ ifeq (\$(_FC),armflang)
+
+ ifdef USE_SHARED
+ FOPTIONS+= -fPIC
+ endif
+
+ DEFINES += -DARMFLANG
+ LINK.f = \$(FC) \$(LDFLAGS)
+ #FOPTIMIZE += -O3
+ FOPTIMIZE = -O3 -Mfma -ffp-contract=fast
+ # not in armflang
+ # FOPTIMIZE += -fprefetch-loop-arrays #-ftree-loop-linear
+ ifeq (\$(GNU_GE_4_8),true)
+ # not in armflang
+ FOPTIMIZE += -ftree-vectorize -fopt-info-vec
+ endif
+
+ FDEBUG += -g -O
+ FOPTIMIZE += -mtune=native
+ # not in armflang
+ # FOPTIONS += -finline-functions
+
+ ifndef USE_FPE
+ FOPTIMIZE += -ffast-math #2nd time
+ endif
+ ifdef USE_FPE
+ FOPTIONS += -ffpe-trap=invalid,zero,overflow -fbacktrace
+ endif
+ endif
+ endif # end of aarch64
+
ifeq (\$(_CPU),\$(findstring \$(_CPU), ppc64 ppc64le))
# Tested on Red Hat Enterprise Linux AS release 3 (Taroon Update 3)
# Tested on SLES 9
EOF
```

Modify src/inp/inp.F
```
sed -i 's/#if defined(DECOSF)/#if defined(ARMFLANG) || defined(DECOSF)/' $NWCHEM_TOP/src/inp/inp.F
```
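
The `sed` command widens the preprocessor guard so that the code path previously selected only by `DECOSF` is also taken when `ARMFLANG` is defined (the `makefile.h` patch adds `-DARMFLANG` for `armflang`). The sketch below applies the same substitution to a sample line on standard input, so the effect can be inspected without touching the source tree; `rewrite_guard` is a hypothetical helper name.

```shell
# Apply the inp.F substitution to text read from stdin.
rewrite_guard() {
    sed 's/#if defined(DECOSF)/#if defined(ARMFLANG) || defined(DECOSF)/'
}

# Prints: #if defined(ARMFLANG) || defined(DECOSF)
printf '%s\n' '#if defined(DECOSF)' | rewrite_guard
```

After running the real command, `grep -n 'defined(ARMFLANG)' $NWCHEM_TOP/src/inp/inp.F` lists the lines that were changed.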

### Build configuration
```
cd $NWCHEM_TOP

# Set up NWChem env vars
# $NWCHEM_TARGET defines your target platform, e.g.
export NWCHEM_TARGET=LINUX64

# ARMCI_NETWORK must be defined in order to achieve best performance on high performance networks, e.g.
export ARMCI_NETWORK=MPI-PR

# Set up MPI paths
# Set to "y" to indicate that NWChem should be compiled with MPI
export USE_MPI=y
# Set to "y" for the NWPW module to use Fortran bindings of MPI (generally set when USE_MPI is set)
export USE_MPIF=y
# Set to "y" for the NWPW module to use Integer*4 Fortran bindings of MPI (generally set when USE_MPI is set on most platforms)
export USE_MPIF4=y

# You can try to run ${NWCHEM_TOP}/src/tools/guess-mpidefs to guess the values for these MPI defs
export MPI_INCLUDE="</path/to/mpi/includes>"
export MPI_LIB="</path/to/mpi/libs/dir>"
export LIBMPI="<list of mpi libraries>"

# NWCHEM_MODULES defines the modules to be compiled
export NWCHEM_MODULES="all"

# Optimized armpl math libraries - use ilp64 since NWChem uses 64-bit integers on 64-bit systems
# export BLAS_LIBS="-L</path/to/arm/performance/libraries/dir> -larmpl_ilp64"
# If you have used the provided module then you can just use
export BLAS_LIBS="-L$ARMPL_DIR/lib -larmpl_ilp64"
export BLASOPT="${BLAS_LIBS}"
export BLAS_SIZE=8

# Same for LAPACK
# export LAPACK="-L</path/to/arm/performance/libraries/dir> -larmpl_ilp64"
# If you have used the provided module then you can just use
export LAPACK="-L$ARMPL_DIR/lib -larmpl_ilp64"
export LAPACK_LIBS="$LAPACK"
export LAPACK_LIB="$LAPACK"
export LAPACK_SIZE=8

# possibly need this for the configure step for GA ?
FFLAG_INT=-i8
```
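
The build reads many environment variables, and a missing one can be hard to spot once `make` is running. The following sketch, assuming a POSIX shell, reports any listed variable that is unset or empty. `require_vars` is a hypothetical helper and not part of NWChem.

```shell
# Print the name of every listed variable that is unset or empty.
# Returns non-zero if any variable is missing.
require_vars() {
    rc=0
    for name in "$@"; do
        # Indirect lookup of the variable called $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "unset or empty: $name" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

A possible call before building is `require_vars NWCHEM_TOP NWCHEM_TARGET ARMCI_NETWORK USE_MPI MPI_INCLUDE MPI_LIB LIBMPI NWCHEM_MODULES BLASOPT LAPACK_LIB && echo "environment looks complete"`. The check only confirms that values are present, not that the paths are correct.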

### Build and install
```
cd $NWCHEM_TOP/src
make nwchem_config
make -j 28 | tee make.log
```
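
In `make -j 28 | tee make.log`, the exit status of the pipeline is that of `tee` unless the shell's `pipefail` option is enabled, so a failed build can look successful to a calling script. The sketch below is one possible alternative, assuming a POSIX shell: it writes the output to a log file, shows the last lines, and returns the command's own exit status. `run_logged` is a hypothetical helper name.

```shell
# Run a command with stdout and stderr captured in a log file,
# print the tail of the log, and return the command's exit status.
# Usage: run_logged <logfile> <command> [args...]
run_logged() {
    logfile=$1
    shift
    "$@" >"$logfile" 2>&1
    rc=$?
    tail -n 20 "$logfile"
    return "$rc"
}
```

For example, `run_logged make.log make -j 28 || echo "build failed, see make.log"` reports a failure that the `tee` pipeline would hide.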

### Testing
```
# Run some QA tests
#--------------------
cd $NWCHEM_TOP/QA
export NWCHEM_EXECUTABLE=$NWCHEM_TOP/bin/${NWCHEM_TARGET}/nwchem
./doqmtests.mpi 16 | tee doqmtests.mpi.log
```
There are a number of failed test cases; see the notes at the [end](#testing).

# Build Details For Version 6.8 GCC

## Configuration

1. NWChem 6.8
2. GCC compiler version 7.1
3. Arm Performance Libraries 18.0
4. Open MPI version 2.1.2
5. Tested on TX2 running Ubuntu 16.04
6. Last updated 31/01/18

## Build instructions

### Download and unpack the packages
```
wget https://github.com/nwchemgit/nwchem/releases/download/v6.8-release/nwchem-6.8-release.revision-v6.8-47-gdf6c956-src.2017-12-14.tar.bz2

# Unpack tar file of src
tar jxf nwchem-6.8-release.revision-v6.8-47-gdf6c956-src.2017-12-14.tar.bz2

# set NWCHEM_TOP to <your path>/nwchem-6.8
export NWCHEM_TOP=`pwd`/nwchem-6.8
```

### Compiler configuration
```
export CC=gcc
export CXX=g++
export FC=gfortran
```

### Modify src
Replace the makefile.h in $NWCHEM_TOP/src/config with [this] or apply the following as a patch (from the $NWCHEM_TOP directory).
```
cd $NWCHEM_TOP
patch -p0 <<EOF
--- src/config/makefile.h 2018-02-02 10:30:28.399555033 +0000
+++ src/config/makefile.h.arm 2018-02-02 10:29:47.485239331 +0000
@@ -1592,6 +1592,9 @@
ifeq (\$(CC),icc)
_CC=icc
endif
+ ifeq (\$(CC),armclang)
+ _CC=armclang
+ endif
ifeq (\$(FC),pgf90)
_FC=pgf90
endif
@@ -1613,6 +1616,9 @@
ifeq (\$(FC),xlf)
_FC=xlf
endif
+ ifeq (\$(FC),armflang)
+ _FC=armflang
+ endif
ifndef _FC
FC=gfortran
_FC=gfortran
@@ -2141,6 +2147,92 @@
endif
endif

+ ifeq (\$(_CPU),\$(findstring \$(_CPU),aarch64))
+
+ ifdef USE_DEBUG
+ FOPTIONS += -g
+ COPTIONS += -g
+ LDOPTIONS += -g
+ LDFLAGS += -g
+ endif
+
+ ifeq (\$(_CC),gcc)
+ COPTIONS += -O3 -funroll-loops -ffast-math
+ ifdef USE_OPENMP
+ COPTIONS += -fopenmp
+ endif
+ endif
+
+ ifeq (\$(_FC),gfortran)
+ #gcc version 4.1.0 20050525 (experimental)
+ ifdef USE_GPROF
+ FOPTIONS += -pg
+ COPTIONS += -pg
+ LDOPTIONS += -pg
+ LDFLAGS += -pg
+ endif
+ LINK.f = \$(FC) \$(LDFLAGS)
+ FOPTIMIZE += -O3
+ ifeq (\$(GNU_GE_6),true)
+ FOPTIMIZE += -fno-tree-dominator-opts # solvation/hnd_cosmo_lib breaks
+ FOPTIONS += -fno-tree-dominator-opts # solvation/hnd_cosmo_lib breaks
+ FDEBUG += -fno-tree-dominator-opts # solvation/hnd_cosmo_lib breaks
+ endif
+
+ FOPTIMIZE += -fprefetch-loop-arrays #-ftree-loop-linear
+ ifeq (\$(GNU_GE_4_8),true)
+ FOPTIMIZE += -ftree-vectorize -fopt-info-vec
+ endif
+
+ FDEBUG += -g -O
+ ifdef USE_F2C
+ #possible segv with use of zdotc (e.g. with GOTO BLAS)
+ #http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20178
+ FOPTIONS += -ff2c -fno-second-underscore
+ endif
+ ifeq (\$(GNU_GE_4_6),true)
+ FOPTIMIZE += -mtune=native
+ FOPTIONS += -finline-functions
+ endif
+ ifndef USE_FPE
+ FOPTIMIZE += -ffast-math #2nd time
+ endif
+ ifdef USE_FPE
+ FOPTIONS += -ffpe-trap=invalid,zero,overflow -fbacktrace
+ endif
+ endif # end of gfortran
+
+ ifeq (\$(_FC),armflang)
+
+ ifdef USE_SHARED
+ FOPTIONS+= -fPIC
+ endif
+
+ DEFINES += -DARMFLANG
+ LINK.f = \$(FC) \$(LDFLAGS)
+ #FOPTIMIZE += -O3
+ FOPTIMIZE = -O3 -Mfma -ffp-contract=fast
+ # not in armflang
+ # FOPTIMIZE += -fprefetch-loop-arrays #-ftree-loop-linear
+ ifeq (\$(GNU_GE_4_8),true)
+ # not in armflang
+ FOPTIMIZE += -ftree-vectorize -fopt-info-vec
+ endif
+
+ FDEBUG += -g -O
+ FOPTIMIZE += -mtune=native
+ # not in armflang
+ # FOPTIONS += -finline-functions
+
+ ifndef USE_FPE
+ FOPTIMIZE += -ffast-math #2nd time
+ endif
+ ifdef USE_FPE
+ FOPTIONS += -ffpe-trap=invalid,zero,overflow -fbacktrace
+ endif
+ endif
+ endif # end of aarch64
+
ifeq (\$(_CPU),\$(findstring \$(_CPU), ppc64 ppc64le))
# Tested on Red Hat Enterprise Linux AS release 3 (Taroon Update 3)
# Tested on SLES 9
EOF
```

### Build configuration
```
cd $NWCHEM_TOP

# Set up NWChem env vars
# $NWCHEM_TARGET defines your target platform, e.g.
export NWCHEM_TARGET=LINUX64

# ARMCI_NETWORK must be defined in order to achieve best performance on high performance networks, e.g.
export ARMCI_NETWORK=MPI-PR

# Set up MPI paths
# Set to "y" to indicate that NWChem should be compiled with MPI
export USE_MPI=y
# Set to "y" for the NWPW module to use Fortran bindings of MPI (generally set when USE_MPI is set)
export USE_MPIF=y
# Set to "y" for the NWPW module to use Integer*4 Fortran bindings of MPI (generally set when USE_MPI is set on most platforms)
export USE_MPIF4=y

# You can try to run ${NWCHEM_TOP}/src/tools/guess-mpidefs to guess the values for these MPI defs
export MPI_INCLUDE="</path/to/mpi/includes>"
export MPI_LIB="</path/to/mpi/libs/dir>"
export LIBMPI="<list of mpi libraries>"

# NWCHEM_MODULES defines the modules to be compiled
export NWCHEM_MODULES="all"

# Optimized armpl math libraries - use ilp64 since NWChem uses 64-bit integers on 64-bit systems
# export BLAS_LIBS="-L</path/to/arm/performance/libraries/dir> -larmpl_ilp64"
# If you have used the provided module to configure armpl then you can just use
export BLAS_LIBS="-L$ARMPL_DIR/lib -larmpl_ilp64"
export BLASOPT="${BLAS_LIBS}"
export BLAS_SIZE=8

# Same for LAPACK
# export LAPACK="-L</path/to/arm/performance/libraries/dir> -larmpl_ilp64"
# If you have used the provided module to configure armpl then you can just use
export LAPACK="-L$ARMPL_DIR/lib -larmpl_ilp64"
export LAPACK_LIBS="$LAPACK"
export LAPACK_LIB="$LAPACK"
export LAPACK_SIZE=8
```

### Build and install
```
cd $NWCHEM_TOP/src
make nwchem_config
make -j 28 | tee make.log
```

### Testing
```
# Run some QA tests
#--------------------
cd $NWCHEM_TOP/QA
export NWCHEM_EXECUTABLE=$NWCHEM_TOP/bin/${NWCHEM_TARGET}/nwchem
./doqmtests.mpi 16 | tee doqmtests.mpi.log
```
There are a number of failed test cases; see the notes at the [end](#testing).

# Testing

The test script doqmtests.mpi has been used to test these builds on a TX2 system. Currently 44 of the 240 tests fail, many of them because of minor numerical differences. Further details regarding this will be provided here shortly.