Commit 3027b5fb authored by Jason J. Gullickson's avatar Jason J. Gullickson

Initial commit.

*.swp
*.gcode
# RAIN
*Redundant Array of Inexpensive Nodes*
- or -
*Redundant Array of Independent Nodes*
- or -
*Reliable Accessible Integrated Network*
- or - ...
RAIN is an architecture for open, efficient supercomputers designed to make high-performance computing more accessible to programmers, operators and users. RAIN's mission is to make supercomputing accessible to a wider and more diverse audience and to encourage the development of new, innovative and compelling high-performance applications.
RAIN systems are broken out into three categories:
## Mark I
RAIN Mark I (*originally Raiden Mark I*) is a traditional Intel-based, Linux cluster supercomputer. Consisting of 8 ProLiant DL380 servers and a Gigabit Ethernet interconnect, Mark I serves to establish baseline performance, resource consumption and administrative load for typical distributed-memory supercomputers (albeit at a small scale).
## Mark II
Mark II re-creates Mark I using specialized (but still off-the-shelf) parts. Mark II computers are ARM-based 8-node Linux clusters designed to replicate the environment and performance characteristics of Mark I while improving on it by reducing physical size, power consumption and cost. Mark II machines are also a test-bed and development platform for system software, developer tools, utilities and operating system facilities designed to make creating high-performance computing applications accessible to a wider range of programmers, developers, designers and users.
## Mark III
Mark III machines extend the performance, features and scale of Mark II machines through custom hardware components (compute modules, application-specific logic, interconnects, etc.). Mark III will provide a platform on which RAIN's transition from ARM to [RISC-V](https://en.wikipedia.org/wiki/RISC-V) will take place.
# RAIN Mark I
## Summary
The goal of Mark I was to create a traditional Linux cluster supercomputer to serve as a baseline for comparison at a scale similar to the Mark II machines. In addition to these metrics, building Mark I helped identify the challenges of assembling, testing and operating a high-performance computing cluster.
The contents of this directory are somewhat disorganized and reflect the learning process of assembling a system such as this. [Build Log.md](Build Log.md) contains a stream-of-consciousness record of the work that went into building Mark I as well as thoughts and ideas for the overall project which came to mind during the build.
## Status
Complete. Even though Mark I has served its purpose, I'd like to run some additional experiments and do more accurate power consumption measurements as well. As such, I plan to keep it assembled until I can get around to that (or until someone is willing to take the hardware off my hands).
==============================================================
List of the known problems with the HPL software
Current as of release HPL - 2.2 - February 24, 2016
==============================================================
======================================================================
-- High Performance Computing Linpack Benchmark (HPL)
HPL - 2.2 - February 24, 2016
Antoine P. Petitet
University of Tennessee, Knoxville
Innovative Computing Laboratory
(C) Copyright 2000-2008 All Rights Reserved
-- Copyright notice and Licensing terms:
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions, and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. All advertising materials mentioning features or use of this
software must display the following acknowledgement:
This product includes software developed at the University of
Tennessee, Knoxville, Innovative Computing Laboratory.
4. The name of the University, the name of the Laboratory, or the
names of its contributors may not be used to endorse or promote
products derived from this software without specific written
permission.
-- Disclaimer:
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE UNIVERSITY
OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
======================================================================
==============================================================
High Performance Computing Linpack Benchmark (HPL)
HPL - 2.2 - February 24, 2016
==============================================================
History
- 09/09/00 Public release of Version 1.0
- 09/27/00 A couple of mistakes in the VSIPL port have been
corrected. The tar file as well as the web site were updated
on September 27th, 2000. Note that these problems did not
affect the BLAS version of the software in any way.
- 01/01/04 Version 1.0a
The MPI process grid numbering scheme is now a run-time
option.
The inlined assembly timer routine that caused compilation
to fail when using gcc version 3.3 and above has been
removed from the package.
Various building problems on the T3E have been fixed; Thanks
to Edward Anderson.
- 15/12/04 Version 1.0b
A weakness of the pseudo-random matrix generator was found for
problem sizes that are powers of two and larger than 2^15; Thanks
to Gregory Bauer. This problem has not been fixed. It is thus
currently recommended that HPL users wanting to test matrices
of size larger than 2^15 avoid powers of two.
When the matrix size is such that one needs > 16 GB per MPI
rank, the intermediate calculation (mat.ld+1) * mat.nq in
HPL_pdtest.c ends up overflowing because it is done using
32-bit arithmetic. This issue has been fixed by typecasting
to size_t; Thanks to John Baron.
- 09/10/08 Version 2.0
Piotr Luszczek changed to 64-bit RNG, modified files:
-- [M] include/hpl_matgen.h
-- [M] testing/matgen/HPL_ladd.c
-- [M] testing/matgen/HPL_lmul.c
-- [M] testing/matgen/HPL_rand.c
-- [M] testing/ptest/HPL_pdinfo.c
For a motivation for the change, see:
Dongarra and Langou, ``The Problem with the Linpack
Benchmark Matrix Generator'', LAWN 206, June 2008.
-- [M] testing/ptest/HPL_pdtest.c --
Julien Langou changed the test for correctness from
||Ax-b||_oo / ( eps * ||A||_1 * N )
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 )
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo * N )
to the normwise backward error
|| r ||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
See:
Nicholas J. Higham, ``Accuracy and Stability of Numerical Algorithms'',
Society for Industrial and Applied Mathematics, Philadelphia, PA, USA,
Second Edition, pages = xxx+680, ISBN = 0-89871-521-0, 2002.
Note that in our case || b ||_oo is almost surely
1/2; we compute it anyway.
- 10/26/2012 Version 2.1
Piotr Luszczek introduced exact time stamping for HPL_pdgesv():
-- [M] dist/include/hpl_misc.h
-- [M] dist/testing/ptest/HPL_pdtest.c
Piotr Luszczek fixed out-of-bounds access in data spreading
functions:
-- [M] dist/src/pgesv/HPL_spreadN.c
-- [M] dist/src/pgesv/HPL_spreadT.c
Thanks to Stephen Whalen from Cray.
- 02/24/2016 Version 2.2
Piotr Luszczek added continuous reporting of factorization progress
(submitted by Intel) and make scripts that use Intel software tools
and libraries and their Apple Mac OS X equivalents.
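Rendered in LaTeX, the version 2.0 change to the correctness test replaces the three old residual criteria with a single normwise backward error (a restatement of the plain-text formulas in that entry):

```latex
% Old criteria (any of the three):
\frac{\|Ax-b\|_\infty}{\varepsilon\,\|A\|_1\,N},\qquad
\frac{\|Ax-b\|_\infty}{\varepsilon\,\|A\|_1\,\|x\|_1},\qquad
\frac{\|Ax-b\|_\infty}{\varepsilon\,\|A\|_\infty\,\|x\|_\infty\,N}
% New normwise backward error, with residual r = b - Ax:
\frac{\|r\|_\infty}{\varepsilon\,\bigl(\|x\|_\infty\,\|A\|_\infty+\|b\|_\infty\bigr)\,N}
```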
==============================================================
High Performance Computing Linpack Benchmark (HPL)
HPL - 2.2 - February 24, 2016
==============================================================
1) Retrieve the tar file, then
gunzip hpl.tgz; tar -xvf hpl.tar
This will create an hpl directory, which we refer to below as
the top-level directory.
2) Create a file Make.<arch> in the top-level directory. For
this purpose, you may want to re-use one contained in the
setup directory. This file essentially specifies the compilers
and libraries to be used, with their paths.
3) Type "make arch=<arch>". This should create an executable
in the bin/<arch> directory called xhpl.
For example, on our Linux PII cluster, I create a file called
Make.Linux_PII in the top-level directory. Then, I type
"make arch=Linux_PII"
This creates the executable file bin/Linux_PII/xhpl.
4) Quick check: run a few tests:
cd bin/<arch>
mpirun -np 4 xhpl
5) Tuning: Most of the performance parameters can be tuned,
by modifying the input file bin/HPL.dat. See the file TUNING
in the top-level directory.
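Put together, steps 1 through 5 look roughly like the following command transcript. This is a sketch, not a definitive recipe: it assumes hpl.tgz has already been downloaded, an MPI implementation (mpicc/mpirun) is on the PATH, and that the Make.Linux_PII template from the setup directory fits your system (check setup/ for the template closest to yours and edit its compiler and library paths).

```shell
# Unpack; creates the top-level ./hpl directory.
gunzip hpl.tgz && tar -xvf hpl.tar
cd hpl

# Start from a setup template (edit paths inside it for your system).
cp setup/Make.Linux_PII .

# Build; produces bin/Linux_PII/xhpl.
make arch=Linux_PII

# Quick sanity check with 4 MPI processes.
cd bin/Linux_PII
mpirun -np 4 ./xhpl

# Tune by editing HPL.dat in this directory; see TUNING at the top level.
```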
==============================================================
Compile time options: At the end of the "model" Make.<arch>,
--------------------- the user is given the opportunity to
compile the software with some specific compile options. The
list of these options and their meanings is:
-DHPL_COPY_L
force the copy of the panel L before bcast;
-DHPL_CALL_CBLAS
call the cblas interface;
-DHPL_CALL_VSIPL
call the vsip library;
-DHPL_DETAILED_TIMING
enable detailed timers;
The user must choose between either the BLAS Fortran 77
interface, or the BLAS C interface, or the VSIPL library
depending on which computational kernels are available on the
system. Only one of these options should be selected. If you
choose the BLAS Fortran 77 interface, it is necessary to fill
out the machine-specific C to Fortran 77 interface section of
the Make.<arch> file. To do this, please refer to the
Make.<arch> examples contained in the setup directory.
By default HPL will:
*) not copy L before broadcast,
*) call the BLAS Fortran 77 interface,
*) not display detailed timing information.
As an example, suppose one wants HPL to copy the panel of
columns into a contiguous buffer before broadcasting. In
theory, it would be more efficient to let HPL create the
appropriate MPI user-defined data type since this may avoid
the data copy. Insisting on the copy is thus a strange idea,
but suppose one does. To achieve this one would add -DHPL_COPY_L
to the definition of HPL_OPTS at the end of the file Make.<arch>.
Then issue "make clean arch=<arch>; make build arch=<arch>" and
the xhpl executable will be re-built with that feature enabled.
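Concretely, the edit-and-rebuild cycle might look as follows (Make.Linux_PII is a hypothetical arch file name; the HPL_OPTS line is a fragment to place at the end of that file, shown here as a comment):

```shell
# Fragment for the end of Make.Linux_PII (hypothetical arch name):
#   HPL_OPTS = -DHPL_COPY_L -DHPL_DETAILED_TIMING

# Rebuild so the compile-time options take effect:
make clean arch=Linux_PII
make build arch=Linux_PII    # re-creates bin/Linux_PII/xhpl
```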
==============================================================
Check out the website www.netlib.org/benchmark/hpl for the
latest information.
==============================================================
#
# -- High Performance Computing Linpack Benchmark (HPL)
# HPL - 2.2 - February 24, 2016
# Antoine P. Petitet
# University of Tennessee, Knoxville
# Innovative Computing Laboratory
# (C) Copyright 2000-2008 All Rights Reserved
#
# -- Copyright notice and Licensing terms:
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
#
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions, and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# 3. All advertising materials mentioning features or use of this
# software must display the following acknowledgement:
# This product includes software developed at the University of
# Tennessee, Knoxville, Innovative Computing Laboratory.
#
# 4. The name of the University, the name of the Laboratory, or the
# names of its contributors may not be used to endorse or promote
# products derived from this software without specific written
# permission.
#
# -- Disclaimer:
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE UNIVERSITY
# OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# ######################################################################
#
arch = UNKNOWN
#
include Make.$(arch)
#
## build ###############################################################
#
build_src :
( $(CD) src/auxil/$(arch); $(MAKE) )
( $(CD) src/blas/$(arch); $(MAKE) )
( $(CD) src/comm/$(arch); $(MAKE) )
( $(CD) src/grid/$(arch); $(MAKE) )
( $(CD) src/panel/$(arch); $(MAKE) )
( $(CD) src/pauxil/$(arch); $(MAKE) )
( $(CD) src/pfact/$(arch); $(MAKE) )
( $(CD) src/pgesv/$(arch); $(MAKE) )
#
build_tst :
( $(CD) testing/matgen/$(arch); $(MAKE) )
( $(CD) testing/timer/$(arch); $(MAKE) )
( $(CD) testing/pmatgen/$(arch); $(MAKE) )
( $(CD) testing/ptimer/$(arch); $(MAKE) )
( $(CD) testing/ptest/$(arch); $(MAKE) )
#( $(CD) testing/test/$(arch); $(MAKE) )
#
## startup #############################################################
#
startup_dir :
- $(MKDIR) include/$(arch)
- $(MKDIR) lib
- $(MKDIR) lib/$(arch)
- $(MKDIR) bin
- $(MKDIR) bin/$(arch)
#
startup_src :
- $(MAKE) -f Make.top leaf le=src/auxil arch=$(arch)
- $(MAKE) -f Make.top leaf le=src/blas arch=$(arch)
- $(MAKE) -f Make.top leaf le=src/comm arch=$(arch)
- $(MAKE) -f Make.top leaf le=src/grid arch=$(arch)
- $(MAKE) -f Make.top leaf le=src/panel arch=$(arch)
- $(MAKE) -f Make.top leaf le=src/pauxil arch=$(arch)
- $(MAKE) -f Make.top leaf le=src/pfact arch=$(arch)
- $(MAKE) -f Make.top leaf le=src/pgesv arch=$(arch)
#
startup_tst :
- $(MAKE) -f Make.top leaf le=testing/matgen arch=$(arch)
- $(MAKE) -f Make.top leaf le=testing/timer arch=$(arch)
- $(MAKE) -f Make.top leaf le=testing/pmatgen arch=$(arch)
- $(MAKE) -f Make.top leaf le=testing/ptimer arch=$(arch)
- $(MAKE) -f Make.top leaf le=testing/ptest arch=$(arch)
#- $(MAKE) -f Make.top leaf le=testing/test arch=$(arch)
#
## refresh #############################################################
#
refresh_src :
- $(CP) makes/Make.auxil src/auxil/$(arch)/Makefile
- $(CP) makes/Make.blas src/blas/$(arch)/Makefile
- $(CP) makes/Make.comm src/comm/$(arch)/Makefile
- $(CP) makes/Make.grid src/grid/$(arch)/Makefile
- $(CP) makes/Make.panel src/panel/$(arch)/Makefile
- $(CP) makes/Make.pauxil src/pauxil/$(arch)/Makefile
- $(CP) makes/Make.pfact src/pfact/$(arch)/Makefile
- $(CP) makes/Make.pgesv src/pgesv/$(arch)/Makefile
#
refresh_tst :
- $(CP) makes/Make.matgen testing/matgen/$(arch)/Makefile
- $(CP) makes/Make.timer testing/timer/$(arch)/Makefile
- $(CP) makes/Make.pmatgen testing/pmatgen/$(arch)/Makefile
- $(CP) makes/Make.ptimer testing/ptimer/$(arch)/Makefile
- $(CP) makes/Make.ptest testing/ptest/$(arch)/Makefile
#- $(CP) makes/Make.test testing/test/$(arch)/Makefile
#
## clean ###############################################################
#
clean_src :
- ( $(CD) src/auxil/$(arch); $(MAKE) clean )
- ( $(CD) src/blas/$(arch); $(MAKE) clean )
- ( $(CD) src/comm/$(arch); $(MAKE) clean )
- ( $(CD) src/grid/$(arch); $(MAKE) clean )
- ( $(CD) src/panel/$(arch); $(MAKE) clean )
- ( $(CD) src/pauxil/$(arch); $(MAKE) clean )
- ( $(CD) src/pfact/$(arch); $(MAKE) clean )
- ( $(CD) src/pgesv/$(arch); $(MAKE) clean )
#
clean_tst :
- ( $(CD) testing/matgen/$(arch); $(MAKE) clean )
- ( $(CD) testing/timer/$(arch); $(MAKE) clean )
- ( $(CD) testing/pmatgen/$(arch); $(MAKE) clean )
- ( $(CD) testing/ptimer/$(arch); $(MAKE) clean )
- ( $(CD) testing/ptest/$(arch); $(MAKE) clean )
#- ( $(CD) testing/test/$(arch); $(MAKE) clean )
#
## clean_arch ##########################################################
#
clean_arch_src :
- $(RM) -r src/auxil/$(arch)
- $(RM) -r src/blas/$(arch)
- $(RM) -r src/comm/$(arch)
- $(RM) -r src/grid/$(arch)
- $(RM) -r src/panel/$(arch)
- $(RM) -r src/pauxil/$(arch)
- $(RM) -r src/pfact/$(arch)
- $(RM) -r src/pgesv/$(arch)
#
clean_arch_tst :
- $(RM) -r testing/matgen/$(arch)
- $(RM) -r testing/timer/$(arch)
- $(RM) -r testing/pmatgen/$(arch)
- $(RM) -r testing/ptimer/$(arch)
- $(RM) -r testing/ptest/$(arch)
#- $(RM) -r testing/test/$(arch)
#
## clean_arch_all ######################################################
#
clean_arch_all :
- $(MAKE) -f Make.top clean_arch_src arch=$(arch)
- $(MAKE) -f Make.top clean_arch_tst arch=$(arch)
- $(RM) -r bin/$(arch) include/$(arch) lib/$(arch)
#
## clean_guard #########################################################
#
clean_guard_src :
- ( $(CD) src/auxil/$(arch); $(RM) *.grd )
- ( $(CD) src/blas/$(arch); $(RM) *.grd )
- ( $(CD) src/comm/$(arch); $(RM) *.grd )
- ( $(CD) src/grid/$(arch); $(RM) *.grd )
- ( $(CD) src/panel/$(arch); $(RM) *.grd )
- ( $(CD) src/pauxil/$(arch); $(RM) *.grd )
- ( $(CD) src/pfact/$(arch); $(RM) *.grd )
- ( $(CD) src/pgesv/$(arch); $(RM) *.grd )
#
clean_guard_tst :
- ( $(CD) testing/matgen/$(arch); $(RM) *.grd )
- ( $(CD) testing/timer/$(arch); $(RM) *.grd )
- ( $(CD) testing/pmatgen/$(arch); $(RM) *.grd )
- ( $(CD) testing/ptimer/$(arch); $(RM) *.grd )
- ( $(CD) testing/ptest/$(arch); $(RM) *.grd )
#- ( $(CD) testing/test/$(arch); $(RM) *.grd )
#
## misc ################################################################
#
leaf :
- ( $(CD) $(le) ; $(MKDIR) $(arch) )
- ( $(CD) $(le)/$(arch) ; \
$(LN_S) $(TOPdir)/Make.$(arch) Make.inc )
#
########################################################################