...

Commits (51)
.gitlab-ci.yml:

@@ -2,7 +2,7 @@ image: gcc:7
 before_script:
   - apt update
-  - apt -y install cmake libconfig++-dev libfftw3-dev libnetcdf-dev libcurl4-openssl-dev libopenmpi-dev doxygen openmpi-bin
+  - apt -y install cmake libconfig++-dev libfftw3-dev libnetcdf-dev libcurl4-openssl-dev libopenmpi-dev openmpi-bin

 build:
   stage: build
@@ -13,9 +13,12 @@ build:
     - make VERBOSE=yes
   artifacts:
     paths:
       - build/src/libstemsalabim_lib.so
-      - build/src/stemsalabim
-      - build/tests/stemsalabim_test
+      - build/src/stemsalabim
+      - build/tests/ssb-test
+      - build/src/ssb-mkin
+      - build/src/ssb-chk
+      - build/src/ssb-run
     expire_in: 2h
   cache:
     paths:
What's new
==========

STEMsalabim 5.0.0
-----------------

February 28th, 2019

**IMPORTANT** The parameters application.verbose and simulation.skip_simulation are deprecated now.
The groups adf/adf_intensities, cbed/cbed_intensities, and adf/center_of_mass now have a dimension for the energy loss.
Its size is usually 1 unless the plasmon scattering feature is used.

Highlights
^^^^^^^^^^

- Speed improvements by increasing the grid sizes to match efficient FFT sizes. Note that this may result in a higher
  simulation grid density than specified in the grating.density parameter!
- Alternative parallelization scheme, see :ref:`parallelization-scheme`. When appropriate, different MPI processes now
  calculate different frozen phonon configurations / defoci in parallel. This reduces the required amount of
  communication between the processors.
- Automatic calculation of the center of mass of the CBEDs for all ADF points. The COMs are calculated when
  adf.enabled = true and stored in the NC file next to adf/adf_intensities in adf/center_of_mass. The unit is mrad.
- New executables ssb-mkin and ssb-run. The former prepares an **input** NC file from which the latter can run the
  simulation. This has multiple advantages. See :ref:`simulation-structure` for more information.
- Single plasmon scattering.

Other changes
^^^^^^^^^^^^^

- Removed the application.verbose parameter.
- Removed simulation.skip_simulation.
- Ability to disable thermal displacements via the frozen_phonon.enable = false parameter.
- Fixed a serious bug with the integrated defocus averaging.
- Input XYZ files can now contain more than one space or TAB character for column separation.
- Removed the Doxygen documentation and doc string comments.
- Default FFTW planning is now FFTW_MEASURE. This improves startup times of the simulation slightly.
- Changed the chunking of the adf/adf_intensities and cbed/cbed_intensities variables for faster write speed.
- Added the AMBER/slice_coordinates variable to the output file, which contains the z coordinate of the upper boundary
  of each slice in nm.
- Removed HTTP reporting and the CURL dependency.
- Significant code refactoring and some minor bug fixes.
- Improved documentation.

STEMsalabim 4.0.1, 4.0.2
------------------------

...
CMakeLists.txt:

@@ -7,9 +7,9 @@
 # version, package name and cmake version
 # when you change logic here, don't forget to change the stuff in Sphinx conf.py!!
-set(PACKAGE_VERSION_MAJOR "4")
+set(PACKAGE_VERSION_MAJOR "5")
 set(PACKAGE_VERSION_MINOR "0")
-set(PACKAGE_VERSION_PATCH "2")
+set(PACKAGE_VERSION_PATCH "0")
 set(PACKAGE_NAME "STEMsalabim")
 set(PACKAGE_DESCRIPTION "A high-performance computing cluster friendly code for scanning transmission electron microscopy image simulations of thin specimens")
 set(PACKAGE_AUTHOR "Jan Oliver Oelerich")
@@ -18,7 +18,9 @@ set(PACKAGE_AUTHOR_EMAIL "[email protected]")
 project(STEMsalabim CXX)
 cmake_minimum_required(VERSION 3.3)
 set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} "${CMAKE_SOURCE_DIR}/cmake")
-set(CMAKE_CXX_STANDARD 11)
+set(CMAKE_CXX_STANDARD 17)
+set(CMAKE_CXX_STANDARD_REQUIRED ON)

 string(TIMESTAMP DATE "%Y-%m-%dT%H:%M:%S")
@@ -68,19 +70,16 @@ find_package(LibConfig REQUIRED)
 include_directories(${LIBCONFIG_INCLUDE_DIR})
 set(LIBS ${LIBS} ${LIBCONFIG_LIBRARIES})

+# look for GSL
+find_package(GSL REQUIRED)
+include_directories(${GSL_INCLUDE_DIRS})
+set(LIBS ${LIBS} ${GSL_LIBRARIES})
+
 # look for NetCDF
 find_package(NetCDF REQUIRED)
 include_directories(${NETCDF_INCLUDE_DIR})
 set(LIBS ${LIBS} ${NETCDF_LIBRARIES})

-# look for CURL
-find_package(CURL)
-if(CURL_FOUND)
-    include_directories(${CURL_INCLUDE_DIRS})
-    set(LIBS ${LIBS} ${CURL_LIBRARIES})
-    set(HAVE_CURL 1)
-endif(CURL_FOUND)
-
 # MPI
 find_package(MPI REQUIRED)
 include_directories(${MPI_INCLUDE_PATH})
FindGperftools.cmake:

# Tries to find Gperftools.
#
# Usage of this module as follows:
#
#     find_package(Gperftools)
#
# Variables used by this module, they can change the default behaviour and need
# to be set before calling find_package:
#
#  Gperftools_ROOT_DIR  Set this variable to the root installation of
#                       Gperftools if the module has problems finding
#                       the proper installation path.
#
# Variables defined by this module:
#
#  GPERFTOOLS_FOUND              System has Gperftools libs/headers
#  GPERFTOOLS_LIBRARIES          The Gperftools libraries (tcmalloc & profiler)
#  GPERFTOOLS_INCLUDE_DIR        The location of Gperftools headers

find_library(GPERFTOOLS_TCMALLOC NAMES tcmalloc HINTS ${Gperftools_ROOT_DIR}/lib)
find_library(GPERFTOOLS_PROFILER NAMES profiler HINTS ${Gperftools_ROOT_DIR}/lib)
find_library(GPERFTOOLS_TCMALLOC_AND_PROFILER NAMES tcmalloc_and_profiler HINTS ${Gperftools_ROOT_DIR}/lib)

find_path(GPERFTOOLS_INCLUDE_DIR NAMES gperftools/heap-profiler.h HINTS ${Gperftools_ROOT_DIR}/include)

set(GPERFTOOLS_LIBRARIES ${GPERFTOOLS_TCMALLOC_AND_PROFILER})

include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(Gperftools DEFAULT_MSG GPERFTOOLS_LIBRARIES GPERFTOOLS_INCLUDE_DIR)

mark_as_advanced(
    Gperftools_ROOT_DIR
    GPERFTOOLS_TCMALLOC
    GPERFTOOLS_PROFILER
    GPERFTOOLS_TCMALLOC_AND_PROFILER
    GPERFTOOLS_LIBRARIES
    GPERFTOOLS_INCLUDE_DIR)
\ No newline at end of file

FindLibConfig.cmake:

@@ -23,7 +23,7 @@ IF( LIBCONFIG_ROOT )
     FIND_LIBRARY(
         LIBCONFIG_LIBRARIES
-        NAMES "config++"
+        NAMES "config++" "libconfig++"
         PATHS ${LIBCONFIG_ROOT}
         PATH_SUFFIXES "lib" "lib64"
         NO_DEFAULT_PATH
@@ -41,7 +41,7 @@ ELSE( LIBCONFIG_ROOT )
     FIND_LIBRARY(
         LIBCONFIG_LIBRARIES
-        NAMES "config++"
+        NAMES "config++" "libconfig++"
         PATHS ${PKG_LIBCONFIG_LIBRARY_DIRS} ${INCLUDE_INSTALL_DIR}
     )

FindMKL.cmake:

@@ -18,8 +18,6 @@ include(FindPackageHandleStandardArgs)
 if((NOT MKL_ROOT) AND (DEFINED ENV{MKLROOT}))
     set(MKL_ROOT $ENV{MKLROOT} CACHE PATH "Folder contains MKL")
-else()
-    message( FATAL_ERROR "MKL not found! Specify MKL_ROOT!" )
 endif()

 if(${CMAKE_HOST_SYSTEM_PROCESSOR} STREQUAL "x86_64")
CMake documentation targets (the Doxygen targets are dropped; a single Sphinx "docs" target remains):

-# add a target to generate API documentation with Doxygen
-find_package(Doxygen)
-set(doxyfile_in ${CMAKE_CURRENT_SOURCE_DIR}/Doxyfile.in)
-set(doxyfile ${CMAKE_CURRENT_BINARY_DIR}/Doxyfile)
-configure_file(${doxyfile_in} ${doxyfile} @ONLY)
-
-add_custom_target(docs-source
-        COMMAND ${DOXYGEN_EXECUTABLE} ${doxyfile}
-        WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
-        COMMENT "Generating API documentation with Doxygen")
-
-add_custom_target(docs-manual
+add_custom_target(docs
         COMMAND sphinx-build -c ${CMAKE_CURRENT_SOURCE_DIR} -b html ${PROJECT_SOURCE_DIR}/docs manual
         WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
         COMMENT "Build html documentation"
         VERBATIM)
-
-add_custom_target(docs-manual-tex
-        COMMAND sphinx-build -c ${CMAKE_CURRENT_SOURCE_DIR} -b latex ${PROJECT_SOURCE_DIR}/docs manual-tex
-        WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
-        COMMENT "Build html documentation"
-        VERBATIM)
\ No newline at end of file

Custom Doxygen stylesheet belonging to the removed API documentation:

#wrap { width: 1170px; margin: 0 auto; position: relative; }
#titlearea { padding-bottom: 20px; }
#MSearchBox { top: 10px; }
.ui-resizable-handle.ui-resizable-e { background-image: none; background-color: #5373B4; cursor: default; width: 1px; }
#side-nav { padding-right: 0px; }
#nav-sync { display: none; }
#nav-tree, .header { background-image: none !important; }
#nav-tree .selected { background-image: none; background-color: #5373B4; text-shadow: none; }
.navpath ul { background-image: none; background-color: #F9FAFC; border: none; border-top: 1px solid #5373B4; padding: 10px 0 10px 0; }
#nav-path li { background-image: none; }

Doxygen HTML header template (removed): only its template variables remain here ($projectname, $title, $treeview, $search, $mathjax, $extrastylesheet, $projectbrief, $searchbox).
File formats
============

A STEMsalabim simulation is set up via **input files** and its results are stored in an **output file**. The file for
configuring a simulation is described in detail at :ref:`parameter-file`. Here, we describe the format of the **crystal
file**, i.e., the atomic information about the specimen, and the **output file**, in which the results are stored.

...

Output file format
------------------

All results of a STEMsalabim simulation are written to a binary NetCDF file. The NetCDF format is based on the
Hierarchical Data Format, and there are libraries to read the data for many programming languages.

The structure of NetCDF files can be inspected using the handy tool ncdump -h YOUR_FILE.nc (don't forget the -h
parameter, otherwise the whole content of the file is dumped!). Here is the output of an example run: ::

    netcdf out {

    group: AMBER {
      dimensions:
        atom = 164140 ;
        elements = 1 ;
        spatial = 3 ;
        cell_spatial = 3 ;
        cell_angular = 3 ;
        label = 6 ;
        frame = 10 ;
        slices = 142 ;
        grid_x = 490 ;
        grid_y = 490 ;
      variables:
        char spatial(spatial) ;
        char cell_spatial(cell_spatial) ;
        char cell_angular(cell_angular, label) ;
        float coordinates(frame, atom, spatial) ;
          coordinates:unit = "nanometer" ;
        float lattice_coordinates(frame, atom, spatial) ;
        float cell_lengths(frame, cell_spatial) ;
          cell_lengths:unit = "nanometer" ;
        float cell_angles(frame, cell_angular) ;
          cell_angles:unit = "degree" ;
        float radius(frame, atom) ;
          radius:unit = "nanometer" ;
        float msd(frame, atom) ;
        int slice(frame, atom) ;
        float slice_coordinates(slices) ;
        short element(frame, atom) ;
        float system_lengths(cell_spatial) ;
        float system_angles(cell_spatial) ;
        char atom_types(elements, label) ;

      // group attributes:
        :Conventions = "AMBER" ;
        :ConventionVersion = "1.0" ;
        :program = "STEMsalabim" ;
        :programVersion = "5.0.0b" ;
        :title = "sim" ;
      } // group AMBER

    group: runtime {
      // group attributes:
        :programVersionMajor = "5" ;
        :programVersionMinor = "0" ;
        :programVersionPatch = "0b" ;
        :gitCommit = "f1dcc606c9a78b12fc3afda9496f638992b591bf" ;
        :title = "sim" ;
        :UUID = "8dce768e-f1d6-4876-bb20-c301e3e323f8" ;
        :time_start = "2019-02-12 13:25:43" ;
        :time_stop = "2019-02-13 00:06:05" ;
      } // group runtime

    group: params {
      dimensions:
        defocus = 1 ;
        plasmon_energies = 51 ;
      variables:
        float defocus(defocus) ;
        float defocus_weights(defocus) ;
        float plasmon_energies(plasmon_energies) ;

      // group attributes:
        :program_arguments = "--params=inp.cfg --num-threads=4 --tmp-dir=/local --output-file=out.nc" ;
        :config_file_contents = "..." ;

      group: application {
        // group attributes:
          :random_seed = 967613772U ;
        } // group application

      group: simulation {
        // group attributes:
          :title = "sim" ;
          :normalize_always = 0US ;
          :bandwidth_limiting = 1US ;
          :output_file = "out.nc" ;
          :output_compress = 0US ;
        } // group simulation

      group: probe {
        // group attributes:
          :c5 = 5000000. ;
          :cs = 2000. ;
          :astigmatism_ca = 0. ;
          :defocus = -0. ;
          :fwhm_defoci = 6. ;
          :num_defoci = 1U ;
          :astigmatism_angle = 0. ;
          :min_apert = 0. ;
          :max_apert = 15.07 ;
          :beam_energy = 200. ;
          :scan_density = 40. ;
        } // group probe

      group: specimen {
        // group attributes:
          :max_potential_radius = 0.3 ;
          :crystal_file = "Si_110_10x10x200_300K.xyz" ;
        } // group specimen

      group: grating {
        // group attributes:
          :density = 90. ;
          :nx = 490U ;
          :ny = 490U ;
          :slice_thickness = 0.76806 ;
        } // group grating

      group: adf {
        // group attributes:
          :enabled = 1US ;
          :x = 0.5, 0.6 ;
          :y = 0.5, 0.6 ;
          :detector_min_angle = 0. ;
          :detector_max_angle = 150. ;
          :detector_num_angles = 151U ;
          :detector_interval_exponent = 1.f ;
          :average_configurations = 1US ;
          :average_defoci = 1US ;
          :save_slices_every = 10U ;
        } // group adf

      group: cbed {
        // group attributes:
          :enabled = 1US ;
          :x = 0.5, 0.6 ;
          :y = 0.5, 0.6 ;
          :size = 0U, 0U ;
          :average_configurations = 1US ;
          :average_defoci = 0US ;
          :save_slices_every = 101U ;
        } // group cbed

      group: frozen_phonon {
        // group attributes:
          :number_configurations = 10U ;
          :fixed_slicing = 1US ;
          :enabled = 1US ;
        } // group frozen_phonon

      group: plasmon_scattering {
        // group attributes:
          :enabled = 1US ;
          :simple_mode = 0US ;
          :plural_scattering = 0US ;
          :max_energy = 25.f ;
          :energy_grid_density = 2.f ;
          :mean_free_path = 128.f ;
          :plasmon_energy = 16.9f ;
          :plasmon_fwhm = 4.f ;
        } // group plasmon_scattering
      } // group params

    group: adf {
      dimensions:
        adf_position_x = 22 ;
        adf_position_y = 22 ;
        adf_detector_angle = 151 ;
        adf_defocus = 1 ;
        adf_phonon = 1 ;
        adf_slice = 15 ;
        coordinate_dim = 2 ;
        adf_plasmon_energies = 51 ;
      variables:
        float adf_intensities(adf_defocus, adf_position_x, adf_position_y, adf_phonon, adf_slice, adf_plasmon_energies, adf_detector_angle) ;
        float center_of_mass(adf_defocus, adf_position_x, adf_position_y, adf_phonon, adf_slice, adf_plasmon_energies, coordinate_dim) ;
        double adf_probe_x_grid(adf_position_x) ;
        double adf_probe_y_grid(adf_position_y) ;
        double adf_detector_grid(adf_detector_angle) ;
        double adf_slice_coords(adf_slice) ;
      } // group adf

    group: cbed {
      dimensions:
        cbed_position_x = 22 ;
        cbed_position_y = 22 ;
        cbed_k_x = 327 ;
        cbed_k_y = 327 ;
        cbed_defocus = 1 ;
        cbed_phonon = 1 ;
        cbed_slice = 2 ;
        cbed_plasmon_energies = 51 ;
      variables:
        float cbed_intensities(cbed_defocus, cbed_position_x, cbed_position_y, cbed_phonon, cbed_slice, cbed_plasmon_energies, cbed_k_x, cbed_k_y) ;
        double cbed_probe_x_grid(cbed_position_x) ;
        double cbed_probe_y_grid(cbed_position_y) ;
        double cbed_x_grid(cbed_k_x) ;
        double cbed_y_grid(cbed_k_y) ;
        double cbed_slice_coords(cbed_slice) ;
      } // group cbed
    }

The structure of NetCDF files is hierarchical and organized in groups. The following groups are written by
STEMsalabim:

AMBER
~~~~~

This group contains the atomic coordinates, species, displacements, radii, etc. for the complete crystal for each single
calculated frozen lattice configuration, as well as for each calculated defocus value. The AMBER group content is
compatible with the AMBER specifications. A STEMsalabim NetCDF file can be opened seamlessly with the Ovito crystal
viewer.

.. csv-table::
   :file: table_nc_amber.csv

runtime
~~~~~~~

.. csv-table::
   :file: table_nc_runtime.csv

params
~~~~~~

All simulation parameters are collected in the params group as attributes.

.. note:: The params group contains subgroups with attributes that correspond exactly to the simulation parameters
   as written, except

   - /params/application/random_seed is set to the generated random seed
   - /params/grating/nx and /params/grating/ny contain the simulation grid size used.

.. csv-table::
   :file: table_nc_params.csv

adf
~~~

This group contains the simulated ADF intensities, the coordinates of the electron probe beam during scanning, the
detector angle grid that is used, and coordinates of the slices as used in the multi-slice algorithm.

.. csv-table::
   :file: table_nc_adf.csv

cbed
~~~~

This group contains the simulated CBED intensities, the coordinates of the electron probe beam during scanning, the
k-space grid, and coordinates of the slices as used in the multi-slice algorithm.

.. csv-table::
   :file: table_nc_cbed.csv

Reading NC Files
----------------

...
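For orientation, here is a small Python sketch that turns adf/adf_intensities from the dump above into an annular dark
field image by summing detector bins. It assumes the netCDF4 package, an output file called out.nc, and a 60 to 150 mrad
collection range; none of these are prescribed by STEMsalabim, they are placeholders for illustration: ::

    # Sketch: build an ADF image from adf/adf_intensities (dimensions as in the ncdump output above).
    import numpy as np
    import netCDF4 as nc

    with nc.Dataset("out.nc", "r") as f:
        adf = f.groups["adf"]
        intens = np.asarray(adf.variables["adf_intensities"][:])    # (defocus, x, y, phonon, slice, energy, angle)
        angles = np.asarray(adf.variables["adf_detector_grid"][:])  # lower angles of the detector bins [mrad]

    # First defocus/phonon entry, exit plane (last stored slice), summed over the energy-loss channels.
    img = intens[0, :, :, 0, -1, :, :].sum(axis=2)   # -> (positions_x, positions_y, detector_angle)
    mask = (angles >= 60) & (angles <= 150)          # hypothetical HAADF angular range
    haadf = img[:, :, mask].sum(axis=2)
    print(haadf.shape)                               # (positions_x, positions_y)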
General information
-------------------

Throughout this documentation we assume that you are familiar with the theoretical background behind the scanning
transmission electron microscope (STEM) to some degree. Also, we assume that you have some knowledge about the
UNIX/Linux command line and parallelized computation. STEMsalabim is currently not intended to be run on a desktop
computer. While that is possible and works, the main purpose of the program is to be used in a highly parallelized
multi-computer environment.

We took great care of making STEMsalabim easy to install. You can find instructions at :ref:`installing`. However, if
you run into technical problems you should seek help from an administrator of your computer cluster first.

.. _simulation-structure:

Structure of a simulation
-------------------------

The essence of STEMsalabim is to model the interaction of a focused electron beam with a bunch of atoms, typically
in the form of a crystalline sample. Given the necessary input files, the simulation crunches numbers for some time,
after which all of the calculated results can be found in the output file. Please refer to :ref:`running` for notes
on how to start a simulation.

Input files
~~~~~~~~~~~

All information about the specimen is listed in the :ref:`crystal-file`, which is one of the two required input files
for STEMsalabim. It contains each atom's species (element), coordinates, and mean square displacement as it appears
in the Debye-Waller factors.

...

Output files
~~~~~~~~~~~~

The complete output of a STEMsalabim simulation is written to a NetCDF file. NetCDF is a binary, hierarchical file
format for scientific data, based on HDF5. NetCDF/HDF5 allow us to compress the output data and store it in a
machine-readable, organized format while still only having to deal with a single output file.

You can read more about the output file structure at :ref:`output-file`.

...

Hybrid Parallelization model
----------------------------

STEMsalabim simulations are parallelized both via POSIX threads and via the message passing interface (MPI). A typical
simulation will use both schemes at the same time: MPI is used for communication between the computing nodes, and
threads are used for intra-node parallelization, the usual multi-cpu/multi-core structure.

...

Let us assume a simulation that runs on :math:`M` computers and each of them spawns :math:`N` threads.

Depending on the simulation parameters chosen, STEMsalabim may need to loop through multiple frozen phonon
configurations and values of the probe defocus. The same simulation (with differently displaced atoms and a different
probe defocus) is therefore typically run multiple times. There are three parallelization schemes implemented in
STEMsalabim (a toy sketch of the work-package logic follows at the end of this section):

- When :math:`M == 1`, i.e., no MPI parallelization is used, all pixels (probe positions) are distributed among the
  :math:`N` threads and calculated in parallel.
- Each MPI process calculates *all* pixels (probe positions) of its own frozen phonon / defocus configuration, i.e.,
  :math:`M` configurations are calculated in parallel. Each of the :math:`M` calculations splits its pixels between
  :math:`N` threads (each thread calculates one pixel at a time). This scheme makes sense when the total number of
  configurations (probe.num_defoci :math:`\times` frozen_phonon.number_configurations) is much larger than or
  divisible by :math:`M`.
- A single configuration is calculated at a time, and all the pixels are split between all :math:`M \times N` threads.
  In order to reduce the required MPI communication, only the main thread of each of the :math:`M` MPI processes
  communicates with the master thread. The master thread sends a *work package* containing some number of probe pixels
  to be calculated to an MPI process, which then carries out all the calculations in parallel on its :math:`N` threads.
  When a work package is finished, it requests another work package from the master MPI process until there is no work
  left. In parallel, the worker threads of the MPI process with rank 0 also work on emptying the work queue.

In MPI mode, each MPI process writes results to its own temporary file, and after each frozen lattice configuration the
results are merged. Merging is carried out sequentially by each individual MPI processor, to avoid race conditions.
The parameter :code:`output.tmp_dir` (see :ref:`parameter-file`) should be set to a directory that is local
to each MPI processor (e.g., :code:`/tmp`).

.. note:: Within one MPI processor, the threads can share their memory. As the main memory consumption comes from storing
   the weak phase objects of the slices in the multi-slice simulation, which don't change during the actual simulation,
   this greatly reduces memory usage as compared to MPI only parallelization. You should therefore always aim for
   ...
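The work-package logic of the last scheme can be pictured with a short, purely illustrative Python sketch. This is not
STEMsalabim code; the pixel count and package size are made-up values that merely mirror the 22x22 scan and the
--package-size=160 example used elsewhere in this documentation: ::

    # Illustrative only: how a master rank might split probe positions into work packages.
    from collections import deque

    def make_packages(num_pixels, package_size):
        """Group all probe positions into packages of at most package_size pixels."""
        pixels = list(range(num_pixels))
        return deque(pixels[i:i + package_size] for i in range(0, num_pixels, package_size))

    queue = make_packages(num_pixels=22 * 22, package_size=160)
    while queue:
        package = queue.popleft()  # the master hands one package to the next requesting MPI process
        # ... each MPI process would then distribute these pixels over its N local threads ...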
The following documentation page was removed in this release together with the HTTP reporting feature:

.. _reporting:

HTTP Status reporting
=====================

*STEMsalabim* simulations may take a long time, even when running them in parallel on many processors. In order to ease
tracking of the status of running simulations, we built reporting via HTTP POST requests into the program. In order to
use that feature, the libCURL library has to be installed and *STEMsalabim* needs to be linked against it.

.. _configure-reporting:

Configure HTTP reporting
------------------------

To configure HTTP reporting, please add the http_reporting: { } block to your simulation's parameter file, containing
at least reporting = true; and the url to report to, url = "http://my_server_address:port/path";.

If you want to use HTTP basic authentication, you may also specify the options auth_user = "your_user"; and
auth_pass = "your_pass";. Note that HTTP basic auth will be enabled as soon as auth_user is not empty. You should
therefore only fill in that field when you want to use authentication.

Additional, custom payload for the HTTP requests may be specified in the sub-block parameters: { }. Each key-value pair
in this block is translated into JSON and appended to each request. This allows you to use custom authentication
techniques, such as token-based authentication etc.

An example configuration block with HTTP basic authentication may look like this: ::

    http_reporting: {
        reporting = true;
        url = "http://my_api_endpoint:8000/stemsalabim-reporting";
        auth_user = "my_user";
        auth_pass = "my_pass";
        parameters: {
            simulation_category = "suitable for many Nature papers";
        }
    }

The status requests
-------------------

In each request that *STEMsalabim* sends, some JSON payload is common. In addition to the JSON values specified in the
parameter file (see :ref:`configure-reporting`), the following parameters are always reported: ::

    time:               // The current date and time
    id:                 // the UUID of the simulation
    num_threads:        // the number of threads of each MPI processor
    num_processors:     // the number of MPI processors
    num_defoci:         // the total number of defoci to calculate
    num_configurations: // the total number of frozen phonon configurations to calculate
    event:              // A code for what event is reported. see below.

The following event codes, each for a different event, are reported:

START_SIMULATION: A simulation is started
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This request is sent at the beginning of a simulation. Additional key/value pairs sent are: ::

    event:      "START_SIMULATION"
    version:    // program version
    git_commit: // git commit hash of the program version
    title:      // simulation title

START_DEFOCUS: A defocus iteration is started
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This request is sent at the beginning of a defocus iteration. Additional key/value pairs sent are: ::

    event:         "START_DEFOCUS"
    defocus:       // the defocus value in nm
    defocus_index: // the index of the defocus, between 0 and num_defoci

START_FP_CONFIGURATION: A frozen phonon iteration is started
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This request is sent at the beginning of a frozen phonon configuration. Additional key/value pairs sent are: ::

    event:               "START_FP_CONFIGURATION"
    defocus:             // the defocus value in nm
    defocus_index:       // the index of the defocus, between 0 and num_defoci
    configuration_index: // the index of the configuration, between 0 and num_configurations

PROGRESS: Progress report
~~~~~~~~~~~~~~~~~~~~~~~~~

This request is sent during the calculation, typically after each integer percent of the simulation finished.
Additional key/value pairs sent are: ::

    event:               "START_CONFIGURATION"
    defocus:             // the defocus value in nm
    defocus_index:       // the index of the defocus, between 0 and num_defoci
    configuration_index: // the index of the configuration, between 0 and num_configurations
    progress:            // progress between 0 and 1 of this configuration iteration within this defocus iteration

FINISH: Simulation finished
~~~~~~~~~~~~~~~~~~~~~~~~~~~

This request is sent when the simulation finished. Additional key/value pairs sent are: ::

    event: "FINISH"

How to process the reports
--------------------------

Obviously, in order to register the requests, an HTTP(S) server needs to be running on the target machine. For example,
a very simple server in python using the http://flask.pocoo.org/ package, that only echoes the requests, can be
implemented as: ::

    #!/usr/bin/env python

    from flask import Flask
    from flask import request
    import json

    app = Flask(__name__)

    @app.route('/', methods=['POST'])
    def echo():
        content = request.get_json()
        print(json.dumps(content, indent=4))
        return ""

    if __name__ == "__main__":
        app.run()

Run the script and then start a *STEMsalabim* simulation to see requests incoming.
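For testing such a server without a running simulation, a request of the kind described above can also be crafted by
hand. The following Python sketch is only an illustration of the payload format (the reporting feature is removed in
5.0.0): the requests package, the endpoint URL, and all values are placeholders taken from the examples in this
documentation, not output of the actual program: ::

    # Sketch: manually send one START_SIMULATION-style status report to a test server.
    import requests

    payload = {
        "event": "START_SIMULATION",
        "time": "2019-02-12 13:25:43",                      # current date and time
        "id": "8dce768e-f1d6-4876-bb20-c301e3e323f8",       # simulation UUID
        "num_threads": 4,
        "num_processors": 8,
        "num_defoci": 1,
        "num_configurations": 10,
        "version": "4.0.2",
        "git_commit": "f1dcc606c9a78b12fc3afda9496f638992b591bf",
        "title": "sim",
    }
    requests.post("http://my_api_endpoint:8000/stemsalabim-reporting", json=payload, timeout=5)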
STEMsalabim
===========

The STEMsalabim software aims to provide accurate scanning transmission electron microscopy (STEM) image simulation of
a specimen whose atomic structure is known. It implements the frozen lattice multi-slice algorithm as described in
great detail in the book Advanced computing in electron microscopy by Earl J. Kirkland.

...

While there are multiple existing implementations of the same technique, at the ...
suitable for leveraging massive parallelization available on high-performance computing (HPC) clusters, making it
possible to simulate large supercells and parameter sweeps in reasonable time.

The purpose of STEMsalabim is to fill this gap by providing a multi-slice implementation that is well parallelizable
both within and across computing nodes, using a mixture of threaded parallelization and message passing interface (MPI).

.. toctree::
   :maxdepth: 2
   :caption: Getting Started

   what
   install
   usage
   visualization
   bla

.. toctree::
   :maxdepth: 2
   :caption: More information

   general
   parameters
   file_formats
   faq
   whats_new
   citing

...

Contact us!
===========

STEMsalabim is a relatively young software package and was not heavily tested outside the scope of our group.
We are glad to help you getting your simulations to run.
Please contact **strl-stemsalabim [at] lists.uni-marburg.de** for support or feedback.

Credits
=======

* We acknowledge the creators of the supplementary libraries that STEMsalabim depends on.
* We would also like to acknowledge the creators of STEMsim, which we used as a reference implementation to test
  STEMsalabim.
* Once again, we would like to highlight the book Advanced computing in electron microscopy by Earl J. Kirkland for
  its detailed description of the implementation of multi-slice algorithms.
* STEMsalabim was written in the Structure & Technology Research Laboratory of the Philipps-Universität Marburg
  with financial support by the German Research Foundation.
Installing STEMsalabim
======================

Requirements
------------

The following libraries and tools are needed to successfully compile the code:

* A C++11 compiler (such as gcc/g++ or the Intel compiler suite).
* CMake > 3.3
* NetCDF
* libConfig >= 1.5
* FFTW3 or Intel's MKL
* An MPI implementation (such as OpenMPI)

.. note:: You may find some of the requirements in the repositories of your Linux distribution, at least the compiler,
   CMake, and OpenMPI. On Debian or Ubuntu Linux, for example, you can simply run the following command
   to download and install all the requirements: ::

       ...
           libconfig++-dev \
           libfftw3-dev \
           libnetcdf-dev \
           libopenmpi-dev \
           openmpi-bin

.. Tip:: Most of the computing time is spent calculating Fourier transforms, so it is beneficial for STEMsalabim
   to use optimized FFT libraries. Sometimes, compiling FFTW or MKL on the target machine enables
   optimizations that are not available in precompiled binaries, so this may be worth a try.

Downloading the source code
---------------------------

We recommend you download the latest stable release (|release|) from the Releases page. If you want the latest
features and/or bugfixes, you can also clone the repository using ::

    $ git clone https://gitlab.com/STRL/STEMsalabim.git
    $ git checkout devel # only if you want the devel code.

Building STEMsalabim
...

Building with Intel MKL, Intel compiler (and Intel MPI)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It is possible to use the Intel® Parallel Studio for compilation, which includes the Intel® Math Kernel Library (MKL)
that STEMsalabim can use for discrete Fourier transforms instead of FFTW3. If the Intel® MPI Library is also available,
it can be used for MPI communication.

.. note:: We have tested compiling and running STEMsalabim only with Parallel Studio 2017 so far.

STEMsalabim's CMake files try to find the necessary libraries themselves when the following conditions are true:

1. Either the environment variable :code:`MKLROOT` is set to a valid install location of the MKL, or
   the CMake variable :code:`MKL_ROOT` (pointing at the same location) is specified.

...

For example, let's say the Intel suite is installed in :code:`/opt/intel` and we ... ::

    $ export PATH=$PATH:/opt/intel/...   # mpicxx and icpc should be in the path!
    $ LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/gcc-6.3/lib64 \
        cmake ../source \
            -DMKL_ROOT=/opt/intel \
            -DCMAKE_CXX_COMPILER=icpc \
            -DGCCDIR=/opt/gcc-6.3 \
            -D... more CMAKE arguments as described above.

Depending on how your environment variables are set, you may be able to skip the :code:`LD_LIBRARY_PATH=..` part.

When STEMsalabim is executed, you may again need to specify the library path of :code:`libstdc++`, using ::

    $ LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/gcc-6.3/lib64 mpirun -np ... /path/to/stemsalabim -p ...
table_nc_adf.csv:

**Dimensions**,
adf_position_x,Number of probe positions in x direction
adf_position_y,Number of probe positions in y direction
adf_detector_angle,Number of stored detector angle bins
adf_defocus,Number of stored defoci (1 when averaged over defoci)
adf_phonon,Number of stored frozen phonon configurations (1 when averaged over configurations)
adf_slice,Number of stored slices
coordinate_dim,"x,y coordinate dimension (2)"
**Variables**,
adf_intensities,ADF intensities of each pixel [fraction of beam]
center_of_mass,Center of mass of each pixel [mrad]
adf_probe_x_grid,Position vector of the probe in x direction [nm]
adf_probe_y_grid,Position vector of the probe in y direction [nm]
adf_detector_grid,Lower angles of the detector bins [mrad]
adf_slice_coords,Coordinates of the stored slices [nm]
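The center_of_mass variable listed above can be read like any other NetCDF variable. A minimal Python sketch, assuming
the netCDF4 package, an output file named out.nc, and that the first energy-loss channel is the only one of interest
(all of which are assumptions for illustration): ::

    # Sketch: read the ADF center-of-mass maps (unit: mrad) from an output file.
    import numpy as np
    import netCDF4 as nc

    with nc.Dataset("out.nc", "r") as f:
        adf = f.groups["adf"]
        # (defocus, x, y, phonon, slice, energy_loss, 2); the last axis holds the x/y components.
        com = np.asarray(adf.variables["center_of_mass"][:])

    # First defocus/phonon entry, exit plane, first energy-loss channel.
    com_xy = com[0, :, :, 0, -1, 0, :]
    com_magnitude = np.hypot(com_xy[..., 0], com_xy[..., 1])
    print(com_magnitude.shape)   # (positions_x, positions_y)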
table_nc_amber.csv:

**Attributes**,
Conventions,"String ""AMBER"" (required for AMBER)"
ConventionVersion,Version of the AMBER spec.
program,"Program name (""STEMsalabim"")"
programVersion,STEMsalabim's version
title,Simulation title (Param simulation.title)
**Dimensions**,
atom,Number of atoms
elements,Number of different elements
spatial,Number of spatial dimensions (3)
cell_spatial,Number of spatial dimensions (3)
cell_angular,Number of spatial dimensions (3)
label,Character String for element names (6)
frame,Number of frozen phonon configurations * number of defoci
slices,Number of slices in the multi-slice approximation
grid_x,Number of simulation grid points in x direction
grid_y,Number of simulation grid points in y direction
**Variables**,
spatial,"Names of the spatial dimensions (""x,y,z"")"
cell_spatial,"Names of the spatial cell parameters (""a,b,c"")"
cell_angular,"Names of the cell angles (""alpha,beta,gamma"")"
coordinates,Coordinates of the atoms [nm]
lattice_coordinates,Equilibrium coordinates of the atoms (i.e. lattice positions without displacements) [nm]
cell_lengths,Cell lengths (Same for each frame) [nm]
cell_angles,"Cell angles (Same for each frame, always ""90, 90, 90"") [degree]"
radius,Radii of each atom [nm]
msd,Mean square displacement of each atom [nm^2]
slice,Slice id of each atom
slice_coordinates,z-Coordinate of each slice [nm]
element,Element id of each atom (see atom_types)
system_lengths,Cell lengths [nm]
system_angles,Cell angles [degree]
atom_types,Description of atom types
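The AMBER group can be read the same way. A minimal Python sketch, again assuming the netCDF4 package and a file named
out.nc (both placeholders): ::

    # Sketch: read the atomic coordinates of the first frozen lattice frame from the AMBER group.
    import netCDF4 as nc

    with nc.Dataset("out.nc", "r") as f:
        amber = f.groups["AMBER"]
        coords = amber.variables["coordinates"][0, :, :]   # (atom, spatial) in nm, frame 0
        slices = amber.variables["slice_coordinates"][:]   # upper z boundary of each slice in nm
        print(coords.shape, slices.shape)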
table_nc_cbed.csv:

**Dimensions**,
cbed_position_x,Number of probe positions in x direction
cbed_position_y,Number of probe positions in y direction
cbed_k_x,Number of k grid points in k_x direction
cbed_k_y,Number of k grid points in k_y direction
cbed_defocus,Number of stored defoci (1 when averaged over defoci)
cbed_phonon,Number of stored frozen phonon configurations (1 when averaged over configurations)
cbed_slice,Number of stored slices
**Variables**,
cbed_intensities,CBED intensities of each pixel [fraction of beam]
cbed_probe_x_grid,Position vector of the probe in x direction [nm]
cbed_probe_y_grid,Position vector of the probe in y direction [nm]
cbed_x_grid,Angles of the k_x grid [mrad]
cbed_y_grid,Angles of the k_y grid [mrad]
cbed_slice_coords,Coordinates of the stored slices [nm]
table_nc_params.csv:

**Attributes**,
program_arguments,CLI arguments of this run
config_file_contents,Contents of the parameter file of this run.
**Dimensions**,
defocus,Number of defoci (param probe.num_defoci)
**Variables**,
defocus,The values of the defoci of the defocus series
defocus_weights,The weights for defocus averaging corresponding to each defocus
table_nc_runtime.csv:

**Attributes**,
programVersionMajor,Major version of STEMsalabim
programVersionMinor,Minor version of STEMsalabim
programVersionPatch,Patch version of STEMsalabim
gitCommit,Commit hash of the git commit of STEMsalabim
title,Simulation title (Param simulation.title)
UUID,Automatically generated universally unique identifier of this run.
time_start,Start time of the simulation run
time_stop,Finish time of the simulation run
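Because the params group mirrors the parameter file, it is easy to recover the settings a result was produced with. A
minimal Python sketch, assuming the netCDF4 package and a file named out.nc (both placeholders); the attribute names
follow the params table and the example ncdump output above: ::

    # Sketch: recover simulation parameters from the params group of an output file.
    import netCDF4 as nc

    with nc.Dataset("out.nc", "r") as f:
        params = f.groups["params"]
        probe = params.groups["probe"]
        grating = params.groups["grating"]
        print("beam energy [keV]:", probe.getncattr("beam_energy"))
        print("grid size:", grating.getncattr("nx"), "x", grating.getncattr("ny"))
        # The full parameter file used for the run is stored verbatim as an attribute:
        cfg_text = params.getncattr("config_file_contents")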
.. _running:

Running STEMsalabim
===================

STEMsalabim is executed on the command line and configured via input configuration files in libConfig syntax. To learn
about the structure of the configuration files, please read :ref:`parameter-file`.

.. note:: Some of the configuration parameters can be changed via command line parameters, which are described in
   :ref:`cli-parameters`.

STEMsalabim supports both threaded (shared memory) and MPI (distributed memory) parallelization. For most efficient
resource usage we recommend a hybrid approach, where one MPI task is run per node that spawns a bunch of threads to
parallelize the work within the node. (See :ref:`parallelization-scheme` for more information on how STEMsalabim
can be parallelized.)

Parallel runs
-------------

Thread-only parallelization
^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can execute STEMsalabim on a single multi-core computer as follows: ::

    $ stemsalabim --params=./my_config_file.cfg --num-threads=32

This will run the simulation configured in my_config_file.cfg on 32 cores, of which 31 are used as workers.

MPI only parallelization
^^^^^^^^^^^^^^^^^^^^^^^^

For pure MPI parallelization without spawning additional threads, STEMsalabim must be called via mpirun or
mpiexec, depending on the MPI implementation available on your machine: ::

    ...

This command will run the simulation in parallel on 32 MPI processors without spawning additional threads. ...
reduces management overhead but increases the amount of data sent via the network.

Hybrid parallelization
^^^^^^^^^^^^^^^^^^^^^^

Hybrid parallelization is the recommended mode to run STEMsalabim.
For hybrid parallelization, make sure that on each node only a single MPI process is spawned and that there is no CPU
pinning active, i.e., STEMsalabim needs to be able to spawn threads on different cores.

For example, if we wanted to run a simulation in parallel on 32 machines using OpenMPI and on each machine use 16 cores,
we would run ::

    ...
        --package-size=160

The options --bind-to none --map-by ppr:1:node:pe=16 tell OpenMPI not to bind the process to anything and to reserve
16 threads for each instance. Please refer to the manual of your MPI implementation to figure out how to start a hybrid
parallelization run. On computing clusters, node and/or socket topology may affect performance, so it is wise to
consult your cluster admin team.

.. _Si_001:

:code:`Si 001` example
----------------------

In the source code archive you find an :code:`examples/Si_001` folder that contains a simple example that you can
execute to get started. The file :code:`Si_001.xyz` describes a 2x2x36 unit cell Si sample. Please see
:ref:`crystal-file` for the format description.

In the file :code:`Si_001.cfg` we find the simulation configuration / parameters. The file contains
all available parameters, regardless of whether they are set to their default value. We recommend to always
specify a complete set of simulation parameters in the configuration files.

You can now run the simulation: ::

    $ /path/to/stemsalabim --params Si_001.cfg --num-threads=8

After the simulation finished (about 3 hours on an Intel i7 CPU with 8 cores) you can analyze the
results found in :code:`Si_001.nc`. Please see the next page (:ref:`visualize`) for details.

ssb-mkin and ssb-run
--------------------

Along with the main stemsalabim binary, the ssb-mkin and ssb-run tools are also compiled and put into your bin/
directory.

ssb-run can be used to start a STEMsalabim simulation from an existing NetCDF file. Results in the file are discarded
and all required parameters are read from the file. Most importantly, the generated atomic displacements for all the
frozen phonon configurations are read from the file, so that starting from a NetCDF file ssb-run should always produce
the *exact same results*.

ssb-mkin is the complementary tool to ssb-run, generating an *input* NetCDF file from a parameter file (see
:ref:`parameter-file`) and a crystal file (see :ref:`crystal-file`). The output of ssb-mkin is identical to the output
of stemsalabim, except that it doesn't contain any results. ::

    $ /path/to/ssb-mkin --params Si_001.cfg --output-file Si_001.nc
    $ /path/to/ssb-run --params Si_001.nc --num-threads=8

The above two commands are identical to the example in :ref:`Si_001`. The intermediate Si_001.nc file is small (as it
contains no results) and contains everything required to start the simulation.
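If you want to double-check what ssb-mkin produced before submitting a long ssb-run job, a small Python sketch along
these lines can help. The netCDF4 package and the Si_001.nc file name are assumptions; the group and variable names
follow the output-file description in :ref:`output-file`: ::

    # Sketch: peek into an input file produced by ssb-mkin before handing it to ssb-run.
    import netCDF4 as nc

    with nc.Dataset("Si_001.nc", "r") as f:
        print("groups:", list(f.groups))            # parameters and crystal data; no results are stored yet
        amber = f.groups["AMBER"]
        # One frame per frozen phonon configuration and defocus value, displacements already generated.
        print("stored frames:", len(amber.dimensions["frame"]))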