
EnSimiAn - Ensemble Similarity Analysis library

Kameswarrao requested to merge met.3d.internal/met.3d:enSimiAn into master

This MR adds the enSimiAn library, written in Python. The library has modules for the stages involved in generating a clustering, namely data selection, dimensionality reduction, cluster creation, etc. It also generates sessions and pipelines (under the templates directory) that can be read into a compatible Met.3D version. The examples directory contains some config files that can be used for testing. The library code is under the 'utils/ensimian/enstools/metcluster' directory. If this path is non-standard, it needs to be added to PYTHONPATH. The driver script to execute is 'enSimiAn.py'. It needs to be set in the Met.3D GUI if the clustering is done interactively. This script needs to be in the Met.3D working directory. There are also a couple of utility scripts (.py) and configs (.json) under 'util/ensimian/' that need to be in the Met.3D working directory.

The enstools library from 'https://gitlab.physik.uni-muenchen.de/Kameswarrao.Modali/enstools_mkm' is a prerequisite for enSimiAn. The 'enstools/metcluster' directory needs to be moved inside the downloaded 'enstools' library as a package.

In 'Met.3D' mode, the user creates the config for the current project / case study with the help of the GUI, then selects 'enSimiAn.py' and runs it to create the clusters.

In 'enstools' mode, the user manually edits the config file (for example, case_vladiana*.json in utils/ensimian/examples). The directory where the data files are located, the regular expression for identifying the ensemble members, and the other parameters need to be set. 'enSimiAn.py' is then executed with this config file as input.
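As a rough illustration of the 'enstools' mode setup, the sketch below builds a config, uses its regular expression to pick ensemble member files out of a directory listing, and writes the config to a JSON file. All key names (`project`, `case`, `data_dir`, `member_regex`, `n_clusters`), the file names, and the regex itself are hypothetical; the actual schema is the one used by the example configs such as case_vladiana*.json.

```python
import json
import re
import tempfile
from pathlib import Path

# Hypothetical config: the real key names are defined by the example configs.
config = {
    "project": "demo_project",
    "case": "demo_case",
    "data_dir": "/path/to/ensemble/data",
    "member_regex": r"member_(\d+)\.nc$",
    "n_clusters": 3,
}


def find_members(filenames, pattern):
    """Map member id -> file name for every file name matching the regex."""
    regex = re.compile(pattern)
    members = {}
    for name in filenames:
        match = regex.search(name)
        if match:
            members[int(match.group(1))] = name
    return members


# Hypothetical directory listing; non-matching files are ignored.
listing = ["member_001.nc", "member_002.nc", "readme.txt", "member_010.nc"]
members = find_members(listing, config["member_regex"])

# Write the config to disk and read it back to confirm it is valid JSON.
config_path = Path(tempfile.mkdtemp()) / "case_demo.json"
config_path.write_text(json.dumps(config, indent=2))
reloaded = json.loads(config_path.read_text())
```

Checking the regex against a file listing before running the clustering is a cheap way to catch a pattern that silently matches no members.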

In either mode, the output is stored in a 'project/case/' hierarchy. A summary of all the clusters created is written to 'projectname_ClusterSummary.json'. This file is read by 'ensClusVis' in the next steps for visualizing the robustness.
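A minimal sketch of how a downstream tool might locate and read the summary file is given below. It assumes that the file sits directly in the project directory and that it contains a `clusters` object mapping a cluster label to a list of member ids; both the location and the schema are assumptions made for illustration, not the format written by enSimiAn.

```python
import json
import tempfile
from pathlib import Path


def summary_path(output_root, project):
    """Build the assumed location of the summary file for a project."""
    return Path(output_root) / project / f"{project}_ClusterSummary.json"


def member_to_cluster(summary):
    """Invert {label: [members]} into {member: label} (hypothetical schema)."""
    mapping = {}
    for label, members in summary["clusters"].items():
        for member in members:
            mapping[member] = label
    return mapping


with tempfile.TemporaryDirectory() as root:
    path = summary_path(root, "demo_project")
    # Create the project/case hierarchy with a made-up case name.
    (path.parent / "demo_case").mkdir(parents=True)
    # Write a made-up summary so the example is self-contained.
    path.write_text(json.dumps({"clusters": {"A": [1, 2], "B": [3]}}))
    summary = json.loads(path.read_text())
    file_name = path.name

lookup = member_to_cluster(summary)
```

A member-to-cluster lookup of this kind is the information a viewer needs in order to show, per ensemble member, which cluster it belongs to.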

In addition, the sessions and pipelines are created under an 'EnSimiAn' directory in the standard Met.3D path for sessions and pipelines. These also follow the 'project/case' hierarchy. Session files are created automatically for the standard deviation and spaghetti plot of the entire ensemble as well as of each cluster, in a quad or hex view layout depending on the number of clusters. Pipeline files are created automatically for the EOFs generated (stored as *.nc files) for the data used in cluster generation.
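The layout choice can be sketched as follows. The rule used here (one view for the entire ensemble plus one per cluster, quad when at most four views are needed, hex otherwise) is an assumption for illustration; the thresholds actually used by enSimiAn are not stated in this description, and `choose_layout` is a hypothetical name.

```python
def choose_layout(n_clusters):
    """Pick a view layout from the number of clusters (assumed rule)."""
    if n_clusters < 1:
        raise ValueError("at least one cluster is required")
    # Assumption: one view for the entire ensemble plus one view per cluster.
    views_needed = n_clusters + 1
    return "quad" if views_needed <= 4 else "hex"


layouts = {n: choose_layout(n) for n in range(1, 6)}
```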

The *ClusterSummary.json file can also be read in the Met.3D GUI. The member selection dialog for the variables in a multi-variable actor then displays the cluster to which each member belongs; this can be sorted and used to select the members of a particular cluster.

In interactive mode, the user can create clusters for each change of the timestep and pressure level.

Edited by Kameswarrao
