Full program characterization is difficult and not always relevant. Typical High Performance Computing (HPC) programs, for example, contain tens of thousands of lines of code, if not millions. A better way to characterize an application is to consider each hot-spot separately, in a smaller and more tractable environment. Profiling tools such as Gprof or VTune allow an engineer to distinguish frequently executed code from rarely used segments.
Once this initial profiling step has isolated the hot-spots, i.e. the frequently executed code segments, the engineer can focus on tweaking, optimizing, or even debugging the target code.
A codelet is a small program composed of:
1. The code fragment extracted from the application.
1. The application’s data required for its execution.
1. A wrapper, which loads the original application data, restores the memory state, and calls the code fragment.
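The wrapper's role can be sketched as follows. This is an illustrative stand-in only: a real wrapper restores a captured memory image and calls an extracted fragment (typically compiled C or Fortran), whereas here the fragment, the pickle-based data format, and all names are hypothetical.

```python
import os
import pickle
import tempfile

def fragment(a):
    # Stand-in for the extracted code fragment (hypothetical);
    # in practice this is the hot loop lifted from the application.
    return sum(x * x for x in a)

def run_codelet(data_path):
    # 1. Load the application data captured at extraction time.
    with open(data_path, "rb") as f:
        data = pickle.load(f)
    # 2. Re-establish the state the fragment expects
    #    (here, trivially, the loaded list itself).
    # 3. Call the code fragment on that state.
    return fragment(data["a"])

# Capture step: persist the data the fragment needs.
with tempfile.NamedTemporaryFile(suffix=".pkl", delete=False) as f:
    pickle.dump({"a": [1.0, 2.0, 3.0]}, f)
    path = f.name

result = run_codelet(path)
os.unlink(path)
```

The key property the sketch preserves is that the codelet runs from saved data alone, without the rest of the original application.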
By limiting the amount of code considered, codelets reduce compilation, iterative testing, and even profiling time. However, handling each codelet separately is inefficient, so a means of managing them in a unified way is necessary. The solution should satisfy three different goals:
1. Selecting a working set of experiments in the repository: for instance, all the experiments performed on a particular architecture for a selected program.
1. Transforming the data: most of the time, the user does not want to see the raw data, but prefers a statistical summary of it. For execution run-times, for example, the interest may only be in the mean and variance.
1. Performing some action on the data: this can be a simple export to a CSV file, visualizing the data as a table or a plot, or feeding the data into a learning module to build performance models.
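The three goals above can be sketched as a small pipeline. The experiment records, field names, and values below are hypothetical stand-ins for what a repository query might return, not CTI's actual data model.

```python
import csv
import io
import statistics

# Hypothetical experiment records, as a repository query might return them.
experiments = [
    {"program": "mm",  "arch": "x86_64", "runtime": 1.20},
    {"program": "mm",  "arch": "x86_64", "runtime": 1.24},
    {"program": "mm",  "arch": "arm64",  "runtime": 2.10},
    {"program": "fft", "arch": "x86_64", "runtime": 0.80},
]

# Goal 1 -- select: experiments for one program on one architecture.
work_set = [e for e in experiments
            if e["program"] == "mm" and e["arch"] == "x86_64"]

# Goal 2 -- transform: summarize raw run-times as mean and variance.
runtimes = [e["runtime"] for e in work_set]
summary = {
    "mean": statistics.mean(runtimes),
    "variance": statistics.variance(runtimes),
}

# Goal 3 -- act: export the summary to CSV (here, into a string buffer).
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["mean", "variance"])
writer.writeheader()
writer.writerow(summary)
print(buf.getvalue())
```

The same select/transform/act shape applies whether the final action is a CSV export, a plot, or input to a learning module.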
Once the codelet base is populated, the tools are systematically tested on many different applications via the extracted codelets. Moreover, an application engineer can present a codelet and retrieve a similar one from the database; the retrieved codelet may carry optimization hints that previously benefited another application engineer.
CTI stands for Codelet Tuning Infrastructure. The purpose of CTI is three-fold:
1. Share data (codelets).
1. Automate menial tasks.
1. Apply data mining techniques.
Data is shared as files, or as extracted values stored in a database. Menial tasks such as scripting, creating back-ups, formatting the data, and running analyses via the integrated tools are automated. Data mining techniques are supported through CTI's internal data format and the existing viewers.
CTI is a tool for sharing both data and data processing. It is built around the idea of multiple people wishing to share data and data processing techniques. Though it is useful for a single user to store and keep track of experiments and results, it becomes truly useful when several users work on the same data-sets and processing techniques. Everything in CTI is either data or a plugin, the element that processes data.
CTI is built on a plugin system, making it flexible and configurable. Basic users can have CTI automate testing or benchmarking using the existing plugins, while expert users may write their own plugins to further modify or extend CTI's base behavior.
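A plugin mechanism of this kind can be illustrated with a minimal registry. All names here are hypothetical: CTI's actual plugin API is not shown in this document, so this is only a sketch of the general pattern of registering named processing elements and dispatching to them.

```python
# Minimal plugin registry: a sketch of the general pattern, not CTI's API.
PLUGINS = {}

def plugin(name):
    """Register a function under a plugin name."""
    def register(func):
        PLUGINS[name] = func
        return func
    return register

@plugin("loc")
def count_lines(codelet_source):
    # Trivial analysis plugin: count the source lines of a codelet.
    return len(codelet_source.splitlines())

def run_plugin(name, *args):
    # Dispatch by name, as a command front-end would.
    return PLUGINS[name](*args)

result = run_plugin("loc", "for (i = 0; i < n; i++)\n    s += a[i];\n")
```

Expert users extend the system by registering new functions; basic users only invoke existing plugin names.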
CTI supports many characterization tools, such as MAQAO, DECAN, Codelet Finder, and Likwid. The repository enables automatic compilation and analysis of the extracted codelets. For example, the MAQAO plugin analyzes codelets using the MAQAO tool developed by the Exascale Computing Research laboratory: it extracts low-level assembly features from codelet binaries and stores the feature vectors back into the repository database. Moreover, a "process" plugin can generically pass all the codelets through different external tools, or even a user's own scripts.
The repository allows fast indexing of codelets through its database indexing backend. It incorporates a variety of plugins enabling codelet extraction, navigation, and analysis. The Codelet Finder plugin, for example, interfaces the repository with the Codelet Finder tool developed by CAPS Enterprise.
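Indexed lookup of codelets by their metadata can be illustrated with an embedded SQLite database. The schema, column names, and rows below are hypothetical, not CTI's actual schema; the point is only that an index on the commonly filtered columns turns selection into a fast indexed scan.

```python
import sqlite3

# Hypothetical schema: one row per codelet, indexed on the columns
# users filter by most often (application and architecture).
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE codelets (
    name TEXT, application TEXT, arch TEXT, cycles INTEGER)""")
conn.execute("CREATE INDEX idx_app_arch ON codelets (application, arch)")
conn.executemany(
    "INSERT INTO codelets VALUES (?, ?, ?, ?)",
    [("loop_12", "mm",  "x86_64", 5400),
     ("loop_3",  "mm",  "arm64",  9100),
     ("kern_7",  "fft", "x86_64", 2100)])

# With the index, this selection avoids a full table scan.
rows = conn.execute(
    "SELECT name, cycles FROM codelets WHERE application = ? AND arch = ?",
    ("mm", "x86_64")).fetchall()
```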
For CTI, "data" is a general term: it equally covers the files imported into the tool, the content of those files, and particular inserted values such as the target machine, the date, and so on.
Once CTI is populated with data, the user's next step is to process it. Sometimes the user wishes to extract the data and handle it with a favorite tool; however, CTI also provides many pre-defined data processing elements, described in the tutorials.
CTI and CTS
Codelet Tuning Services (CTS) is a web interface that provides a more convenient way to interact with CTI. Though everything implemented in CTS can be done with the equivalent commands, the web interface makes for a more user-friendly experience.