Update ase.geometry.rdf.get_rdf to accept chemical symbols as elements and deprecate ase.geometry.analysis.Analysis.get_rdf due to its bugs
Summary
While ase.geometry.analysis.Analysis.get_rdf is a wrapper of ase.geometry.rdf.get_rdf with additional handling of elements, this handling causes a bug for partial RDF calculations. Specifically, elements=['C', 'H'] in ase.geometry.Analysis.get_rdf computes not a partial RDF of C-H but the full RDF of the partial system with C and H, thus including also C-C and H-H contributions.
To compute the partial RDF correctly, we need to give atomic numbers elements=[6, 1], which has however another issue described below.
ase.geometry.rdf.get_rdf calculates partial RDFs correctly (in ASE 3.27.0, after !3921 (merged)). This has however a limitation that elements can so far take only atomic numbers like [6, 1] and not ['C', 'H'], despite its name elements.
The present MR addresses these two issues by:
- Updating
ase.geometry.rdf.get_rdfso that it can take a list of chemical symbols forelements - Deprecating / removing
ase.geometry.analysis.Analysis.get_rdfdue to its bugs
Thank you so much @drFaustroll for reporting the issue!
@askhl May I shortly ask your opinion, particulalry about the deprecation and the removal of ase.geometry.Analysis.get_rdf?
Since the changes are rather straightforward, let me set it as a milestone for ASE 3.28.0.
Details
The bugs of ase.geometry.analysis.Analysis.get_rdf comes from here:
for element in elements:
for idx in self._get_symbol_idxs(image, element):
tmp_image.append(image[idx])
and then
rdf = get_rdf(tmp_image, rmax, nbins, elements=el, no_dists=(not return_dists),
volume=volume)
where el still remains None.
This means that, when we set e.g. elements=['C', 'H'], this method considers a partial system containing only C and H, but then it considers the full RDF for this partial system, which contains contributions also from C-C and H-H.
The partial RDF can be obtained with elements=[6, 1]. However, this also has a potential issue. Here
if all(isinstance(x, int) for x in elements):
if len(elements) == 2:
# use builtin get_rdf mask
el = elements
tmp_image = image
else:
# create dummy image
tmp_image = Atoms(
cell=image.get_cell(), pbc=image.get_pbc())
for idx in elements:
tmp_image.append(image[idx])
This considers a partial RDF if the length of elements is equal to 2; in this case elements are interpreted as an atomic number pair. On the other hand, if the length of elements differs from 2, this is recognized as indices of atoms, and the full RDF for the partial system specified by the indices is computed.
It is, however, in principle possible that we want to give two items for elements as indices of atoms. In the present design, this is misinterpeted as two atomic numbers.
So, there are multiple bugs or issues in ase.geometry.analysis.Analysis.get_rdf.
Also, since the original implementation (!881 (merged)), this part has not been tested.
Also, we can replace most special usages for ase.geometry.analysis.Analysis.get_rdf with ase.geometry.rdf.get_rdf with ASE-standardized manners.
For example, to select one configuration from a list of Atoms (images), instead of
analysis = Analysis(images)
rdf = analysis.get_rdf(rmax, nbins, imageIdx=2)
we can do
rdf = get_rdf(images[2], rmax, nbins)
Also, to compute a full RDF for a partial system, we can do
rdf = get_rdf(atoms[atoms.symbols.search({'C', 'H'})], rmax, nbins)
Having these in mind, I think we can reasonably deprecate ase.geometry.analysis.Analysis.get_rdf and soon remove it (rather quickly, maybe ASE 3.29.0).