
Update to cif.py: Clarifying how disordered atoms are read from cif files by ASE

The current cif.py file reads .cif files correctly, but .cif files can contain disorder. Disorder occurs in parts of the crystal where atoms can occupy several alternative positions, such as long-chain aliphatic (CH2)x-CH3 groups. The crystallographer often records each observed position in the .cif file, so a disordered atom appears multiple times in the file, once for each position in which it was recorded.

When ASE currently reads a .cif file that contains disordered atoms, it reads every atom, including each of the alternative positions recorded for a disordered atom. The resulting crystal contains more atoms than it should, and its molecules look very odd.

This update includes methods that use the _atom_site_disorder_group and _atom_site_disorder_assembly blocks given in the .cif file (if these blocks are present in the file) to allow cif.py to return a non-disordered ASE Atoms object.
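To illustrate what these two columns look like, here is a minimal sketch that pulls them out of a small atom_site loop. The CIF fragment, the atom labels and the helper parse_simple_loop are made up for illustration and are not part of this merge request or of ASE. The helper only handles a single whitespace-separated loop and is not a general CIF parser.

```python
# Made-up atom_site loop: C1 is ordered ('.'), while C2A and C2B are two
# alternative positions of the same disordered atom in assembly 'A'.
CIF_FRAGMENT = """\
loop_
_atom_site_label
_atom_site_type_symbol
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_disorder_assembly
_atom_site_disorder_group
C1  C 0.1000 0.2000 0.3000 . .
C2A C 0.1500 0.2500 0.3500 A 1
C2B C 0.1700 0.2300 0.3600 A 2
"""


def parse_simple_loop(text):
    """Hypothetical helper: map each tag of one simple loop to its column."""
    tags, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "loop_":
            continue
        if line.startswith("_"):
            tags.append(line)         # a column header
        else:
            rows.append(line.split())  # one atom per row
    return {tag: [row[i] for row in rows] for i, tag in enumerate(tags)}


columns = parse_simple_loop(CIF_FRAGMENT)
print(columns["_atom_site_disorder_assembly"])
print(columns["_atom_site_disorder_group"])
```

In this fragment, reading every row gives three carbon atoms, although the crystal only contains two: C1 and one of the two alternatives C2A or C2B.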

The user can tell the ase.io.read method how to read a .cif file so that it gives a non-disordered crystal, using the disorder_groups argument:

system = read('RAPHID.cif',disorder_groups=disorder_groups)

Possible options for disorder_groups:

  • disorder_groups=-1 (default): Read in all atoms from the .cif file. This is what the original cif.py file already does.
  • disorder_groups=-2: Read in all non-disordered atoms, as well as the atoms associated with the lowest value of _atom_site_disorder_group in the crystal.
  • disorder_groups = an integer greater than or equal to 0: Read in all non-disordered atoms, as well as the atoms whose _atom_site_disorder_group equals disorder_groups.
  • disorder_groups = a list of integers greater than or equal to 0: Read in all non-disordered atoms, as well as the atoms whose _atom_site_disorder_group is any of the values given in the disorder_groups list.
  • disorder_groups = a list of str: Read in all non-disordered atoms, as well as the atoms associated with the _atom_site_disorder_assembly and _atom_site_disorder_group values that you want to read in. For example: ['A1','B2','C1']
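The options above can be sketched as a plain selection function. This is only an illustration of the described behaviour, not the code in this merge request: the function name select_atom_indices and the sample data are hypothetical. The sketch also assumes that non-disordered atoms carry the CIF placeholder '.' and that each string such as 'A1' is the assembly label followed by the group number.

```python
NOT_DISORDERED = "."  # assumed marker for atoms that are not disordered


def select_atom_indices(groups, assemblies, disorder_groups=-1):
    """Hypothetical helper: return the indices of the atoms to keep.

    groups and assemblies hold the per-atom _atom_site_disorder_group and
    _atom_site_disorder_assembly values as strings.
    """
    n = len(groups)
    if disorder_groups == -1:
        return list(range(n))  # keep every atom, disordered or not

    ordered = [g == NOT_DISORDERED for g in groups]

    if disorder_groups == -2:
        # Keep only the lowest disorder group found in the crystal.
        present = [int(g) for g in groups if g != NOT_DISORDERED]
        wanted = {min(present)} if present else set()
        return [i for i in range(n) if ordered[i] or int(groups[i]) in wanted]

    if isinstance(disorder_groups, int):
        disorder_groups = [disorder_groups]  # single group -> list of one

    if all(isinstance(d, int) for d in disorder_groups):
        wanted = set(disorder_groups)
        return [i for i in range(n) if ordered[i] or int(groups[i]) in wanted]

    # List of str: match on assembly label + group number, e.g. 'A1'.
    wanted = set(disorder_groups)
    return [i for i in range(n)
            if ordered[i] or assemblies[i] + groups[i] in wanted]


# Made-up example: one ordered atom and two assemblies (A, B),
# each with two alternative groups (1, 2).
groups = [".", "1", "2", "1", "2"]
assemblies = [".", "A", "A", "B", "B"]

print(select_atom_indices(groups, assemblies, -1))
print(select_atom_indices(groups, assemblies, -2))
print(select_atom_indices(groups, assemblies, 2))
print(select_atom_indices(groups, assemblies, ["A1", "B2"]))
```

With the list-of-str form, a different group can be chosen for each assembly, which the integer forms cannot express.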

Tests provided in a zip file below:

ase_cif_test.zip


New Methods:

  • _get_atom_indices_in_disorder_group <-- (major new method)
  • _does_cif_file_contain_atom_site_disorder_group_block(self)
  • _get_entries_in_atom_site_disorder_group_block(self)
  • _does_cif_file_contain_atom_site_disorder_assembly_block(self)
  • _get_entries_in_atom_site_disorder_assembly(self)

Modified Methods:

  • read_cif
  • get_atoms
  • get_unsymmetrized_structure

References:

  • _atom_site_disorder_group: https://www.iucr.org/__data/iucr/cifdic_html/1/cif_core.dic/Iatom_site_disorder_group.html
  • _atom_site_disorder_assembly: https://www.iucr.org/__data/iucr/cifdic_html/1/cif_core.dic/Iatom_site_disorder_assembly.html

Edited by Geoffrey Weal
