Clarifying how ASE deals with disordered atoms when reading in .cif files
The current cif.py reads .cif files correctly, but .cif files can contain disorder. Disorder occurs in parts of the crystal where atoms can occupy several positions, such as long-chain aliphatic -(CH2)x-CH3 groups. The crystallographer often records each observed position of a disordered atom in the .cif file, so a disordered atom appears multiple times in the file, once per position.
When ASE currently reads a .cif file that has disordered atoms, it reads every atom, including every recorded position of each disordered atom. The resulting molecules look very odd, and the crystal contains more atoms than it should.
This update includes methods that use either:
- the _atom_site_disorder_group and _atom_site_disorder_assembly blocks given in the .cif file (if these blocks are present), or
- the _atom_site_label block given in the .cif file
This allows cif.py to give a non-disordered ase.Atoms object.
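To make the two sources of information concrete, here is a minimal sketch in plain Python. It is not the code in this update: the loop text is hand-written for illustration (it is not taken from RAPHID.cif), `parse_loop` is a hypothetical helper that only handles simple whitespace-separated loops, and the sketch assumes that `.` (or `?`) in `_atom_site_disorder_group` means no disorder is recorded for that atom.

```python
# Hand-written, hypothetical atom_site loop used only for illustration.
CIF_LOOP = """\
loop_
_atom_site_label
_atom_site_type_symbol
_atom_site_disorder_assembly
_atom_site_disorder_group
C1 C . .
C2 C . .
C3A C A 1
C3B C A 2
H1? H . .
"""


def parse_loop(text):
    """Split a simple whitespace-separated CIF loop into {tag: [values]}."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    tags = [ln for ln in lines if ln.startswith("_")]
    rows = [ln.split() for ln in lines if not ln.startswith(("_", "loop_"))]
    return {tag: [row[i] for row in rows] for i, tag in enumerate(tags)}


columns = parse_loop(CIF_LOOP)

# Route 1: atoms with a real value in _atom_site_disorder_group are disordered.
by_group = [
    i for i, group in enumerate(columns["_atom_site_disorder_group"])
    if group not in (".", "?")
]

# Route 2: atoms whose label starts or ends with '?' are flagged.
by_label = [
    i for i, label in enumerate(columns["_atom_site_label"])
    if label.startswith("?") or label.endswith("?")
]

print(by_group)  # [2, 3]
print(by_label)  # [4]
```

In this fragment C3A and C3B are two recorded positions of the same carbon, so a reader that ignores the disorder columns would return five atoms where the molecule has four.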
The user can tell the ase.io.read method how to read .cif files to give a non-disordered crystal using the disorder_groups variable:
system = read('RAPHID.cif', disorder_groups=disorder_groups)
Possible options for disorder_groups:
- disorder_groups=-1 (Default): Read in all atoms from the .cif file, including all disordered atoms. This is what the original cif.py already does.
- disorder_groups='remove_disorder': Remove all disordered atoms in the crystal. This is the recommended setting if you do not want disorder in your crystal when read by ASE.
- disorder_groups='tag_disorder': Tag all disordered atoms in the crystal. This is the recommended setting if you want to tag all the disordered atoms in your crystal when read by ASE.
- disorder_groups=-2: Read in all non-disordered atoms, as well as atoms that are associated with the lowest value of _atom_site_disorder_group in the crystal.
- disorder_groups = an integer greater than or equal to 0: Read in all non-disordered atoms, as well as atoms whose _atom_site_disorder_group equals disorder_groups.
- disorder_groups = list of integers greater than or equal to 0: Read in all non-disordered atoms, as well as atoms whose _atom_site_disorder_group value d is in the disorder_groups list.
- disorder_groups = list of str: Read in all non-disordered atoms, as well as atoms that are associated with the _atom_site_disorder_assembly and _atom_site_disorder_group values that you want to read in. For example: ['A1','B2','C1']
- disorder_groups='tag_?': Tag all atoms as 0 except for those with a ? at the start or end of the atom label, which are tagged as 1.
- disorder_groups='remove_?': Keep all atoms except for those with a ? at the start or end of the atom label, which are removed.
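The options above can be summarised as a small selection routine. The following is a sketch only, written in plain Python without ASE, and is not the implementation in this update: `select_atoms` is a hypothetical name, disorder groups are assumed to be already parsed into integers with `None` for non-disordered atoms, the tag value 1 for 'tag_disorder' is an assumption (only 'tag_?' specifies it), and the list-of-str option that combines assembly and group is left out.

```python
def select_atoms(labels, groups, disorder_groups=-1):
    """Return (kept_indices, tags_of_kept_atoms) for one set of atom sites.

    labels: _atom_site_label strings.
    groups: _atom_site_disorder_group values as int, or None if not disordered.
    """
    everything = list(range(len(labels)))
    ordered = [i for i in everything if groups[i] is None]
    disordered = [i for i in everything if groups[i] is not None]
    flagged = [i for i in everything
               if labels[i].startswith("?") or labels[i].endswith("?")]

    def tags_for(marked):
        marked = set(marked)
        return [1 if i in marked else 0 for i in everything]

    # String options: remove or tag atoms, by disorder group or by label.
    if disorder_groups == "remove_disorder":
        return ordered, [0] * len(ordered)
    if disorder_groups == "tag_disorder":
        return everything, tags_for(disordered)  # tag value 1 is assumed
    if disorder_groups == "remove_?":
        keep = [i for i in everything if i not in flagged]
        return keep, [0] * len(keep)
    if disorder_groups == "tag_?":
        return everything, tags_for(flagged)

    # Numeric options: decide which disorder groups to keep.
    if disorder_groups == -1:
        return everything, [0] * len(everything)
    if disorder_groups == -2:
        wanted = {min(groups[i] for i in disordered)} if disordered else set()
    elif isinstance(disorder_groups, int):
        if disorder_groups < 0:
            raise ValueError("integer disorder_groups must be -2, -1 or >= 0")
        wanted = {disorder_groups}
    else:
        wanted = set(disorder_groups)  # list of integers

    keep = [i for i in everything
            if groups[i] is None or groups[i] in wanted]
    return keep, [0] * len(keep)


# Hypothetical atom sites: C3A and C3B are two positions of one carbon.
labels = ["C1", "C2", "C3A", "C3B", "H1?"]
groups = [None, None, 1, 2, None]

print(select_atoms(labels, groups, "remove_disorder"))  # ([0, 1, 4], [0, 0, 0])
print(select_atoms(labels, groups, -2))  # ([0, 1, 2, 4], [0, 0, 0, 0])
print(select_atoms(labels, groups, "tag_?"))  # ([0, 1, 2, 3, 4], [0, 0, 0, 0, 1])
```

Returning the kept indices and the tags separately keeps the sketch independent of how the ase.Atoms object is built afterwards.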
Tests are provided in a zip file in the comments below.
New Methods:
- _get_atom_indices_in_disorder_group(self, disorder_groups) <-- (major new method)
- _tag_or_remove_atoms_with_question_mark_in_their_labels(self, operator) <-- (major new method)
- _does_cif_file_contain_atom_site_disorder_group_block(self)
- _get_entries_in_atom_site_disorder_group_block(self)
- _does_cif_file_contain_atom_site_disorder_assembly_block(self)
- _get_entries_in_atom_site_disorder_assembly(self)
Modified Methods:
- read_cif
- get_atoms
- get_unsymmetrized_structure
References:
- _atom_site_disorder_group: https://www.iucr.org/__data/iucr/cifdic_html/1/cif_core.dic/Iatom_site_disorder_group.html
- _atom_site_disorder_assembly: https://www.iucr.org/__data/iucr/cifdic_html/1/cif_core.dic/Iatom_site_disorder_assembly.html
- _atom_site_label: https://www.iucr.org/__data/iucr/cifdic_html/3/CORE_DIC/Iatom_site.label.html