castep.py 95.5 KB
Newer Older
1
# -*- coding: utf-8 -*-
jensj committed
2
from __future__ import print_function
3 4 5 6 7
"""This module defines an interface to CASTEP for
    use by the ASE (Webpage: http://wiki.fysik.dtu.dk/ase)

Authors:
    Max Hoffmann, max.hoffmann@ch.tum.de
8
    Joerg Meyer, joerg.meyer@ch.tum.de
9 10 11 12 13 14

Contributors:
    Juan M. Lorenzi, juan.lorenzi@tum.de
    Georg S. Michelitsch, georg.michelitsch@tch.tum.de
    Reinhard J. Maurer, reinhard.maurer@yale.edu
    Simon P. Rittmeyer, simon.rittmeyer@tum.de
15 16 17 18 19 20 21 22 23 24 25 26 27 28
"""

from copy import deepcopy
import difflib
import numpy as np
import os
import re
import shutil
import subprocess
import sys
import tempfile
import time

import ase
29
import ase.units as units
30 31 32
from ase.calculators.general import Calculator
from ase.constraints import FixCartesian
from ase.parallel import paropen
33
from ase.utils import basestring
34

Jens Jørgen Mortensen committed
35 36 37 38 39 40 41 42
__all__ = [
    'Castep',
    'CastepCell',
    'CastepParam',
    'create_castep_keywords']

contact_email = 'simon.rittmeyer@tum.de'

Jens Jørgen Mortensen committed
43
# A convenient table to avoid the previously used "eval"
44
_tf_table = {
45
    '': True,  # Just the keyword is equivalent to True
46
    'True': True,
Jens Jørgen Mortensen committed
47 48
    'False': False}

49

50 51 52 53 54 55 56 57 58 59 60 61 62 63
def _self_getter(getf):
    # A decorator that makes it so that if no 'atoms' argument is passed to a
    # getter function, self.atoms is used instead

    def decor_getf(self, atoms=None, *args, **kwargs):

        if atoms is None:
            atoms = self.atoms

        return getf(self, atoms, *args, **kwargs)

    return decor_getf


64
class Castep(Calculator):
65

66 67 68 69 70 71 72 73
    r"""

    CASTEP Interface Documentation

Introduction
============


74
CASTEP_ [1]_ W_ is a software package which uses density functional theory to
75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94
provide a good atomic-level description of all manner of materials and
molecules. CASTEP can give information about total energies, forces and
stresses on an atomic system, as well as calculating optimum geometries, band
structures, optical spectra, phonon spectra and much more. It can also perform
molecular dynamics simulations.

The CASTEP calculator interface class offers intuitive access to all CASTEP
settings and most results. All CASTEP specific settings are accessible via
attribute access (*i.e*. ``calc.param.keyword = ...`` or
``calc.cell.keyword = ...``)


Getting Started:
================

Set the environment variables appropriately for your system.

>>> export CASTEP_COMMAND=' ... '
>>> export CASTEP_PP_PATH=' ... '

95 96 97 98 99 100
Note: alternatively to CASTEP_PP_PATH one can set PSPOT_DIR
      as CASTEP consults this by default, i.e.

>>> export PSPOT_DIR=' ... '


101 102 103 104 105 106 107 108 109 110 111 112

Running the Calculator
======================

The default initialization command for the CASTEP calculator is

.. class:: Castep(directory='CASTEP', label='castep')

To do a minimal run one only needs to set atoms, this will use all
default settings of CASTEP, meaning LDA, singlepoint, etc..

With a generated castep_keywords.py in place all options are accessible
113
by inspection, *i.e.* tab-completion. This works best when using ``ipython``.
jensj committed
114 115 116
All options can be accessed via ``calc.param.<TAB>`` or ``calc.cell.<TAB>``
and documentation is printed with ``calc.param.<keyword> ?`` or
``calc.cell.<keyword> ?``. All options can also be set directly
117
using ``calc.keyword = ...`` or ``calc.KEYWORD = ...`` or even
118
``ialc.KeYwOrD`` or directly as named arguments in the call to the constructor
jensj committed
119
(*e.g.* ``Castep(task='GeometryOptimization')``).
120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167

All options that go into the ``.param`` file are held in an ``CastepParam``
instance, while all options that go into the ``.cell`` file and don't belong
to the atoms object are held in an ``CastepCell`` instance. Each instance can
be created individually and can be added to calculators by attribute
assignment, *i.e.* ``calc.param = param`` or ``calc.cell = cell``.

All internal variables of the calculator start with an underscore (_).
All cell attributes that clearly belong into the atoms object are blocked.
Setting ``calc.atoms_attribute`` (*e.g.* ``= positions``) is sent directly to
the atoms object.


Arguments:
==========

=========================  ====================================================
Keyword                    Description
=========================  ====================================================
``directory``              The relative path where all input and output files
                           will be placed. If this does not exist, it will be
                           created.  Existing directories will be moved to
                           directory-TIMESTAMP unless self._rename_existing_dir
                           is set to false.

``label``                  The prefix of .param, .cell, .castep, etc. files.

=========================  ====================================================


Additional Settings
===================

=========================  ====================================================
Internal Setting           Description
=========================  ====================================================
``_castep_command``        (``=castep``): the actual shell command used to
                           call CASTEP.

``_check_checkfile``       (``=True``): this makes write_param() only
                           write a continue or reuse statement if the
                           addressed .check or .castep_bin file exists in the
                           directory.

``_copy_pspots``           (``=False``): if set to True the calculator will
                           actually copy the needed pseudo-potential (\*.usp)
                           file, usually it will only create symlinks.

168 169 170 171 172 173 174 175 176
``_link_pspots``           (``=True``): if set to True the calculator will
                           actually will create symlinks to the needed pseudo
                           potentials. Set this option (and ``_copy_pspots``)
                           to False if you rather want to access your pseudo
                           potentials using the PSPOT_DIR environment variable
                           that is read by CASTEP.
                           *Note:* This option has no effect if ``copy_pspots``
                           is True..

177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233
``_export_settings``       (``=True``): if this is set to
                           True, all calculator internal settings shown here
                           will be included in the .param in a comment line (#)
                           and can be read again by merge_param. merge_param
                           can be forced to ignore this directive using the
                           optional argument ``ignore_internal_keys=True``.

``_force_write``           (``=True``): this controls wether the \*cell and
                           \*param will be overwritten.

``_prepare_input_only``    (``=False``): If set to True, the calculator will
                           create \*cell und \*param file but not
                           start the calculation itself.
                           If this is used to prepare jobs locally
                           and run on a remote cluster it is recommended
                           to set ``_copy_pspots = True``.

``_castep_pp_path``        (``='.'``) : the place where the calculator
                           will look for pseudo-potential files.

``_rename_existing_dir``   (``=True``) : when using a new instance
                           of the calculator, this will move directories out of
                           the way that would be overwritten otherwise,
                           appending a date string.

``_set_atoms``             (``=False``) : setting this to True will overwrite
                           any atoms object previously attached to the
                           calculator when reading a \.castep file.  By de-
                           fault, the read() function will only create a new
                           atoms object if none has been attached and other-
                           wise try to assign forces etc. based on the atom's
                           positions.  ``_set_atoms=True`` could be necessary
                           if one uses CASTEP's internal geometry optimization
                           (``calc.param.task='GeometryOptimization'``)
                           because then the positions get out of sync.
                           *Warning*: this option is generally not recommended
                           unless one knows one really needs it. There should
                           never be any need, if CASTEP is used as a
                           single-point calculator.

``_track_output``          (``=False``) : if set to true, the interface
                           will append a number to the label on all input
                           and output files, where n is the number of calls
                           to this instance. *Warning*: this setting may con-
                           sume a lot more disk space because of the additio-
                           nal \*check files.

``_try_reuse``             (``=_track_output``) : when setting this, the in-
                           terface will try to fetch the reuse file from the
                           previous run even if _track_output is True. By de-
                           fault it is equal to _track_output, but may be
                           overridden.

                           Since this behavior may not always be desirable for
                           single-point calculations. Regular reuse for *e.g.*
                           a geometry-optimization can be achieved by setting
                           ``calc.param.reuse = True``.
234 235 236
``_pedantic``              (``=False``) if set to true, the calculator will
                           inform about settings probably wasting a lot of CPU
                           time or causing numerical inconsistencies.
237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280

=========================  ====================================================

Special features:
=================


``.dryrun_ok()``
  Runs ``castep_command seed -dryrun`` in a temporary directory return True if
  all variables initialized ok. This is a fast way to catch errors in the
  input. Afterwards _kpoints_used is set.

``.merge_param()``
  Takes a filename or filehandler of a .param file or CastepParam instance and
  merges it into the current calculator instance, overwriting current settings

``.keyword.clear()``
  Can be used on any option like ``calc.param.keyword.clear()`` or
  ``calc.cell.keyword.clear()`` to return to the CASTEP default.

``.initialize()``
  Creates all needed input in the ``_directory``. This can then copied to and
  run in a place without ASE or even python.

``.set_pspot('<library>')``
  This automatically sets the pseudo-potential for all present species to
  *<Species>_<library>.usp*. Make sure that ``_castep_pp_path`` is set
  correctly.

``print(calc)``
  Prints a short summary of the calculator settings and atoms.

``ase.io.castep.read_seed('path-to/seed')``
  Given you have a combination of seed.{param,cell,castep} this will return an
  atoms object with the last ionic positions in the .castep file and all other
  settings parsed from the .cell and .param file. If no .castep file is found
  the positions are taken from the .cell file. The output directory will be
  set to the same directory, only the label is preceded by 'copy_of\_'  to
  avoid overwriting.


Notes/Issues:
==============

281 282
* Currently *only* the FixAtoms *constraint* is fully supported for reading and
  writing. There is some experimental support for the FixCartesian constraint.
283 284 285 286 287 288 289 290

* There is no support for the CASTEP *unit system*. Units of eV and Angstrom
  are used throughout. In particular when converting total energies from
  different calculators, one should check that the same CODATA_ version is
  used for constants and conversion factors, respectively.

.. _CASTEP: http://www.castep.org/

291 292
.. _W: http://en.wikipedia.org/wiki/CASTEP

293 294 295 296
.. _CODATA: http://physics.nist.gov/cuu/Constants/index.html

.. [1] S. J. Clark, M. D. Segall, C. J. Pickard, P. J. Hasnip, M. J. Probert,
       K. Refson, M. C. Payne Zeitschrift für Kristallographie 220(5-6)
297 298 299
       pp.567- 570 (2005) PDF_.

.. _PDF: http://goo.gl/wW50m
300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316


End CASTEP Interface Documentation
    """

    # Class attributes !
    # keys set through atoms object
    atoms_keys = [
        'charge',
        'ionic_constraints',
        'lattice_abs',
        'lattice_cart',
        'positions_abs',
        'positions_abs_final',
        'positions_abs_intermediate',
        'positions_frac',
        'positions_frac_final',
317
        'positions_frac_intermediate']
318 319

    atoms_obj_keys = [
320 321 322 323 324 325 326 327
        'dipole',
        'energy_free',
        'energy_zero',
        'fermi',
        'forces',
        'nbands',
        'positions',
        'stress']
328

329
    internal_keys = [
330 331 332 333 334 335 336 337 338 339 340 341 342 343 344
        '_castep_command',
        '_check_checkfile',
        '_copy_pspots',
        '_link_pspots',
        '_directory',
        '_export_settings',
        '_force_write',
        '_label',
        '_prepare_input_only',
        '_castep_pp_path',
        '_rename_existing_dir',
        '_set_atoms',
        '_track_output',
        '_try_reuse',
        '_pedantic']
345

346
    def __init__(self, directory='CASTEP', label='castep',
347 348 349
                 castep_command=None, check_castep_version=False,
                 castep_pp_path=None,
                 **kwargs):
350 351 352 353 354 355

        self.__name__ = 'Castep'

        # initialize the ase.calculators.general calculator
        Calculator.__init__(self)

356 357
        from ase.io.castep import write_cell
        self._write_cell = write_cell
358

359 360 361
        castep_keywords = import_castep_keywords(castep_command)
        self.param = CastepParam(castep_keywords)
        self.cell = CastepCell(castep_keywords)
362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389

        ###################################
        # Calculator state variables      #
        ###################################
        self._calls = 0
        self._castep_version = castep_keywords.castep_version

        # collects warning from .castep files
        self._warnings = []
        # collects content from *.err file
        self._error = None
        # warnings raised by the ASE interface
        self._interface_warnings = []

        # store to check if recalculation is necessary
        self._old_atoms = None
        self._old_cell = None
        self._old_param = None

        ###################################
        # Internal keys                   #
        # Allow to tweak the behavior     #
        ###################################
        self._opt = {}
        self._castep_command = get_castep_command(castep_command)
        self._castep_pp_path = get_castep_pp_path(castep_pp_path)
        self._check_checkfile = True
        self._copy_pspots = False
390
        self._link_pspots = True
391 392 393 394 395 396 397 398 399 400
        self._directory = os.path.abspath(directory)
        self._export_settings = True
        self._force_write = True
        self._label = label
        self._prepare_input_only = False
        self._rename_existing_dir = True
        self._set_atoms = False
        self._track_output = False
        self._try_reuse = False

401 402 403
        # turn off the pedantic user warnings
        self._pedantic = False

404 405 406 407 408 409 410 411 412 413 414 415
        # will be set on during runtime
        self._seed = None

        ###################################
        # (Physical) result variables     #
        ###################################
        self.atoms = None
        # initialize result variables
        self._forces = None
        self._energy_total = None
        self._energy_free = None
        self._energy_0K = None
416 417 418 419 420 421 422 423 424 425

        # dispersion corrections
        self._dispcorr_energy_total = None
        self._dispcorr_energy_free = None
        self._dispcorr_energy_0K = None

        # spins and hirshfeld volumes
        self._spins = None
        self._hirsh_volrat = None

426 427 428 429 430 431 432 433 434 435 436 437 438 439
        self._number_of_cell_constraints = None
        self._output_verbosity = None
        self._stress = None
        self._unit_cell = None
        self._kpoints = None

        # pointers to other files used at runtime
        self._check_file = None
        self._castep_bin_file = None

        # check version of CASTEP options module against current one
        if check_castep_version:
            local_castep_version = get_castep_version(self._castep_command)
            if not hasattr(self, '_castep_version'):
440
                print('No castep version found')
441 442
                return
            if not local_castep_version == self._castep_version:
443 444 445 446
                print(('The options module was generated from version %s\n'
                       'while your are currently using CASTEP version %s') %
                      (self._castep_version,
                       get_castep_version(self._castep_command)))
447 448 449
                self._castep_version = local_castep_version

        # processes optional arguments in kw style
450
        for keyword, value in kwargs.items():
451 452 453 454 455 456 457 458 459
            # first fetch special keywords issued by ASE CLI
            if keyword == 'kpts':
                self.__setattr__('kpoint_mp_grid', '%s %s %s' % tuple(value))
            elif keyword == 'xc':
                self.__setattr__('xc_functional', str(value))
            elif keyword == 'ecut':
                self.__setattr__('cut_off_energy', str(value))
            else:  # the general case
                self.__setattr__(keyword, value)
460 461 462 463 464 465 466 467 468

    def _castep_find_last_record(self, castep_file):
        """Checks wether a given castep file has a regular
        ending message following the last banner message. If this
        is the case, the line number of the last banner is message
        is return, otherwise False.

        returns (record_start, record_end, end_found, last_record_complete)
        """
469
        if isinstance(castep_file, basestring):
470 471 472 473 474 475 476 477 478 479 480 481 482
            castep_file = paropen(castep_file, 'r')
            file_opened = True
        else:
            file_opened = False
        record_starts = []
        while True:
            line = castep_file.readline()
            if 'Welcome' in line and 'CASTEP' in line:
                record_starts = [castep_file.tell()] + record_starts
            if not line:
                break

        if record_starts == []:
483
            print('Could not find CASTEP label in result file: %s'
484
                  % castep_file)
485
            print('Are you sure this is a .castep file?')
486 487 488 489 490 491 492 493 494 495 496 497 498 499 500
            return

        # search for regular end of file
        end_found = False
        # start to search from record beginning from the back
        # and see if
        record_end = -1
        for record_nr, record_start in enumerate(record_starts):
            castep_file.seek(record_start)
            while True:
                line = castep_file.readline()
                if not line:
                    break
                if 'warn' in line.lower():
                    self._warnings.append(line)
501 502 503 504 505

                # HOTFIX: This string appears twice from CASTEP 7 on and thus
                # prevents reading forces. So, better go for another keyword
                # to indicate the regular end of a run.
                # 'Initialization time' seems to do the job.
506 507
                # if 'Writing analysis data to' in line:
                # if 'Writing model to' in line:
508
                if 'Initialisation time' in line:
509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529
                    end_found = True
                    record_end = castep_file.tell()
                    break

            if end_found:
                break

        if file_opened:
            castep_file.close()

        if end_found:
            # record_nr == 0 corresponds to the last record here
            if record_nr == 0:
                return (record_start, record_end, True, True)
            else:
                return (record_start, record_end, True, False)
        else:
            return (0, record_end, False, False)

    def read(self, castep_file=None):
        """Read a castep file into the current instance."""
530 531 532

        _close = True

533 534 535
        if castep_file is None:
            if self._castep_file:
                castep_file = self._castep_file
536
                out = paropen(castep_file, 'r')
537 538 539 540 541 542
            else:
                print('No CASTEP file specified')
                return
            if not os.path.exists(castep_file):
                print('No CASTEP file found')

543
        elif isinstance(castep_file, basestring):
544 545
            out = paropen(castep_file, 'r')

546 547 548
        else:
            # in this case we assume that we have a fileobj already, but check
            # for attributes in order to avoid extended EAFP blocks.
549
            out = castep_file
550 551 552 553 554 555 556 557 558 559

            # look before you leap...
            attributes = ['name',
                          'seek',
                          'close',
                          'readline',
                          'tell']

            for attr in attributes:
                if not hasattr(out, attr):
Jens Jørgen Mortensen committed
560 561
                    raise TypeError(
                        '"castep_file" is neither str nor valid fileobj')
562

563
            castep_file = out.name
Jens Jørgen Mortensen committed
564
            _close = False
565

566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581
        if self._seed is None:
            self._seed = os.path.splitext(os.path.basename(castep_file))[0]

        err_file = '%s.0001.err' % self._seed
        if os.path.exists(err_file):
            err_file = paropen(err_file)
            self._error = err_file.read()
            err_file.close()
            # we return right-away because it might
            # just be here from a previous run
        # look for last result, if several CASTEP
        # run are appended

        record_start, record_end, end_found, _\
            = self._castep_find_last_record(out)
        if not end_found:
582
            print('No regular end found in %s file' % castep_file)
583
            print(self._error)
584 585
            if _close:
                out.close()
586 587 588 589 590 591 592 593
            return
            # we return here, because the file has no a regular end

        # now iterate over last CASTEP output in file to extract information
        # could be generalized as well to extract trajectory from file
        # holding several outputs
        n_cell_const = 0
        forces = []
594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611

        # HOTFIX:
        # we have to initialize the _stress variable as a zero array
        # otherwise the calculator crashes upon pickling trajectories
        # Alternative would be to raise a NotImplementedError() which
        # is also kind of not true, since we can extract stresses if
        # the user configures CASTEP to print them in the outfile
        # stress = []
        stress = np.zeros([3, 3])
        hirsh_volrat = []

        # Two flags to check whether spin-polarized or not, and whether
        # Hirshfeld volumes are calculated
        spin_polarized = False
        calculate_hirshfeld = False

        positions_frac_list = []

612 613
        out.seek(record_start)
        while True:
614 615
            # TODO: add a switch if we have a geometry optimization: record
            # atoms objects for intermediate steps.
616
            try:
617 618 619 620
                # in case we need to rewind back one line, we memorize the bit
                # position of this line in the file.
                # --> see symops problem below
                _line_start = out.tell()
621 622 623
                line = out.readline()
                if not line or out.tell() > record_end:
                    break
624
                elif 'output verbosity' in line:
625 626 627
                    iprint = int(line.split()[-1][1])
                    if int(iprint) != 1:
                        self.param.iprint = iprint
628 629 630 631 632
                elif 'treating system as spin-polarized' in line:
                    spin_polarized = True
                elif 'treating system as non-spin-polarized' in line:
                    spin_polarized = False
                elif 'Unit Cell' in line:
633 634 635 636 637 638 639 640
                    lattice_real = []
                    lattice_reci = []
                    while True:
                        line = out.readline()
                        fields = line.split()
                        if len(fields) == 6:
                            break
                    for i in range(3):
641 642
                        lattice_real.append([float(f) for f in fields[0:3]])
                        lattice_reci.append([float(f) for f in fields[3:7]])
643 644
                        line = out.readline()
                        fields = line.split()
645
                elif 'Cell Contents' in line:
646 647
                    while True:
                        line = out.readline()
648
                        if 'Total number of ions in cell' in line:
649
                            n_atoms = int(line.split()[7])
650
                        if 'Total number of species in cell' in line:
651
                            int(line.split()[7])
652 653 654
                        fields = line.split()
                        if len(fields) == 0:
                            break
655
                elif 'Fractional coordinates of atoms' in line:
656
                    species = []
657
                    custom_species = None  # A CASTEP special thing
658 659 660 661 662 663 664 665
                    positions_frac = []
                    # positions_cart = []
                    while True:
                        line = out.readline()
                        fields = line.split()
                        if len(fields) == 7:
                            break
                    for n in range(n_atoms):
666 667 668 669 670 671 672 673
                        spec_custom = fields[1].split(':', 1)
                        elem = spec_custom[0]
                        if len(spec_custom) > 1 and custom_species is None:
                            # Add it to the custom info!
                            custom_species = list(species)
                        species.append(elem)
                        if custom_species is not None:
                            custom_species.append(fields[1])
674
                        positions_frac.append([float(s) for s in fields[3:6]])
675 676
                        line = out.readline()
                        fields = line.split()
677 678
                    positions_frac_list.append(positions_frac)
                elif 'Files used for pseudopotentials' in line:
679 680 681 682 683 684 685 686 687 688
                    while True:
                        line = out.readline()
                        if 'Pseudopotential generated on-the-fly' in line:
                            continue
                        fields = line.split()
                        if (len(fields) >= 2):
                            elem, pp_file = fields
                            self.cell.species_pot = (elem, pp_file)
                        else:
                            break
689
                elif 'k-Points For BZ Sampling' in line:
690 691 692 693
                    # TODO: generalize for non-Monkhorst Pack case
                    # (i.e. kpoint lists) -
                    # kpoints_offset cannot be read this way and
                    # is hence always set to None
694 695 696 697
                    while True:
                        line = out.readline()
                        if not line.strip():
                            break
698
                        if 'MP grid size for SCF calculation' in line:
699 700 701
                            # kpoints =  ' '.join(line.split()[-3:])
                            # self.kpoints_mp_grid = kpoints
                            # self.kpoints_mp_offset = '0. 0. 0.'
702 703
                            # not set here anymore because otherwise
                            # two calculator objects go out of sync
704
                            # after each calculation triggering unnecessary
705 706
                            # recalculation
                            break
707
                elif 'Symmetry and Constraints' in line:
708 709 710 711 712
                    # this is a bit of a hack, but otherwise the read_symops
                    # would need to re-read the entire file. --> just rewind
                    # back by one line, so the read_symops routine can find the
                    # start of this block.
                    out.seek(_line_start)
713
                    self.read_symops(castep_castep=out)
714
                elif 'Number of cell constraints' in line:
715
                    n_cell_const = int(line.split()[4])
716
                elif 'Final energy' in line:
717
                    self._energy_total = float(line.split()[-2])
718
                elif 'Final free energy' in line:
719
                    self._energy_free = float(line.split()[-2])
720
                elif 'NB est. 0K energy' in line:
721
                    self._energy_0K = float(line.split()[-2])
722 723 724 725 726 727 728 729 730 731

                # Add support for dispersion correction
                # filtering due to SEDC is done in get_potential_energy
                elif 'Dispersion corrected final energy' in line:
                    self._dispcorr_energy_total = float(line.split()[-2])
                elif 'Dispersion corrected final free energy' in line:
                    self._dispcorr_energy_free = float(line.split()[-2])
                elif 'dispersion corrected est. 0K energy' in line:
                    self._dispcorr_energy_0K = float(line.split()[-2])

732 733 734
                # remember to remove constraint labels in force components
                # (lacking a space behind the actual floating point number in
                # the CASTEP output)
735
                elif '******************** Forces *********************'\
736
                     in line or\
737
                     '************** Symmetrised Forces ***************'\
738 739 740 741
                     in line or\
                     '************** Constrained Symmetrised Forces ****'\
                     '**********'\
                     in line or\
742 743 744
                     '******************** Constrained Forces **********'\
                     '**********'\
                     in line or\
745 746
                     '******************* Unconstrained Forces *********'\
                     '**********'\
747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762
                     in line:
                    fix = []
                    fix_cart = []
                    forces = []
                    while True:
                        line = out.readline()
                        fields = line.split()
                        if len(fields) == 7:
                            break
                    for n in range(n_atoms):
                        consd = np.array([0, 0, 0])
                        fxyz = [0, 0, 0]
                        for (i, force_component) in enumerate(fields[-4:-1]):
                            if force_component.count("(cons'd)") > 0:
                                consd[i] = 1
                            fxyz[i] = float(force_component.replace(
763
                                "(cons'd)", ''))
764 765 766 767 768 769 770
                        if consd.all():
                            fix.append(n)
                        elif consd.any():
                            fix_cart.append(FixCartesian(n, consd))
                        forces.append(fxyz)
                        line = out.readline()
                        fields = line.split()
771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795

                # add support for Hirshfeld analysis
                elif 'Hirshfeld / free atomic volume :' in line:
                    # if we are here, then params must be able to cope with
                    # Hirshfeld flag (if castep_keywords.py matches employed
                    # castep version)
                    calculate_hirshfeld = True
                    hirsh_volrat = []
                    while True:
                        line = out.readline()
                        fields = line.split()
                        if len(fields) == 1:
                            break
                    for n in range(n_atoms):
                        hirsh_atom = float(fields[0])
                        hirsh_volrat.append(hirsh_atom)
                        while True:
                            line = out.readline()
                            if 'Hirshfeld / free atomic volume :' in line or\
                               'Hirshfeld Analysis' in line:
                                break
                        line = out.readline()
                        fields = line.split()

                elif '***************** Stress Tensor *****************'\
796
                     in line or\
797
                     '*********** Symmetrised Stress Tensor ***********'\
798 799 800 801 802 803 804 805
                     in line:
                    stress = []
                    while True:
                        line = out.readline()
                        fields = line.split()
                        if len(fields) == 6:
                            break
                    for n in range(3):
806
                        stress.append([float(s) for s in fields[2:5]])
807 808
                        line = out.readline()
                        fields = line.split()
809 810
                elif ('BFGS: starting iteration' in line or
                      'BFGS: improving iteration' in line):
811 812 813 814 815 816
                    if n_cell_const < 6:
                        lattice_real = []
                        lattice_reci = []
                    species = []
                    positions_frac = []
                    forces = []
817 818 819

                    # HOTFIX:
                    # Same reason for the stress initialization as before
820
                    # stress = []
821 822 823
                    stress = np.zeros([3, 3])

                elif 'BFGS: Final Configuration:' in line:
824 825 826
                    break
                elif 'warn' in line.lower():
                    self._warnings.append(line)
jensj committed
827 828
            except Exception as exception:
                print(line, end=' ')
829
                print('|-> line triggered exception: ' + str(exception))
830
                raise
831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861

        # get the spins in a separate run over the file as we
        # do not want to break the BFGS-break construct
        # probably one can implement it in a more convenient
        # way, but this constructon does the job.

        if spin_polarized:
            spins = []
            out.seek(record_start)
            while True:
                try:
                    line = out.readline()
                    if not line or out.tell() > record_end:
                        break
                    elif 'Atomic Populations' in line:
                        # skip the separating line
                        line = out.readline()
                        # this is the headline
                        line = out.readline()
                        if 'Spin' in line:
                            # skip the next separator line
                            line = out.readline()
                            while True:
                                line = out.readline()
                                fields = line.split()
                                if len(fields) == 1:
                                    break
                                spins.append(float(fields[-1]))
                        break

                except Exception as exception:
Jens Jørgen Mortensen committed
862 863
                    print(line + '|-> line triggered exception: ' +
                          str(exception))
864 865 866 867 868
                    raise
        else:
            # set to zero spin if non-spin polarized calculation
            spins = np.zeros(len(positions_frac))

869 870
        if _close:
            out.close()
871 872 873

        positions_frac_atoms = np.array(positions_frac)
        forces_atoms = np.array(forces)
874 875 876 877 878 879
        spins_atoms = np.array(spins)

        if calculate_hirshfeld:
            hirsh_atoms = np.array(hirsh_volrat)
        else:
            hirsh_atoms = np.zeros_like(spins)
880 881 882

        if self.atoms and not self._set_atoms:
            # compensate for internal reordering of atoms by CASTEP
883 884
            # using the fact that the order is kept within each species

Jens Jørgen Mortensen committed
885
            # positions_frac_ase = self.atoms.get_scaled_positions(wrap=False)
886
            atoms_assigned = [False] * len(self.atoms)
887

Jens Jørgen Mortensen committed
888
            # positions_frac_castep_init = np.array(positions_frac_list[0])
889
            positions_frac_castep = np.array(positions_frac_list[-1])
890

Jens Jørgen Mortensen committed
891
            # species_castep = list(species)
892
            forces_castep = np.array(forces)
893 894 895 896 897 898
            hirsh_castep = np.array(hirsh_volrat)
            spins_castep = np.array(spins)

            # go through the atoms position list and replace
            # with the corresponding one from the
            # castep file corresponding atomic number
899 900
            for iase in range(n_atoms):
                for icastep in range(n_atoms):
901
                    if (species[icastep] == self.atoms[iase].symbol and
902
                            not atoms_assigned[icastep]):
Jens Jørgen Mortensen committed
903 904
                        positions_frac_atoms[iase] = \
                            positions_frac_castep[icastep]
905 906
                        forces_atoms[iase] = np.array(forces_castep[icastep])
                        if iprint > 1 and calculate_hirshfeld:
907
                            hirsh_atoms[iase] = np.array(hirsh_castep[icastep])
908
                        if spin_polarized:
909 910
                            # reordering not necessary in case all spins == 0
                            spins_atoms[iase] = np.array(spins_castep[icastep])
911 912
                        atoms_assigned[icastep] = True
                        break
913 914

            if not all(atoms_assigned):
915
                not_assigned = [i for (i, assigned)
916 917 918
                                in zip(range(len(atoms_assigned)),
                                       atoms_assigned) if not assigned]
                print('%s atoms not assigned.' % atoms_assigned.count(False))
919 920
                print('DEBUGINFO: The following atoms where not assigned: %s' %
                      not_assigned)
921 922
            else:
                self.atoms.set_scaled_positions(positions_frac_atoms)
923 924 925 926 927 928 929 930

        else:
            # If no atoms, object has been previously defined
            # we define it here and set the Castep() instance as calculator.
            # This covers the case that we simply want to open a .castep file.

            # The next time around we will have an atoms object, since
            # set_calculator also set atoms in the calculator.
931 932 933 934
            if self.atoms:
                constraints = self.atoms.constraints
            else:
                constraints = []
935 936 937 938 939 940
            atoms = ase.atoms.Atoms(species,
                                    cell=lattice_real,
                                    constraint=constraints,
                                    pbc=True,
                                    scaled_positions=positions_frac,
                                    )
941 942 943 944
            if custom_species is not None:
                atoms.new_array('castep_custom_species',
                                np.array(custom_species))

945 946 947 948 949
            if self.param.spin_polarized:
                # only set magnetic moments if this was a spin polarized
                # calculation
                atoms.set_initial_magnetic_moments(magmoms=spins_atoms)

950 951 952
            atoms.set_calculator(self)

        self._forces = forces_atoms
Jens Jørgen Mortensen committed
953 954
        # stress in .castep file is given in GPa:
        self._stress = np.array(stress) * units.GPa
955 956
        self._hirsh_volrat = hirsh_atoms
        self._spins = spins_atoms
957

958
        if self._warnings:
959
            print('WARNING: %s contains warnings' % castep_file)
960 961 962 963 964 965
            for warning in self._warnings:
                print(warning)
        # reset
        self._warnings = []

    def read_symops(self, castep_castep=None):
966 967
        # TODO: check that this is really backwards compatible
        # with previous routine with this name...
968
        """Read all symmetry operations used from a .castep file."""
969

970
        if castep_castep is None:
971
            castep_castep = self._seed + '.castep'
972

973
        if isinstance(castep_castep, basestring):
974
            if not os.path.isfile(castep_castep):
975
                print('Warning: CASTEP file %s not found!' % castep_castep)
976
            f = paropen(castep_castep, 'a')
Jens Jørgen Mortensen committed
977
            _close = True
978
        else:
979 980 981 982 983 984 985 986 987 988 989
            # in this case we assume that we have a fileobj already, but check
            # for attributes in order to avoid extended EAFP blocks.
            f = castep_castep

            # look before you leap...
            attributes = ['name',
                          'readline',
                          'close']

            for attr in attributes:
                if not hasattr(f, attr):
Jens Jørgen Mortensen committed
990 991
                    raise TypeError('read_castep_castep_symops: castep_castep '
                                    'is not of type str nor valid fileobj!')
992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006

            castep_castep = f.name
            _close = False

        while True:
            line = f.readline()
            if not line:
                return
            if 'output verbosity' in line:
                iprint = line.split()[-1][1]
                # filter out the default
                if int(iprint) != 1:
                    self.param.iprint = iprint
            if 'Symmetry and Constraints' in line:
                break
1007 1008

        if self.param.iprint is None or self.param.iprint < 2:
1009 1010 1011
            self._interface_warnings.append(
                'Warning: No symmetry'
                'operations could be read from %s (iprint < 2).' % f.name)
1012 1013 1014 1015 1016 1017
            return

        while True:
            line = f.readline()
            if not line:
                break
1018
            if 'Number of symmetry operations' in line:
1019 1020 1021 1022 1023 1024 1025 1026
                nsym = int(line.split()[5])
                # print "nsym = %d" % nsym
                # information about symmetry related atoms currently not read
                symmetry_operations = []
                for _ in range(nsym):
                    rotation = []
                    displacement = []
                    while True:
1027
                        if 'rotation' in f.readline():
1028 1029 1030
                            break
                    for _ in range(3):
                        line = f.readline()
1031
                        rotation.append([float(r) for r in line.split()[1:4]])
1032
                    while True:
1033
                        if 'displacement' in f.readline():
1034 1035
                            break
                    line = f.readline()
1036
                    displacement = [float(d) for d in line.split()[1:4]]
1037
                    symop = {'rotation': rotation,
1038
                             'displacement': displacement}
1039 1040
                    self.symmetry_ops = symop
                self.symmetry = symmetry_operations
1041
                print('Symmetry operations successfully read from %s' % f.name)
jensj committed
1042
                print(self.cell.symmetry_ops)
1043
                break
1044 1045 1046

        # only close if we opened the file in this routine
        if _close:
1047 1048
            f.close()

1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059
    def get_hirsh_volrat(self):
        """
        Return the Hirshfeld volumes.
        """
        return self._hirsh_volrat

    def get_spins(self):
        """
        Return the spins from a plane-wave Mulliken analysis.
        """
        return self._spins
1060 1061 1062 1063 1064 1065 1066

    def set_label(self, label):
        """The label is part of each seed, which in turn is a prefix
        in each CASTEP related file.
        """
        self._label = label

1067
    def set_pspot(self, pspot, elems=None,
1068 1069 1070
                  notelems=None,
                  clear=True,
                  suffix='usp'):
1071
        """Quickly set all pseudo-potentials: Usually CASTEP psp are named
1072
        like <Elem>_<pspot>.<suffix> so this function function only expects
1073 1074 1075 1076
        the <LibraryName>. It then clears any previous pseudopotential
        settings apply the one with <LibraryName> for each element in the
        atoms object. The optional elems and notelems arguments can be used
        to exclusively assign to some species, or to exclude with notelemens.
1077 1078 1079 1080 1081 1082 1083 1084 1085 1086

        Parameters ::

            - elems (None) : set only these elements
            - notelems (None): do not set the elements
            - clear (True): clear previous settings
            - suffix (usp): PP file suffix



1087 1088 1089 1090 1091 1092 1093 1094 1095
        """

        if clear and not elems and not notelems:
            self.cell.species_pot.clear()
        for elem in set(self.atoms.get_chemical_symbols()):
            if elems is not None and elem not in elems:
                continue
            if notelems is not None and elem in notelems:
                continue
1096
            self.cell.species_pot = (elem, '%s_%s.%s' % (elem, pspot, suffix))
1097

1098
    @_self_getter
1099 1100 1101 1102 1103
    def get_forces(self, atoms):
        """Run CASTEP calculation if needed and return forces."""
        self.update(atoms)
        return np.array(self._forces)

1104
    @_self_getter
1105 1106 1107 1108 1109
    def get_total_energy(self, atoms):
        """Run CASTEP calculation if needed and return total energy."""
        self.update(atoms)
        return self._energy_total

1110
    @_self_getter
1111 1112 1113 1114 1115 1116
    def get_free_energy(self, atoms):
        """Run CASTEP calculation if needed and return free energy.
           Only defined with smearing."""
        self.update(atoms)
        return self._energy_free

1117
    @_self_getter
1118 1119 1120 1121 1122 1123
    def get_0K_energy(self, atoms):
        """Run CASTEP calculation if needed and return 0K energy.
           Only defined with smearing."""
        self.update(atoms)
        return self._energy_0K

1124
    @_self_getter
1125
    def get_potential_energy(self, atoms, force_consistent=False):
1126
        # here for compatibility with ase/calculators/general.py
1127
        # but accessing only _name variables
1128 1129 1130
        """Return the total potential energy."""
        self.update(atoms)
        if force_consistent:
1131 1132
            # Assumption: If no dispersion correction is applied, then the
            # respective value will default to None as initialized.
1133
            if self._dispcorr_energy_free is not None:
1134 1135 1136
                return self._dispcorr_energy_free
            else:
                return self._energy_free
1137 1138
        else:
            if self._energy_0K is not None:
1139
                if self._dispcorr_energy_0K is not None:
1140 1141 1142
                    return self._dispcorr_energy_0K
                else:
                    return self._energy_0K
1143
            else:
1144
                if self._dispcorr_energy_total is not None:
1145 1146 1147
                    return self._dispcorr_energy_total
                else:
                    return self._energy_total
1148

1149
    @_self_getter
1150 1151 1152 1153 1154
    def get_stress(self, atoms):
        """Return the stress."""
        self.update(atoms)
        return self._stress

1155
    @_self_getter
1156 1157 1158 1159 1160
    def get_unit_cell(self, atoms):
        """Return the unit cell."""
        self.update(atoms)
        return self._unit_cell

1161
    @_self_getter
1162 1163 1164 1165 1166
    def get_kpoints(self, atoms):
        """Return the kpoints."""
        self.update(atoms)
        return self._kpoints

1167
    @_self_getter
1168 1169 1170 1171
    def get_number_cell_constraints(self, atoms):
        """Return the number of cell constraints."""
        self.update(atoms)
        return self._number_of_cell_constraints
1172
    
1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189
    def set_atoms(self, atoms):
        """Sets the atoms for the calculator and vice versa."""
        atoms.pbc = [True, True, True]
        self.__dict__['atoms'] = atoms.copy()
        self.atoms._calc = self

    def update(self, atoms):
        """Checks if atoms object or calculator changed and
        runs calculation if so.
        """
        if self.calculation_required(atoms):
            self.calculate(atoms)

    def calculation_required(self, atoms, _=None):
        """Checks wether anything changed in the atoms object or CASTEP
        settings since the last calculation using this instance.
        """
Jens Jørgen Mortensen committed
1190
        # SPR: what happens with the atoms parameter here? Why don't we use it?
1191 1192
        # from all that I can tell we need to compare against atoms instead of
        # self.atoms
Jens Jørgen Mortensen committed
1193
        # if not self.atoms == self._old_atoms:
1194
        if not atoms == self._old_atoms:
1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210
            return True
        if self._old_param is None or self._old_cell is None:
            return True
        if not self.param._options == self._old_param._options:
            return True
        if not self.cell._options == self._old_cell._options:
            return True
        return False

    def calculate(self, atoms):
        """Write all necessary input file and call CASTEP."""
        self.prepare_input_files(atoms, force_write=self._force_write)
        if not self._prepare_input_only:
            self.run()
            self.read()

1211
            # we need to push the old state here!
Jens Jørgen Mortensen committed
1212 1213
            # although run() pushes it, read() may change the atoms object
            # again.
1214 1215 1216
            # yet, the old state is supposed to be the one AFTER read()
            self.push_oldstate()

1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244
    def push_oldstate(self):
        """This function pushes the current state of the (CASTEP) Atoms object
        onto the previous state. Or in other words after calling this function,
        calculation_required will return False and enquiry functions just
        report the current value, e.g. get_forces(), get_potential_energy().
        """
        # make a snapshot of all current input
        # to be able to test if recalculation
        # is necessary
        self._old_atoms = self.atoms.copy()
        self._old_param = deepcopy(self.param)
        self._old_cell = deepcopy(self.cell)

    def initialize(self, *args, **kwargs):
        """Just an alias for prepar_input_files to comply with standard
        function names in ASE.
        """
        self.prepare_input_files(*args, **kwargs)

    def prepare_input_files(self, atoms=None, force_write=None):
        """Only writes the input .cell and .param files and return
        This can be useful if one quickly needs to prepare input files
        for a cluster where no python or ASE is available. One can than
        upload the file manually and read out the results using
        Castep().read().
        """

        if self.param.reuse.value is None:
1245 1246 1247
            if self._pedantic:
                print('You have not set e.g. calc.param.reuse = True')
                print('Reusing a previous calculation may save CPU time!\n')
1248 1249 1250 1251
                print(
                    'The interface will make sure by default, a .check exists')
                print(
                    'file before adding this statement to the .param file.\n')
1252
        if self.param.num_dump_cycles.value is None:
1253 1254 1255 1256
            if self._pedantic:
                print('You have not set e.g. calc.param.num_dump_cycles = 0.')
                print('This can save you a lot of disk space. One only needs')
                print('*wvfn* if electronic convergence is not achieved.\n')
1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268
        from ase.io.castep import write_param

        if atoms is None:
            atoms = self.atoms
        else:
            self.atoms = atoms

        if force_write is None:
            force_write = self._force_write

        # if we have new instance of the calculator,
        # move existing results out of the way, first
1269
        if (os.path.isdir(self._directory) and
1270 1271
                self._calls == 0 and
                self._rename_existing_dir):
1272 1273 1274 1275 1276
            if os.listdir(self._directory) == []:
                os.rmdir(self._directory)
            else:
                # rename appending creation date of the directory
                ctime = time.localtime(os.lstat(self._directory).st_ctime)
1277 1278 1279
                os.rename(self._directory, '%s.bak-%s' %
                          (self._directory,
                           time.strftime('%Y%m%d-%H%M%S', ctime)))
1280 1281 1282

        # create work directory
        if not os.path.isdir(self._directory):
1283 1284 1285
            os.makedirs(self._directory, 0o775)

        # we do this every time, not only upon first call
Jens Jørgen Mortensen committed
1286
        # if self._calls == 0:
1287
        self._fetch_pspots()
1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308

        cwd = os.getcwd()
        os.chdir(self._directory)

        # if _try_reuse is requested and this
        # is not the first run, we try to find
        # the .check file from the previous run
        # this is only necessary if _track_output
        # is set to true
        if self._try_reuse and self._calls > 0:
            if os.path.exists(self._check_file):
                self.param.reuse = self._check_file
            elif os.path.exists(self._castep_bin_file):
                self.param.reuse = self._castep_bin_file
        self._seed = self._build_castep_seed()
        self._check_file = '%s.check' % self._seed
        self._castep_bin_file = '%s.castep_bin' % self._seed
        self._castep_file = os.path.abspath('%s.castep' % self._seed)

        # write out the input file
        self._write_cell('%s.cell' % self._seed,
1309 1310
                         self.atoms, castep_cell=self.cell,
                         force_write=force_write)
1311

1312 1313 1314 1315
        if self._export_settings:
            interface_options = self._opt
        else:
            interface_options = None
1316
        write_param('%s.param' % self._seed, self.param,
1317
                    check_checkfile=self._check_checkfile,
1318 1319 1320 1321 1322 1323 1324 1325 1326
                    force_write=force_write,
                    interface_options=interface_options,)
        os.chdir(cwd)

    def _build_castep_seed(self):
        """Abstracts to construction of the final castep <seed>
        with and without _tracking_output.
        """
        if self._track_output:
1327
            return '%s-%06d' % (self._label, self._calls)
1328
        else:
1329
            return '%s' % (self._label)
1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340

    def run(self):
        """Simply call castep. If the first .err file
        contains text, this will be printed to the screen.
        """
        # change to target directory
        cwd = os.getcwd()
        os.chdir(self._directory)
        self._calls += 1

        # run castep itself
1341 1342
        stdout, stderr = shell_stdouterr('%s %s' % (self._castep_command,
                                                    self._seed))
jensj committed
1343
        if stdout:
1344
            print('castep call stdout:\n%s' % stdout)
jensj committed
1345
        if stderr:
1346
            print('castep call stderr:\n%s' % stderr)
1347 1348 1349

        # shouldn't it be called after read()???
        # self.push_oldstate()
1350 1351 1352 1353 1354 1355 1356 1357 1358

        # check for non-empty error files
        err_file = '%s.0001.err' % self._seed
        if os.path.exists(err_file):
            err_file = open(err_file)
            self._error = err_file.read()
            err_file.close()
        os.chdir(cwd)
        if self._error:
1359
            raise RuntimeError(self._error)
1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371

    def __repr__(self):
        """Returns generic, fast to capture representation of
        CASTEP settings along with atoms object.
        """
        expr = ''
        expr += '-----------------Atoms--------------------\n'
        if self.atoms is not None:
            expr += str('%20s\n' % self.atoms)
        else:
            expr += 'None\n'

1372
        expr += '-----------------Param keywords-----------\n'
1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406
        expr += str(self.param)
        expr += '-----------------Cell keywords------------\n'
        expr += str(self.cell)
        expr += '-----------------Internal keys------------\n'
        for key in self.internal_keys:
            expr += '%20s : %s\n' % (key, self._opt[key])

        return expr

    def __getattr__(self, attr):
        """___getattr___ gets overloaded to reroute the internal keys
        and to be able to easily store them in in the param so that
        they can be read in again in subsequent calls.
        """
        if attr in self.internal_keys:
            return self._opt[attr]
        if attr in ['__repr__', '__str__']:
            raise AttributeError
        elif attr not in self.__dict__:
            raise AttributeError
        else:
            return self.__dict__[attr]

    def __setattr__(self, attr, value):
        """We overload the settattr method to make value assignment
        as pythonic as possible. Internal values all start with _.
        Value assigment is case insensitive!
        """

        if attr.startswith('_'):
            # internal variables all start with _
            # let's check first if they are close but not identical
            # to one of the switches, that the user accesses directly
            similars = difflib.get_close_matches(attr, self.internal_keys,
1407
                                                 cutoff=0.9)
1408
            if attr not in self.internal_keys and similars:
1409 1410
                print('Warning: You probably tried one of: %s' % similars)
                print('but typed %s' % attr)
1411 1412 1413 1414 1415
            if attr in self.internal_keys:
                self._opt[attr] = value
                if attr == '_track_output':
                    if value:
                        self._try_reuse = True
1416 1417 1418 1419 1420 1421 1422
                        if self._pedantic:
                            print('You switched _track_output on. This will')
                            print('consume a lot of disk-space. The interface')
                            print('also switched _try_reuse on, which will')
                            print('try to find the last check file. Set')
                            print('_try_reuse = False, if you need')
                            print('really separate calculations')
1423 1424
                    elif '_try_reuse' in self._opt and self._try_reuse:
                        self._try_reuse = False
1425 1426
                        if self._pedantic:
                            print('_try_reuse is set to False, too')
1427 1428 1429
            else:
                self.__dict__[attr] = value
            return
1430
        elif attr in ['atoms', 'cell', 'param']:
1431 1432
            if value is not None:
                if attr == 'atoms' and not isinstance(value, ase.atoms.Atoms):
1433 1434
                    raise TypeError(
                        '%s is not an instance of ase.atoms.Atoms.' % value)
1435
                elif attr == 'cell' and not isinstance(value, CastepCell):
1436 1437
                    raise TypeError('%s is not an instance of CastepCell.' %
                                    value)
1438
                elif attr == 'param' and not isinstance(value, CastepParam):
1439 1440
                    raise TypeError('%s is not an instance of CastepParam.' %
                                    value)
1441 1442 1443 1444 1445 1446 1447 1448
            # These 3 are accepted right-away, no matter what
            self.__dict__[attr] = value
            return
        elif attr in self.atoms_obj_keys:
            # keywords which clearly belong to the atoms object are
            # rerouted to go there
            self.atoms.__dict__[attr] = value
            return
1449
        elif attr in self.atoms_keys:
1450 1451
            # CASTEP keywords that should go into the atoms object
            # itself are blocked
1452 1453
            print('Ignoring setings of "%s", since this has to be set\n'
                  'through the atoms object' % attr)
1454 1455 1456
            return

        attr = attr.lower()
1457 1458
        if attr not in (list(self.cell._options.keys()) +
                        list(self.param._options.keys())):
1459 1460 1461
            # what is left now should be meant to be a castep keyword
            # so we first check if it defined, and if not offer some error
            # correction
1462 1463
            similars = difflib.get_close_matches(
                attr,
1464 1465
                self.cell._options.keys() + self.param._options.keys())
            if similars:
1466 1467
                raise UserWarning('Option "%s" not known! You mean "%s"?' %
                                  (attr, similars[0]))
1468 1469 1470 1471 1472 1473 1474 1475 1476 1477
            else:
                raise UserWarning('Option "%s" is not known!' % attr)

        # here we know it must go into one of the component param or cell
        # so we first determine which one
        if attr in self.param._options.keys():
            comp = 'param'
        elif attr in self.cell._options.keys():
            comp = 'cell'
        else:
1478 1479
            raise UserWarning('Programming error: could not attach '
                              'the keyword to an input file')
1480 1481 1482 1483 1484 1485 1486

        self.__dict__[comp].__setattr__(attr, value)

    def merge_param(self, param, overwrite=True, ignore_internal_keys=False):
        """Parse a param file and merge it into the current parameters."""
        INT_TOKEN = 'ASE_INTERFACE'
        if isinstance(param, CastepParam):
1487
            for key, option in param._options.items():
1488
                if option.value is not None:
1489 1490
                    self.param.__setattr__(key, option.value)
            return
1491

1492
        elif isinstance(param, basestring):
1493
            param_file = open(param, 'r')
1494 1495
            _close = True

1496
        else:
1497 1498 1499 1500 1501 1502 1503 1504 1505 1506 1507
            # in this case we assume that we have a fileobj already, but check
            # for attributes in order to avoid extended EAFP blocks.
            param_file = param

            # look before you leap...
            attributes = ['name',
                          'close'
                          'readlines']

            for attr in attributes:
                if not hasattr(param_file, attr):
Jens Jørgen Mortensen committed
1508 1509
                    raise TypeError('"param" is neither CastepParam nor str '
                                    'nor valid fileobj')
1510 1511 1512 1513

            param = param_file.name
            _close = False

1514 1515 1516
        # ok, we need to load the file beforehand into memory, seems like the
        # easiest way to do the BLOCK handling.
        lines = param_file.readlines()
Jens Jørgen Mortensen committed
1517
        i = 0
1518 1519 1520 1521
        while i < len(lines):
            line = lines[i].strip()

            # note that i will point to the next line from now on
Jens Jørgen Mortensen committed
1522
            i += 1
1523

1524 1525 1526 1527 1528 1529 1530
            # remove comments
            for comment_char in ['#', ';', '!']:
                if comment_char in line:
                    if INT_TOKEN in line:
                        # This block allows to read internal settings from
                        # a *param file
                        iline = line[line.index(INT_TOKEN) + len(INT_TOKEN):]
1531
                        if (iline.split()[0] in self.internal_keys and
1532
                                not ignore_internal_keys):
1533 1534 1535
                            value = ' '.join(iline.split()[1:])
                            if value in _tf_table:
                                self._opt[iline.split()[0]] = _tf_table[value]
1536 1537 1538
                            else:
                                self._opt[iline.split()[0]] = value
                    line = line[:line.index(comment_char)]
1539

1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550
            # if nothing remains
            if not line.strip():
                continue

            if line == 'reuse':
                self.param.reuse.value = 'default'
                continue
            if line == 'continuation':
                self.param.continuation.value = 'default'
                continue

1551 1552 1553 1554 1555 1556 1557 1558 1559 1560
            # here comes the handling of the devel block (the only block so far
            # I know to be in the param file)
            if line.upper() == '%BLOCK DEVEL_CODE':
                key = 'devel_code'
                value = ''
                while True:
                    line = lines[i].strip()
                    i += 1
                    if line.upper() == '%ENDBLOCK DEVEL_CODE':
                        break
1561
                    value += '\n{0}'.format(line)
1562 1563
                value = value.strip()

Jens Jørgen Mortensen committed
1564
                if (not overwrite and
1565
                        getattr(self.param, key).value is not None):
1566 1567 1568 1569 1570
                    continue

                self.__setattr__(key, value)
                continue

1571
            try:
1572 1573
                # we go for the regex split here
                key, value = [s.strip() for s in re.split(r'[:=]+', line)]
1574
            except:
1575
                print('Could not parse line %s of your param file: %s'
1576
                      % (i, line))
1577
                raise UserWarning('Seems to me malformed')
1578 1579 1580 1581 1582

            if not overwrite and getattr(self.param, key).value is not None:
                continue
            self.__setattr__(key, value)

1583
        if _close:
1584
            param_file.close()
1585

1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599
    def dryrun_ok(self, dryrun_flag='-dryrun'):
        """Starts a CASTEP run with the -dryrun flag [default]
        in a temporary and check wether all variables are initialized
        correctly. This is recommended for every bigger simulation.
        """
        from ase.io.castep import write_param

        temp_dir = tempfile.mkdtemp()
        curdir = os.getcwd()
        self._fetch_pspots(temp_dir)
        os.chdir(temp_dir)
        self._fetch_pspots(temp_dir)
        seed = 'dryrun'

1600 1601
        self._write_cell('%s.cell' % seed, self.atoms,
                         castep_cell=self.cell)
1602 1603
        # This part needs to be modified now that we rely on the new formats.py
        # interface
Jens Jørgen Mortensen committed
1604
        if not os.path.isfile('%s.cell' % seed):
1605
            print('%s.cell not written - aborting dryrun' % seed)
1606 1607 1608
            return
        write_param('%s.param' % seed, self.param, )

1609
        stdout, stderr = shell_stdouterr(('%s %s %s' % (self._castep_command,
1610 1611
                                                        seed,
                                                        dryrun_flag)))
1612

1613 1614 1615 1616
        if stdout:
            print(stdout)
        if stderr:
            print(stderr)
1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670
        result_file = open('%s.castep' % seed)

        txt = result_file.read()
        ok_string = r'.*DRYRUN finished.*No problems found with input files.*'
        match = re.match(ok_string, txt, re.DOTALL)

        try:
            self._kpoints_used = int(
                re.search(
                    r'Number of kpoints used = *([0-9]+)', txt).group(1))
        except:
            print('Couldn\'t fetch number of kpoints from dryrun CASTEP file')

        err_file = '%s.0001.err' % seed
        if match is None and os.path.exists(err_file):
            err_file = open(err_file)
            self._error = err_file.read()
            err_file.close()

        result_file.close()
        os.chdir(curdir)
        shutil.rmtree(temp_dir)

        # re.match return None is the string does not match
        return match is not None

    # this could go into the Atoms() class at some point...
    def _get_number_in_species(self, at, atoms=None):
        """Return the number of the atoms within the set of it own
        species. If you are an ASE commiter: why not move this into
        ase.atoms.Atoms ?"""
        if atoms is None:
            atoms = self.atoms
        numbers = atoms.get_atomic_numbers()
        n = numbers[at]
        nis = numbers.tolist()[:at + 1].count(n)
        return nis

    def _get_absolute_number(self, species, nic, atoms=None):
        """This is the inverse function to _get_number in species."""
        if atoms is None:
            atoms = self.atoms
        ch = atoms.get_chemical_symbols()
        ch.reverse()
        total_nr = 0
        assert nic > 0, 'Number in species needs to be 1 or larger'
        while True:
            if ch.pop() == species:
                if nic == 1:
                    return total_nr
                nic -= 1
            total_nr += 1

    def _fetch_pspots(self, directory=None):
1671
        """Put all specified pseudo-potentials into the working directory.
1672
        """
1673 1674
        # should be a '==' right? Otherwise setting _castep_pp_path is not
        # honored.
Jens Jørgen Mortensen committed
1675
        if (not os.environ.get('PSPOT_DIR', None) and
1676
                self._castep_pp_path == os.path.abspath('.')):
Jens Jørgen Mortensen committed
1677 1678 1679 1680 1681 1682 1683 1684 1685 1686
            # By default CASTEP consults the environment variable
            # PSPOT_DIR. If this contains a list of colon separated
            # directories it will check those directories for pseudo-
            # potential files if not in the current directory.
            # Thus if PSPOT_DIR is set there is nothing left to do.
            # If however PSPOT_DIR was been accidentally set
            # (e.g. with regards to a different program)
            # setting CASTEP_PP_PATH to an explicit value will
            # still be honored.
            return
1687

1688 1689 1690
        if directory is None:
            directory = self._directory
        if not os.path.isdir(self._castep_pp_path):
1691
            print('PSPs directory %s not found' % self._castep_pp_path)
1692 1693 1694 1695 1696 1697 1698 1699
        pspots = {}
        if self.cell.species_pot.value is not None:
            for line in self.cell.species_pot.value.split('\n'):
                line = line.split()
                if line:
                    pspots[line[0]] = line[1]
        for species in self.atoms.get_chemical_symbols():
            if not pspots or species not in pspots.keys():
1700
                if self._pedantic:
Jens Jørgen Mortensen committed
1701 1702
                    print('Warning: you have no PP specified for %s.' %
                          species)