CMake script to download force fields at build time

Eliane Briand requested to merge eb-forcefield-download into main

Adds option -DGMX_DOWNLOAD_FORCEFIELDS=YES to download additional force fields at build time.

The associated CMake script downloads the file found at the URL given by GMX_DOWNLOAD_FORCEFIELDS_LIST_URL (user-settable; it is a placeholder in this MR and should probably default to something on https://www.gromacs.org/). This file lists the force field names, the URLs where they can be found, and the hashes of their archives:

DATE 2024-02-23
FF@ 9f31db8502852bc8fcf36497c27db14f1b6509b33701e7cafc2187840b7d2f09  charmm36 charmm36-jul2022.ff http://mackerell.umaryland.edu/download.php?filename=CHARMM_ff_params_files/charmm36-jul2022.ff.tgz @ENDFF
FF@ 203ac30b4b4b02d16dbdff788402d0746c8802a405be482f8f6c2692126ecfa0 amber99sb-star-ildnp amber99sb-star-ildnp.ff https://ftp.gromacs.org/contrib/forcefields/amber99sb-star-ildnp.tgz @ENDFF

The format is slightly weird in order to work around limitations of CMake regexes; improvements are very welcome. Each entry consists of: FF@, the SHA256 of the tarball, the name of the force field (used as the directory name in share/top), the name of the top-level directory in the tarball (in case it differs from the force field name), the URL, and @ENDFF.
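To make the field order concrete, here is a minimal Python sketch that parses the example list file above. It is not the CMake code in this MR: the names `ForceFieldEntry`, `ENTRY_RE`, and `parse_list_file` are hypothetical, and the sketch assumes that fields are whitespace-separated and contain no spaces, as in the two example entries.

```python
import re
from typing import NamedTuple


class ForceFieldEntry(NamedTuple):
    """One FF@ ... @ENDFF entry (hypothetical helper type)."""
    sha256: str   # SHA256 of the tarball
    name: str     # force field name, used as directory name in share/top
    top_dir: str  # top-level directory inside the tarball
    url: str      # download URL


# Assumption: fields are separated by one or more whitespace characters.
ENTRY_RE = re.compile(
    r"^FF@\s+([0-9a-f]{64})\s+(\S+)\s+(\S+)\s+(\S+)\s+@ENDFF$"
)


def parse_list_file(text):
    """Return (date, entries); lines that match neither pattern are ignored."""
    date = None
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("DATE "):
            date = line.split(None, 1)[1]
            continue
        match = ENTRY_RE.match(line)
        if match:
            entries.append(ForceFieldEntry(*match.groups()))
    return date, entries


EXAMPLE = """\
DATE 2024-02-23
FF@ 9f31db8502852bc8fcf36497c27db14f1b6509b33701e7cafc2187840b7d2f09  charmm36 charmm36-jul2022.ff http://mackerell.umaryland.edu/download.php?filename=CHARMM_ff_params_files/charmm36-jul2022.ff.tgz @ENDFF
FF@ 203ac30b4b4b02d16dbdff788402d0746c8802a405be482f8f6c2692126ecfa0 amber99sb-star-ildnp amber99sb-star-ildnp.ff https://ftp.gromacs.org/contrib/forcefields/amber99sb-star-ildnp.tgz @ENDFF
"""

if __name__ == "__main__":
    date, entries = parse_list_file(EXAMPLE)
    print(date)
    for entry in entries:
        print(entry.name, entry.top_dir, entry.url)
```

Anchoring each entry between FF@ and @ENDFF means a truncated or malformed line simply fails to match instead of being half-parsed.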

The force field archives are downloaded, their hashes checked, and their contents unpacked during the configuration phase. During the install phase, they are placed in share/top, along with the above list file. Those force fields are then available directly in pdb2gmx. Failure to download the list file is a fatal error (as it likely indicates an internet access problem), while the download of an individual force field fails gracefully (the GROMACS build should not break because of a temporary disruption at the FF provider's web server).
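The check-then-skip behaviour described above can be sketched in Python as follows. This is an illustration only, not the CMake implementation: `sha256_matches` and `select_verified` are hypothetical names, downloads are represented as in-memory bytes (with `None` standing for a failed download), and treating a hash mismatch as a skipped force field rather than a fatal error is an assumption.

```python
import hashlib


def sha256_matches(data: bytes, expected_hex: str) -> bool:
    """Compare the SHA256 of the downloaded bytes with the listed hash."""
    return hashlib.sha256(data).hexdigest() == expected_hex.lower()


def select_verified(downloads, expected):
    """Split force fields into accepted and skipped.

    downloads: name -> bytes, or None when the download failed
    expected:  name -> SHA256 hex digest from the list file
    Returns (accepted_names, [(skipped_name, reason), ...]).
    """
    accepted, skipped = [], []
    for name, expected_hex in expected.items():
        data = downloads.get(name)
        if data is None:
            # Graceful failure: record the problem and carry on.
            skipped.append((name, "download failed"))
        elif not sha256_matches(data, expected_hex):
            # Assumption: a mismatch skips this force field only.
            skipped.append((name, "hash mismatch"))
        else:
            accepted.append(name)
    return accepted, skipped
```

Only the names in the accepted list would go on to be unpacked and installed; the skipped list gives the build something to report as a warning.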

Concerns

  • Link rot: with a level of indirection, the main worry is keeping the URL of the list file (somewhere on https://www.gromacs.org/) alive, and occasionally checking the URLs of the force fields themselves. In the worst case, we can remove force fields from the list file, or even serve an empty list file for out-of-support versions of GROMACS.
  • Updateability: The list file is downloaded, not kept in the archive, so field upgrades are possible (whenever GROMACS is rebuilt, at least)
  • Traceability: the list file is installed in share/top so that the version and URL from where the files were downloaded is traceable.
  • Security/download hash: the tarball hashes are stored in the list file and compared against the downloaded archives during the build.
  • Statistics for force field developers: with the list file scheme, the user hits the force field developer's website directly, so downloads should be fairly transparent to the FF devs.
  • Availability in gmx tools: any force field installed in share/top is available without additional code in pdb2gmx
  • Head node without internet: one can either set up a local URL for GMX_DOWNLOAD_FORCEFIELDS_LIST_URL, or install the force field manually in share/top.

Remaining issues mentioned in #4998

  • CMake version and SSL: maybe a CMake version check for this? Or download over HTTP if possible. There is a timeout on the file download that might help it fail gracefully.
  • The FF developers have to ensure that the first line of forcefield.doc is informative, as it is displayed in pdb2gmx without any other indication. For instance, the first line of the current charmm36m port is just "CHARMM general forcefield", which is sub-optimal.
  • It would be good if they could guarantee some level of stability for the download URL, so that the list file does not have to be updated too often.

Related #4998
