Commit 3c3dd362 authored by Dante Sblendorio

Initial commit

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Introduction\n",
"A common obstacle in the simulation world is the incompatability between the file types of different programs. Python can be used to effeciently bridge this gap. For this code, we will be converting .vasp files into LAMMPS configuration files.\n",
"\n",
"# Prerequisutes\n",
"For this code to work:\n",
"<ol>\n",
"1. This code is tailored specifically for .vasp file formats. <br>\n",
"2. The structure must be orthogonal.<br>\n",
"3. The dictionary that contains atomic masses must contain the element of interest.\n",
"</ol>\n",
"\n",
"## Atomic mass dictionary\n",
"The atomic weight for our specific atoms of interest are contained in a data structure called a dictionary. A Python dictionary is a collection which is unordered, changeable, and indexed. These structures are called with curly brackets. Items in a dictionary are saved by referring to the key name. In this example of our atomic weight dictionary, key is the element symbol and the value is the atomic weight."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"atomic_weight = {'Cu':63.546,'Zr':91.224,\n",
" 'Na':22.989, 'Cl':35.453,\n",
" 'C':12.011}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note: The Python package called **Mendeleev** is convenient for accessing various properties of elements, ions and isotopes in the periodic table of elements.\n",
"+ The package can be installed with: `pip install mendeleev`\n",
"+ The element data can be accessed by calling: `from mendeleev import element`\n",
"+ An attribute of an element (e.g. atomic mass) can be accessed by: `element('elementSymbol').attribute` (e.g. `element('Si').mass`)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define file paths\n",
"The export path is defined with the same filename as the import path."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"fileName = 'liquid_configuration_CuZr'\n",
"importPath = './input-files/' + fileName + '.POSCAR.vasp'\n",
"exportPath = './generated-data/' + fileName + '.lammps'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Coding\n",
"## Begin by reading through the .vasp file\n",
"Here we're telling Python to open the file defined by importPath to open it in a variable called File. and to read the lines in the file, saving the lines in a variable called lines."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"lines = open(importPath,'r').readlines()\n",
"# lines = File.readlines()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reformat the matrix representation\n",
"Lines 2-4 of our file tells us the size of the simulation box, which can be read as the following diagonal matrix.\n",
"\n",
"$$\\begin{bmatrix}\n",
"x & 0 & 0\\\\\n",
"0 & y & 0\\\\\n",
"0 & 0 & z\n",
"\\end{bmatrix}$$\n",
"\n",
"To save this simulation box size information, we can define one generalised line of code per dimension:<br>\n",
"> `matrix['dimension'] = float(lines[N].split()[M])` <br>\n",
"This line of code performs four operations in this order: \n",
"1. Calling line[N] that contains our matrix information, where N = {2,3,4} for .vasp files.\n",
"2. Splitting line[N] into a list of strings with the command .split()\n",
"3. Selecting item M in list[N] that is the length of the given dimension, where M = {0,1,2}\n",
"4. Converting the string into a numerical float."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"matrix = {}\n",
"matrix['x'] = float(lines[2].split()[0])\n",
"matrix['y'] = float(lines[3].split()[1])\n",
"matrix['z'] = float(lines[4].split()[2])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Save the element information\n",
"Lines 5 & 6 of a .vasp file contains the information of which elements and how many of each elements are in the structure, respectively. This information will be saved into a global dictionary called `atomsPresent`. Each element will have its own dictionary to contain the relevant information with the following properties: the atom type `atomType`, how many of each element there are `counts`, and the atomic weight `mass`."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'Cu': {'atomType': 1, 'counts': 1500, 'mass': 63.546}, 'Zr': {'atomType': 2, 'counts': 1500, 'mass': 91.224}}\n"
]
}
],
"source": [
"elements = lines[5].split()\n",
"counts = [int(x) for x in lines[6].split()]\n",
"atomsPresent = {}\n",
"for i in range(len(elements)):\n",
" key = elements[i]\n",
" atomsPresent[key] = {}\n",
" atomsPresent[key]['atomType'] = i+1\n",
" atomsPresent[key]['counts'] = counts[i]\n",
" atomsPresent[key]['mass'] = atomic_weight[key]\n",
"print(atomsPresent)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Line 8 of a .vasp file is the beginning of the atomic coordinates. Each line has extra white spaces. So for each line, we split the string into a list of strings, then join with a single whitespace. And to this string with a single whitespace, we add the three zeroes for the image flags"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"coordinates = []\n",
"for i in range(8,len(lines)):\n",
" coordinates.append(' '.join(lines[i].split()) + ' 0 0 0 \\n')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Practice printing a file\n",
"A couple notes for printing strings:\n",
"+ `\\n` introduces a line break (like pressing Return)\n",
"+ The `%` operator is used to format a set of variables enclosed in a fixed size list. Numbers are formatted with specific specifiers: `%d` decimal integer, `%f`\n",
" for displaying fixed point numbers."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"LAMMPS data input file via convert-vasp-to-data script\n",
"\n",
"\n",
"3000 atoms\n",
"\n",
"2 atom types\n",
"\n",
"\n",
"0.000000000000000 36.100000000000001 xlo xhi\n",
"\n",
"0.000000000000000 36.100000000000001 ylo yhi\n",
"\n",
"0.000000000000000 36.100000000000001 zlo zhi\n",
"\n",
"\n",
"Masses\n",
"\n",
"\n",
"1 63.546\n",
"\n",
"2 91.224\n",
"\n",
"\n",
"Atoms # atomic\n",
"\n",
"\n",
"1 1 2.7329065133084156e+00 3.7426188736951818e+00 1.2094101620064606e+00 0 0 0 \n",
"\n",
"2 1 1.3882770829287838e+00 1.8164528410009022e+00 5.8480762073571668e-03 0 0 0 \n",
"\n",
"3 2 4.9890652441194847e+00 3.8142665537358242e+00 2.2460378330431001e+00 0 0 0 \n",
"\n",
"4 2 7.1339923525651647e+00 1.0673433383986382e+00 3.0609066766632593e+00 0 0 0 \n",
"\n"
]
}
],
"source": [
"print('LAMMPS data input file via convert-vasp-to-data script\\n\\n')\n",
"print('%d atoms\\n' % (len(coordinates)))\n",
"print('%d atom types\\n\\n' % (len(atomsPresent)))\n",
"print('%1.15f %1.15f xlo xhi\\n' % (0,matrix['x']))\n",
"print('%1.15f %1.15f ylo yhi\\n' % (0,matrix['y']))\n",
"print('%1.15f %1.15f zlo zhi\\n\\n' % (0,matrix['z']))\n",
"print('Masses\\n\\n')\n",
"for i in range(len(atomsPresent)):\n",
" print('%d %1.3f\\n' % (atomsPresent[elements[i]]['atomType'],atomsPresent[elements[i]]['mass']))\n",
" \n",
"index = 0\n",
"atomID = 1\n",
"print('\\nAtoms # atomic\\n\\n')\n",
"for i in range(len(atomsPresent)):\n",
" key = elements[i]\n",
" for j in range(2):\n",
" print('%d %d ' % (atomID, atomsPresent[key]['atomType']) + coordinates[index])\n",
" index += 1\n",
" atomID += 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Writing a file\n",
"To print a file, we takke our practice `print` cell a replace print with `f.write`. Also note, that the second for loop has been replaced from `for j in range(2):` (which only prints the first two atoms of each elements) to `for j in range(atomsPresent[key]['counts'])` (which now prints all the atoms for each of the elements)."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"File has been printed to: ./generated-data/liquid_configuration_CuZr.lammps\n"
]
}
],
"source": [
"f = open(exportPath,'w')\n",
"f.write('LAMMPS data input file via convert-vasp-to-data script\\n\\n')\n",
"f.write('%d atoms\\n' % (len(coordinates)))\n",
"f.write('%d atom types\\n\\n' % (len(atomsPresent)))\n",
"f.write('%1.15f %1.15f xlo xhi\\n' % (0,matrix['x']))\n",
"f.write('%1.15f %1.15f ylo yhi\\n' % (0,matrix['y']))\n",
"f.write('%1.15f %1.15f zlo zhi\\n\\n' % (0,matrix['z']))\n",
"f.write('Masses\\n\\n')\n",
"for i in range(len(atomsPresent)):\n",
" f.write('%d %1.3f\\n' % (atomsPresent[elements[i]]['atomType'],atomsPresent[elements[i]]['mass']))\n",
" \n",
"index = 0\n",
"atomID = 1\n",
"f.write('\\nAtoms # atomic\\n\\n')\n",
"for i in range(len(atomsPresent)):\n",
" key = elements[i]\n",
" for j in range(atomsPresent[key]['counts']):\n",
" f.write('%d %d ' % (atomID, atomsPresent[key]['atomType']) + coordinates[index])\n",
" index += 1\n",
" atomID += 1\n",
"f.close()\n",
"print('File has been printed to: ',exportPath)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Introduction\n",
"A common obstacle in the simulation world is the incompatability between the file types of different programs. Python can be used to effeciently bridge this gap. In this code, we convert .vasp files into LAMMPS configuration files.\n",
"\n",
"# Prerequisites\n",
"For this code to work:\n",
"<ol>\n",
"1. This code is tailored specifically for .vasp file formats, However, it can easily be adopted for any format. <br>\n",
"2. The structure must be orthogonal.<br>\n",
"3. The dictionary that contains atomic masses must contain the element of interest.\n",
"</ol>\n",
"\n",
"# The Code\n",
"## Atomic mass dictionary\n",
"The atomic weight for our specific atoms of interest are contained in a data structure called a dictionary. A Python dictionary is a collection that is unordered, changeable, and indexed. These structures are called with curly brackets. Items in a dictionary are saved by referring to the key name. In this example of our atomic weight dictionary, the key is the element (string) and the value is the atomic weight (float)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"atomic_weight = {'Cu':63.546,'Zr':91.224,\n",
" 'Na':22.989, 'Cl':35.453,\n",
" 'C':12.011}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note: The Python package called **Mendeleev** is convenient for accessing various properties of elements, ions and isotopes in the periodic table of elements.\n",
"\n",
"+ The package can be installed using pip in command prompt: `pip install mendeleev`\n",
"+ The element data can be accessed by calling: `from mendeleev import element`\n",
"+ An attribute of an element (e.g. atomic mass) can be accessed by: `element('elementSymbol').attribute` (e.g. `element('Si').mass`)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define file paths\n",
"The export path is defined with the same filename as the import path."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"fileName = 'liquid_configuration_CuZr'\n",
"importPath = './input-files/' + fileName + '.POSCAR.vasp'\n",
"exportPath = './generated-data/' + fileName + '.lammps'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Begin by reading through the .vasp file\n",
"The line of code to read through the file performs two operations in the following order:\n",
"\n",
"1. Open and read the file located at `importPath`.<br>\n",
"2. Return a list containing each line in the file as a list item using the `readlines()` method."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"lines = open(importPath,'r').readlines()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reformat the matrix representation\n",
"Lines 2-4 of our file tells us the size of the simulation box, which can be read as the following diagonal matrix.\n",
"\n",
"$$\\begin{bmatrix}\n",
"x & 0 & 0\\\\\n",
"0 & y & 0\\\\\n",
"0 & 0 & z\n",
"\\end{bmatrix}$$\n",
"\n",
"To save this simulation box size information, we can define one generalised line of code per dimension: `matrix['dimension'] = float(lines[N].split()[M])` \n",
"\n",
"This line of code performs four operations in this order: \n",
"\n",
"1. Calling `line[N]` that contains our matrix information, where $N = \\{2,3,4\\}$ for .vasp files.\n",
"2. Splitting `line[N]` into a list of strings with the `.split()` operator.\n",
"3. Selecting item $M$ in `list[N]` that is the length of the given dimension, where $M = \\{0,1,2\\}$\n",
"4. Converting the string into a numerical float."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"matrix = {}\n",
"matrix['x'] = float(lines[2].split()[0])\n",
"matrix['y'] = float(lines[3].split()[1])\n",
"matrix['z'] = float(lines[4].split()[2])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Save the element information\n",
"Lines 5 & 6 of a .vasp file contains the information of which elements and how many of each elements are in the structure, respectively. This information will be saved into a global dictionary called `atomsPresent`. Each element will have its own dictionary to contain the relevant information with the following properties: the atom type `atomType`, how many of each element there are `counts`, and the atomic weight `mass`."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'Cu': {'atomType': 1, 'counts': 1500, 'mass': 63.546}, 'Zr': {'atomType': 2, 'counts': 1500, 'mass': 91.224}}\n"
]
}
],
"source": [
"elements = lines[5].split()\n",
"counts = [int(x) for x in lines[6].split()]\n",
"atomsPresent = {}\n",
"for i in range(len(elements)):\n",
" key = elements[i]\n",
" atomsPresent[key] = {}\n",
" atomsPresent[key]['atomType'] = i+1\n",
" atomsPresent[key]['counts'] = counts[i]\n",
" atomsPresent[key]['mass'] = atomic_weight[key]\n",
"print(atomsPresent)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Line 8 of a .vasp file is the beginning of the atomic coordinates. Each line has extra white spaces. So for each line, we split the string into a list of strings, then join with a single whitespace. And to this string with a single whitespace, we add the three zeroes for the image flags"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"coordinates = []\n",
"for i in range(8,len(lines)):\n",
" coordinates.append(' '.join(lines[i].split()) + ' 0 0 0 \\n')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Practice printing a file\n",
"A couple notes for printing strings:\n",
"\n",
"+ `\\n` introduces a line break (like pressing Return)\n",
"+ The `%` operator is used to format a set of variables enclosed in a fixed size list. Numbers are formatted with specific specifiers: `%d` decimal integer, `%f` for displaying fixed point numbers."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"LAMMPS data input file via convert-vasp-to-data script\n",
"\n",
"\n",
"3000 atoms\n",
"\n",
"2 atom types\n",
"\n",
"\n",
"0.000000000000000 36.100000000000001 xlo xhi\n",
"\n",
"0.000000000000000 36.100000000000001 ylo yhi\n",
"\n",
"0.000000000000000 36.100000000000001 zlo zhi\n",
"\n",
"\n",
"Masses\n",
"\n",
"\n",
"1 63.546\n",
"\n",
"2 91.224\n",
"\n",
"\n",
"Atoms # atomic\n",
"\n",
"\n",
"1 1 2.7329065133084156e+00 3.7426188736951818e+00 1.2094101620064606e+00 0 0 0 \n",
"\n",
"2 1 1.3882770829287838e+00 1.8164528410009022e+00 5.8480762073571668e-03 0 0 0 \n",
"\n",
"3 2 4.9890652441194847e+00 3.8142665537358242e+00 2.2460378330431001e+00 0 0 0 \n",
"\n",
"4 2 7.1339923525651647e+00 1.0673433383986382e+00 3.0609066766632593e+00 0 0 0 \n",
"\n"
]
}
],
"source": [
"print('LAMMPS data input file via convert-vasp-to-data script\\n\\n')\n",
"print('%d atoms\\n' % (len(coordinates)))\n",
"print('%d atom types\\n\\n' % (len(atomsPresent)))\n",
"print('%1.15f %1.15f xlo xhi\\n' % (0,matrix['x']))\n",
"print('%1.15f %1.15f ylo yhi\\n' % (0,matrix['y']))\n",
"print('%1.15f %1.15f zlo zhi\\n\\n' % (0,matrix['z']))\n",
"print('Masses\\n\\n')\n",
"for i in range(len(atomsPresent)):\n",
" print('%d %1.3f\\n' % (atomsPresent[elements[i]]['atomType'],atomsPresent[elements[i]]['mass']))\n",
" \n",
"index = 0\n",
"atomID = 1\n",
"print('\\nAtoms # atomic\\n\\n')\n",
"for i in range(len(atomsPresent)):\n",
" key = elements[i]\n",
" for j in range(2):\n",
" print('%d %d ' % (atomID, atomsPresent[key]['atomType']) + coordinates[index])\n",
" index += 1\n",
" atomID += 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Writing a file\n",
"To print a file, we takke our practice `print` cell a replace print with `f.write`. Also note, that the second for loop has been replaced from `for j in range(2):` (which only prints the first two atoms of each elements) to `for j in range(atomsPresent[key]['counts'])` (which now prints all the atoms for each of the elements)."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"File has been printed to: ./generated-data/liquid_configuration_CuZr.lammps\n"
]
}
],
"source": [
"f = open(exportPath,'w')\n",
"f.write('LAMMPS data input file via convert-vasp-to-data script\\n\\n')\n",
"f.write('%d atoms\\n' % (len(coordinates)))\n",
"f.write('%d atom types\\n\\n' % (len(atomsPresent)))\n",
"f.write('%1.15f %1.15f xlo xhi\\n' % (0,matrix['x']))\n",
"f.write('%1.15f %1.15f ylo yhi\\n' % (0,matrix['y']))\n",
"f.write('%1.15f %1.15f zlo zhi\\n\\n' % (0,matrix['z']))\n",
"f.write('Masses\\n\\n')\n",
"for i in range(len(atomsPresent)):\n",
" f.write('%d %1.3f\\n' % (atomsPresent[elements[i]]['atomType'],atomsPresent[elements[i]]['mass']))\n",
" \n",
"index = 0\n",
"atomID = 1\n",
"f.write('\\nAtoms # atomic\\n\\n')\n",
"for i in range(len(atomsPresent)):\n",
" key = elements[i]\n",
" for j in range(atomsPresent[key]['counts']):\n",
" f.write('%d %d ' % (atomID, atomsPresent[key]['atomType']) + coordinates[index])\n",
" index += 1\n",
" atomID += 1\n",
"f.close()\n",
"print('File has been printed to: ',exportPath)"
]
},