"# I'll be importing these at the start of all the course notebooks.\n",
"# If you copy a function definition for one of your own codes, don't\n",
"# forget it likely relies on one or more of these.\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# We'll be using several functions from scipy.optimize here.\n",
"import scipy.optimize"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Finding Roots\n",
"\n",
"Often we will come across some complex expression and want to find where it attains a particular value. While the methods and examples I'll be showing are for finding where a function returns zero, you can find where a function $f(x)$ returns an arbitrary value $t$ by finding where $g(x) = f(x) - t = 0$."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The Bisection Method\n",
"\n",
"This is one of the simplest and most robust methods to find the root of a function, and can be useful when dealing with functions which may not have continuous first derivatives. Convergence typically requires many more iterations than more advanced methods.\n",
"\n",
"Say we have a continuous function and want to find $f(a) = 0$, and we already have $x_0$ and $x_1$ such that $f(x_1)$ and $f(x_0)$ have opposite sign. Then we know there exists $f(a) = 0$ for $x_0 < a < x_1$.\n",
"\n",
"From here, we pick $x_2 = \\frac{x_0 + x_1}{2}$ and test the sign of $f(x_2)$. If $f(x_2)$ is of opposite sign to $f(x_0)$, then we set our new interval which contains $a$ to $x_0 < a < x_2$, while if $f(x_2)$ is of opposite sign to $f(x_1)$ our new interval is $x_2 < a < x_1$. Either way, we have halved the interval where we know $a$ exists.\n",
"\n",
"We can repeat this process until we know $a$ to within some arbitrary precision.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### An Example\n",
"\n",
"Let's say we have an interatomic potential given by\n",
"with $A = 1.209\\times10^5\\ \\textrm{eV Angstrom}^{12}$ and $B = 70.66\\ \\textrm{eV Angstrom}^6$ and we want to find the value of $r$ where $\\Phi_{12}(r) = 0$.\n",
"\n",
"We first need to construct our function:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def Phi12(r):\n",
" '''Lennard-Jones Potential for Argon (eV with r in Angstrom)'''\n",
" return 1.209E5*r**(-12) - 70.66*r**(-6)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First it's always useful to plot the the function to get an idea of how it looks."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"xvals = np.linspace(1, 5, 100) # 100 points between 1 and 5\n",
"plt.plot(xvals, Phi12(xvals))\n",
"plt.xlabel('r')\n",
"plt.ylabel('Phi12(r)')\n",
"plt.grid(True)\n",
"plt.show()\n",
"\n",
"# It will likely be hard to see features at this scale.\n",
"# We can use plt.axis([x1, x2, y1, y2]) before the\n",
"# plt.show() command to set the axes limits."
]
},
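{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, we can use `plt.axis` to zoom in near the axis crossing (the limits used here are just one reasonable choice for this function, not values you need to use):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"plt.plot(xvals, Phi12(xvals))\n",
"plt.xlabel('r')\n",
"plt.ylabel('Phi12(r)')\n",
"plt.grid(True)\n",
"# Zoom in on the region where the function crosses zero\n",
"plt.axis([3, 5, -0.01, 0.01])\n",
"plt.show()"
]
},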
{
"cell_type": "markdown",
"metadata": {},
"source": [
"By zooming in a little bit we can see there's a root between $r = 3.0$ and $r = 5.0$.\n",
"\n",
"Let's confirm the sign of our function is different at these two values:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"print(Phi12(3.0), Phi12(5.0))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now lets make a function that will use the bisection method to find the value of $r$ to within some tolerance."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# We'll use a scipy style docstring. This is a good habit to get into\n",
"# as it'll let you use \"help(function_name)\" to get detailed information\n",
"# on what the function does, what input it expects, and output it\n",
"# generates.\n",
"\n",
"def bisection(func, x0, x1, maxiter, tol):\n",
" '''\n",
" Find the root of a function with the bisection method.\n",
"\n",
" Parameters\n",
" ----------\n",
" func : function\n",
" A Python function we want to find a root of.\n",
" x0, x1 : float\n",
" The bounding values of the region to find the root.\n",
" maxiter : integer\n",
" The maximum number of iterations.\n",
" tol : float\n",
" Exit when the root is known within this tolerance\n",
" \n",
" Returns\n",
" -------\n",
" x : float\n",
" Root of the function between x0 and x1.\n",
" '''\n",
" f0 = func(x0)\n",
" f1 = func(x1)\n",
" sign_f0 = f0 / abs(f0)\n",
" sign_f1 = f1 / abs(f1)\n",
" # Lets be sure we have valid x0 and x1 before proceeding.\n",
" # We can use assert to raise an error if a condition isn't met.\n",
" assert sign_f0 != sign_f1, \"Error: func(x0) and func(x1) do not have opposite sign.\"\n",
" \n",
" # It's usually a good idea to enforce a limit on how many iterations a loop can\n",
" # do. Otherwise if something goes wrong, it may take some time to understand that\n",
" # your code is not converging.\n",
" for i in range(maxiter):\n",
" x = 0.5 * (x0 + x1)\n",
" # Exit when we know x within tol\n",
" if abs(x-x0) < tol:\n",
" # You can print out some info if you like. This is just for information\n",
" # in this example. You may not want output from functions cluttering a\n",
" # bigger code, and this will interfere with the use of %timeit\n",
" print(\"bisection converged in\", i, \"iterations.\")\n",
" return x\n",
"\n",
" fx = func(x)\n",
" sign_fx = fx / abs(fx)\n",
" # Half the range as appropriate before the next iteration\n",
" if sign_fx == sign_f0:\n",
" x0 = x\n",
" else:\n",
" x1 = x\n",
"\n",
" # If we get to here without returning a value, we haven't converged.\n",
" # \"raise\" is another good way to get your code to exit with an informative error if\n",
" # an input is not correct. There's a list of exceptions that can be raised at\n",
"# We can use '_' to use the output from the last evaluated cell\n",
"Phi12(_)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Doing this with SciPy\n",
"\n",
"Most numerical methods you might want to use have been implemented by someone else as a library or python module already. Python modules for potentially intensive calculations tend to be implemented in something that will run more quickly than native python, such as Fortran or C, so it is good to make use of them. Additionally libraries and modules are likely to be better optimized than code you are likely to write yourself.\n",
"\n",
"SciPy has an implementation of the bisection method in [`scipy.optimize.bisect`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.bisect.html)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Uncomment the following for the internal help information on the function\n",
"- Try using the `%timeit` magic to compare the scipy bisection function with the one implemented above. \n",
" - Be sure to comment out the print statement in the function or `%timeit` will produce a lot of output."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Newton's Method\n",
"\n",
"This is typically quite a fast converging method, but you need to the know the function analytically. It should have a continuous first derivative, and you need a fairly good initial guess if you want to converge to the desired root.\n",
"\n",
"This works by using a linear expansion of the function using it's derivative at an initial guess to improve upon that guess as shown schematically in the image below:\n",
"\n",
"![Schematic of Newtons Method](images/newton.png)\n",
"Again, SciPy has an implementation of this we can use as [`scipy.optimize.newton`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.newton.html)."
]
},
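{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sketch of using this with our $\\Phi_{12}$ example (the hand-coded derivative function and the starting guess of 3.5 here are our own choices, not something fixed by the method), we can pass the derivative in with the `fprime` argument:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def dPhi12(r):\n",
"    '''Derivative of the Phi12 potential (eV/Angstrom with r in Angstrom)'''\n",
"    return -12*1.209E5*r**(-13) + 6*70.66*r**(-7)\n",
"\n",
"# Newton's method needs a reasonable starting guess to find the root we want\n",
"scipy.optimize.newton(Phi12, 3.5, fprime=dPhi12)"
]
},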
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Uncomment the following for the internal help information on the function\n",
"It can often be tricky to generate a new function that accurately returns the derivative of another function. To get around this we can use the secant method.\n",
"\n",
"This works like Newton's method, but we don't need to have an analytic function for the derivative. Instead we approximate it from two previous guesses as\n",
"We can use the same [`scipy.optimize.newton`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.newton.html) function as above to use the secant method. This method is used automatically when a derivative function is not provided. Note that the scipy version does not require a second initial guess, it automatically generates the second value by adding a small amount to the provided initial guess."
"Brent's method uses a combination of ideas from the approaches we discussed above. It decides on each iteration whether it's best to proceed with\n",
"- a step of bisection\n",
"- a step of secant\n",
"- or a step of inverse quadratic interpolation\n",
" - This is like the secant method, but uses results from three values to approximate a quadratic rather than a line.\n",
"\n",
"This gives you the advantages of each method and avoids their drawbacks to provide a robust and fast converging method to obtain a root in a bracketed interval.\n",
"\n",
"It's available in SciPy as [`scipy.optimize.brentq`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.brentq.html). **This is likely the best choice of approach available when trying to find a root with Python**."
"In this case of a system with just three equations, this is not difficult to do by hand, but often we'll need to be able to solve systems with an arbitrarily large number of equations and variables.\n",
"\n",
"More generally a system of $n$ equations of $n$ unknowns, such as\n",
"The usual approach to solving this kind of linear system is to use an algorithm called Gaussian Elimination. This can be done with the following procedure\n",
"\n",
"- Combine the solution vector with matrix of parameters to form the augmented matrix $\\mathbf{\\tilde{M}}$ as\n",
"We want to have zeros below some diagonal elements, so let's use the following procedure:\n",
"- Given a diagonal element $m_{ii}$, to set element $m_{jk}$ to zero where $j>i$ (row) and $k<i$ (column), multiply row $i$ by $-m_{jk}/m_{ii}$ and add it to row $j$.\n",
"\n",
"Here we want to have zeros below the 3 in the first row. In the second row the value in that spot is 2, so we can multiply the first row by -2/3 and add it to the second to give a zero in that position. And similarly, we can multiply the first row by -1/3 and add it to the third to get a zero in that position in the third row:\n",