Commit f417b90d by Éamonn Murray

 { "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# I'll be importing these at the start of all the course notebooks.\n", "# If you copy a function definition for one of your own codes, don't\n", "# forget it likely relies on one or more of these.\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "# We'll be using several functions from scipy.optimize here.\n", "import scipy.optimize" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Finding Roots\n", "\n", "Often we will come across some complex expression and want to find where it attains a particular value. While the methods and examples I'll be showing are for finding where a function returns zero, you can find where a function $f(x)$ returns an arbitrary value $t$ by finding where $g(x) = f(x) - t = 0$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Bisection Method\n", "\n", "This is one of the simplest and most robust methods to find the root of a function, and can be useful when dealing with functions which may not have continuous first derivatives. Convergence typically requires many more iterations than more advanced methods.\n", "\n", "Say we have a continuous function and want to find $f(a) = 0$, and we already have $x_0$ and $x_1$ such that $f(x_1)$ and $f(x_0)$ have opposite sign. Then we know there exists $f(a) = 0$ for $x_0 < a < x_1$.\n", "\n", "From here, we pick $x_2 = \\frac{x_0 + x_1}{2}$ and test the sign of $f(x_2)$. If $f(x_2)$ is of opposite sign to $f(x_0)$, then we set our new interval which contains $a$ to $x_0 < a < x_2$, while if $f(x_2)$ is of opposite sign to $f(x_1)$ our new interval is $x_2 < a < x_1$. Either way, we have halved the interval where we know $a$ exists.\n", "\n", "We can repeat this process until we know $a$ to within some arbitrary precision.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### An Example\n", "\n", "Let's say we have an interatomic potential given by\n", "$$\\Phi_{12}(r) = \\frac{A}{r^{12}} - \\frac{B}{r^6},$$\n", "with $A = 1.209\\times10^5\\ \\textrm{eV Angstrom}^{12}$ and $B = 70.66\\ \\textrm{eV Angstrom}^6$ and we want to find the value of $r$ where $\\Phi_{12}(r) = 0$.\n", "\n", "We first need to construct our function:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def Phi12(r):\n", " '''Lennard-Jones Potential for Argon (eV with r in Angstrom)'''\n", " return 1.209E5*r**(-12) - 70.66*r**(-6)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First it's always useful to plot the the function to get an idea of how it looks." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "xvals = np.linspace(1, 5, 100) # 100 points between 1 and 5\n", "plt.plot(xvals, Phi12(xvals))\n", "plt.xlabel('r')\n", "plt.ylabel('Phi12(r)')\n", "plt.grid(True)\n", "plt.show()\n", "\n", "# It will likely be hard to see features at this scale.\n", "# We can use plt.axis([x1, x2, y1, y2]) before the\n", "# plt.show() command to set the axes limits." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By zooming in a little bit we can see there's a root between $r = 3.0$ and $r = 5.0$.\n", "\n", "Let's confirm the sign of our function is different at these two values:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "print(Phi12(3.0), Phi12(5.0))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now lets make a function that will use the bisection method to find the value of $r$ to within some tolerance." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# We'll use a scipy style docstring. This is a good habit to get into\n", "# as it'll let you use \"help(function_name)\" to get detailed information\n", "# on what the function does, what input it expects, and output it\n", "# generates.\n", "\n", "def bisection(func, x0, x1, maxiter, tol):\n", " '''\n", " Find the root of a function with the bisection method.\n", "\n", " Parameters\n", " ----------\n", " func : function\n", " A Python function we want to find a root of.\n", " x0, x1 : float\n", " The bounding values of the region to find the root.\n", " maxiter : integer\n", " The maximum number of iterations.\n", " tol : float\n", " Exit when the root is known within this tolerance\n", " \n", " Returns\n", " -------\n", " x : float\n", " Root of the function between x0 and x1.\n", " '''\n", " f0 = func(x0)\n", " f1 = func(x1)\n", " sign_f0 = f0 / abs(f0)\n", " sign_f1 = f1 / abs(f1)\n", " # Lets be sure we have valid x0 and x1 before proceeding.\n", " # We can use assert to raise an error if a condition isn't met.\n", " assert sign_f0 != sign_f1, \"Error: func(x0) and func(x1) do not have opposite sign.\"\n", " \n", " # It's usually a good idea to enforce a limit on how many iterations a loop can\n", " # do. Otherwise if something goes wrong, it may take some time to understand that\n", " # your code is not converging.\n", " for i in range(maxiter):\n", " x = 0.5 * (x0 + x1)\n", " # Exit when we know x within tol\n", " if abs(x-x0) < tol:\n", " # You can print out some info if you like. This is just for information\n", " # in this example. You may not want output from functions cluttering a\n", " # bigger code, and this will interfere with the use of %timeit\n", " print(\"bisection converged in\", i, \"iterations.\")\n", " return x\n", "\n", " fx = func(x)\n", " sign_fx = fx / abs(fx)\n", " # Half the range as appropriate before the next iteration\n", " if sign_fx == sign_f0:\n", " x0 = x\n", " else:\n", " x1 = x\n", "\n", " # If we get to here without returning a value, we haven't converged.\n", " # \"raise\" is another good way to get your code to exit with an informative error if\n", " # an input is not correct. There's a list of exceptions that can be raised at\n", " # https://docs.python.org/3.6/library/exceptions.html\n", " raise ValueError(\"Error: bisection failed to converge after max iterations.\") " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "help(bisection)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "bisection(func=Phi12, x0=3.0, x1=5.0, maxiter=100, tol=1e-10)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# We can use '_' to use the output from the last evaluated cell\n", "Phi12(_)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Doing this with SciPy\n", "\n", "Most numerical methods you might want to use have been implemented by someone else as a library or python module already. Python modules for potentially intensive calculations tend to be implemented in something that will run more quickly than native python, such as Fortran or C, so it is good to make use of them. Additionally libraries and modules are likely to be better optimized than code you are likely to write yourself.\n", "\n", "SciPy has an implementation of the bisection method in [scipy.optimize.bisect](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.bisect.html)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Uncomment the following for the internal help information on the function\n", "# help(scipy.optimize.bisect)\n", "\n", "scipy.optimize.bisect(f=Phi12, a=3.0, b=5.0, xtol=1e-10)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "Phi12(_)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Try using the %timeit magic to compare the scipy bisection function with the one implemented above. \n", " - Be sure to comment out the print statement in the function or %timeit will produce a lot of output." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Newton's Method\n", "\n", "This is typically quite a fast converging method, but you need to the know the function analytically. It should have a continuous first derivative, and you need a fairly good initial guess if you want to converge to the desired root.\n", "\n", "This works by using a linear expansion of the function using it's derivative at an initial guess to improve upon that guess as shown schematically in the image below:\n", "\n", "![Schematic of Newtons Method](images/newton.png)\n", "\n", "We can express this mathematically as\n", "$$x_{i+1} = x_i - \\frac{f(x_i)}{f'(x_i)}$$\n", "which we can use to make a function that will iteratively find a root.\n", "\n", "Let's make a function that employs Newton's method to we can use to find the root of $\\Phi_{12}$ as above." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def newton(func, dfunc, x0, maxiter, tol):\n", " '''\n", " Find the root of a function using Newton's method.\n", "\n", " Parameters\n", " ----------\n", " func : function\n", " A Python function we want to find a root of.\n", " dfunc : function\n", " A Python function returning the derivative of func.\n", " x0 : float\n", " Initial guess at a root.\n", " maxiter : integer\n", " The maximum number of iterations.\n", " tol : float\n", " Exit when the root is known within this tolerance\n", "\n", " Returns\n", " -------\n", " x : float\n", " Root of the function.\n", " '''\n", "\n", " for i in range(maxiter):\n", " x = x0 - func(x0)/dfunc(x0)\n", " # Exit when we know x within tol\n", " if abs(x-x0) < tol:\n", " # You can print out some info if you like. This is optional, and you may\n", " # not want output from functions cluttering a bigger code.\n", " print(\"newton converged in\", i, \"iterations.\")\n", " return x\n", " x0 = x\n", " \n", " # If we get to here without returning a value, we haven't converged.\n", " raise ValueError(\"Error: newton failed to converge after max iterations.\") " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll also need to define a function for the derivative:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def DPhi12(r):\n", " '''Derivative of Lennard-Jones Potential for Argon (eV with r in Angstrom)'''\n", " return -12*1.209E5*r**(-13) + 6*70.66*r**(-7)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's find our root, starting from 2.0" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "scrolled": true }, "outputs": [], "source": [ "newton(func=Phi12, dfunc=DPhi12, x0=2.0, maxiter=100, tol=1e-10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, this took significantly fewer iterations to obtain the same accuracy.\n", "\n", "But this method is a bit more sensitive - for example if we take a somewhat higher initial guess we quickly run into trouble:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "scrolled": true }, "outputs": [], "source": [ "newton(func=Phi12, dfunc=DPhi12, x0=4.0, maxiter=100, tol=1e-10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Doing this with SciPy\n", "\n", "Again, SciPy has an implementation of this we can use as [scipy.optimize.newton](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.newton.html)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Uncomment the following for the internal help information on the function\n", "# help(scipy.optimize.newton)\n", "\n", "scipy.optimize.newton(func=Phi12, x0=3.0, fprime=DPhi12, tol=1e-10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Secant Method\n", "\n", "It can often be tricky to generate a new function that accurately returns the derivative of another function. To get around this we can use the secant method.\n", "\n", "This works like Newton's method, but we don't need to have an analytic function for the derivative. Instead we approximate it from two previous guesses as\n", "\n", "$$f'(x_i) \\approx \\frac{f(x_i) - f(x_{i-1})}{x_i - x_{i-1}}$$\n", "\n", "which we can combine with our expression for Newton's method to get\n", "\n", "$$x_{i+1} = x_i - \\frac{f(x_i)(x_i - x_{i-1})}{f(x_i) - f(x_{i-1})} = \\frac{x_{i-1}f(x_i) - x_{i}f(x_{i-1})}{f(x_i) - f(x_{i-1})}$$" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def secant(func, x0, x1, maxiter, tol):\n", " '''\n", " Use the secant method to find a root of a function\n", " \n", " Parameters\n", " ----------\n", " func : function\n", " A Python function we want to find a root of.\n", " x0 : float\n", " Initial guess at a root.\n", " x1 : float\n", " A second guess at a root.\n", " maxiter : integer\n", " The maximum number of iterations.\n", " tol : float\n", " Exit when the root is known within this tolerance\n", "\n", " Returns\n", " -------\n", " x : float\n", " Root of the function.\n", " '''\n", "\n", " f0 = func(x0)\n", " for i in range(maxiter):\n", " f1 = func(x1)\n", " x = (x0*f1 - x1*f0)/(f1 - f0)\n", " # Exit when we know x within tol\n", " if abs(x-x1) < tol:\n", " # You can print out some info if you like. This is optional, and you may\n", " # not want output from functions cluttering a bigger code.\n", " print(\"secant converged in\", i, \"iterations.\")\n", " return x\n", " x0 = x1\n", " x1 = x\n", " f0 = f1\n", "\n", " # If we get to here without returning a value, we haven't converged.\n", " raise ValueError(\"Error: secant failed to converge after max iterations.\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "secant(func=Phi12, x0=3.0, x1=3.1, maxiter=100, tol=1e-10)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# It doesn't matter which of x0 or x1 is a better guess\n", "secant(func=Phi12, x0=3.1, x1=3.0, maxiter=100, tol=1e-10)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# It is still sensitive to the choice of initial guess though\n", "secant(func=Phi12, x0=3.0, x1=3.9, maxiter=100, tol=1e-10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Doing this with SciPy\n", "\n", "We can use the same [scipy.optimize.newton](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.newton.html) function as above to use the secant method. This method is used automatically when a derivative function is not provided. Note that the scipy version does not require a second initial guess, it automatically generates the second value by adding a small amount to the provided initial guess." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "scipy.optimize.newton(func=Phi12, x0=3.0, tol=1e-10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Brent's Methods\n", "\n", "Brent's method uses a combination of ideas from the approaches we discussed above. It decides on each iteration whether it's best to proceed with\n", "- a step of bisection\n", "- a step of secant\n", "- or a step of inverse quadratic interpolation\n", " - This is like the secant method, but uses results from three values to approximate a quadratic rather than a line.\n", "\n", "This gives you the advantages of each method and avoids their drawbacks to provide a robust and fast converging method to obtain a root in a bracketed interval.\n", "\n", "It's available in SciPy as [scipy.optimize.brentq](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.brentq.html). **This is likely the best choice of approach available when trying to find a root with Python**." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "scipy.optimize.brentq(f=Phi12, a=3.0, b=5.0, xtol=1e-10, full_output=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%timeit scipy.optimize.brentq(f=Phi12, a=2.0, b=5.0, xtol=1e-10)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "help(scipy.optimize.brentq)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" } }, "nbformat": 4, "nbformat_minor": 2 }
 { "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "# We'll be using several functions from scipy.linalg here.\n", "import scipy.linalg" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Solving Systems of Equations\n", "\n", "Let's start with a simple example, where we want to solve the following equations simultaneously:\n", "\n", "$$\n", "3x_1 + 5x_2 + 2x_3 = 11\\\\\n", "2x_1 - 3x_2 - 3x_3 = -1\\\\\n", "x_1 + x_2 + x_3 = 2\n", "$$\n", "\n", "This would be equivalent to solving the matrix equation\n", "\n", "$$\n", "\\mathbf{Mx}=\\mathbf{b}\n", "$$\n", "where\n", "$$\n", "\\mathbf{M} = \\begin{pmatrix} 3 & 5 & 2 \\\\ 2 & -3 & -3 \\\\ 1 & 1 & 1 \\end{pmatrix},\\quad \n", "\\mathbf{x} = \\begin{pmatrix} x_1 \\\\ x_2 \\\\ x_3 \\end{pmatrix}, \\quad\n", "\\mathbf{b} = \\begin{pmatrix} 11 \\\\ -1 \\\\ 2 \\end{pmatrix}.\n", "$$\n", "\n", "In this case of a system with just three equations, this is not difficult to do by hand, but often we'll need to be able to solve systems with an arbitrarily large number of equations and variables.\n", "\n", "More generally a system of $n$ equations of $n$ unknowns, such as\n", "\n", "$$\n", "m_{11} x_1 + m_{12} x_2 + \\cdots + m_{1n}x_n = b_1 \\\\\n", "m_{21} x_1 + m_{22} x_2 + \\cdots + m_{2n}x_n = b_2 \\\\\n", "\\qquad \\vdots \\qquad \\qquad \\qquad \\qquad \\vdots \\\\\n", "m_{n1} x_1 + m_{n2} x_2 + \\cdots + m_{nn}x_n = b_n\n", "$$\n", "would be equivalent to solving the matrix equation\n", "\n", "$$\n", "\\mathbf{Mx}=\\mathbf{b}\n", "$$\n", "where\n", "$$\n", "\\mathbf{M} = \\begin{pmatrix}m_{11} & m_{12} & \\cdots & m_{1n} \\\\\n", " m_{21} & m_{22} & \\cdots & m_{2n} \\\\\n", " \\vdots & \\vdots & \\ddots & \\vdots \\\\\n", " m_{n1} & m_{n2} & \\cdots & m_{nn}\n", " \\end{pmatrix},\\quad \n", "\\mathbf{x} = \\begin{pmatrix} x_1 \\\\ x_2 \\\\ \\vdots \\\\ x_n \\end{pmatrix}, \\quad\n", "\\mathbf{b} = \\begin{pmatrix} b_1 \\\\ b_2 \\\\ \\vdots \\\\ b_n \\end{pmatrix}.\n", "$$\n", "\n", "So we need a systematic way to do this.\n", "\n", "When might it not be possible to do this?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gaussian Elimination\n", "\n", "The usual approach to solving this kind of linear system is to use an algorithm called Gaussian Elimination. This can be done with the following procedure\n", "\n", "- Combine the solution vector with matrix of parameters to form the augmented matrix $\\mathbf{\\tilde{M}}$ as\n", "\n", "$$\\mathbf{\\tilde{M}} = \\begin{pmatrix}m_{11} & m_{12} & \\cdots & m_{1n} & b_1\\\\\n", " m_{21} & m_{22} & \\cdots & m_{2n} & b_2\\\\\n", " \\vdots & \\vdots & \\ddots & \\vdots & \\vdots\\\\\n", " m_{n1} & m_{n2} & \\cdots & m_{nn} & b_n \\end{pmatrix}$$\n", " \n", "- Perform row reduction on this matrix such that $\\mathbf{M}$ is transformed to an upper triangular form:\n", "\n", "$$\\mathbf{\\tilde{M}'} = \\begin{pmatrix}m_{11}' & m_{12}' & \\cdots & m_{1n}' & b_1'\\\\\n", " 0 & m_{22}' & \\cdots & m_{2n}' & b_2'\\\\\n", " \\vdots & \\vdots & \\ddots & \\vdots & \\vdots\\\\\n", " 0 & 0 & \\cdots & m_{nn}' & b_n' \\end{pmatrix}$$\n", "\n", "- Row reduction consists of the following operations, which are repeated as necessary.\n", " 1. Multiply any row by a constant.\n", " 2. Add any row to another row.\n", " 3. Swap the order of any two rows.\n", "\n", "Let's do this now for the simple example we gave above:\n", "\n", "We start with\n", "$$\n", "\\mathbf{\\tilde{M}}_0 = \\begin{pmatrix} 3 & 5 & 2 & 11\\\\ 2 & -3 & -3 & -1\\\\ 1 & 1 & 1 & 2\\end{pmatrix}\n", "$$\n", "\n", "We want to have zeros below some diagonal elements, so let's use the following procedure:\n", "- Given a diagonal element $m_{ii}$, to set element $m_{jk}$ to zero where $j>i$ (row) and \$k