Commit ca8a4685 authored by Éamonn Murray

Fix some typos

parent da6ff2e4
@@ -29,7 +29,7 @@
    "source": [
     "## Linear Least Squares Fitting\n",
     "\n",
-    "Say we have sampled 10 points, and want to find the equation of the straight line that passes through all 10 points. The equation of a line has two parameters: $y = a_0 + a_1 x$. Each sampled data point can be used to give us an equation that could be used to find what the parameters of a straight line passing through that point are. But, we only have 2 parameters, but will obtain 10 equations from our sampled points. This means that our system is overdetermined. The chance that 10 sampled points will fall on the same line is effectively zero unless they have been chosen to do so (and so are not in fact sampled), so there is no way to solve this exactly.\n",
+    "Say we have sampled 10 points, and want to find the equation of the straight line that passes through all 10 points. The equation of a line has two parameters: $y = a_0 + a_1 x$. Each sampled data point can be used to give us an equation that could be used to find what the parameters of a straight line passing through that point are. We only have 2 parameters, but will obtain 10 equations from our sampled points. This means that our system is overdetermined. The chance that 10 sampled points will fall on the same line is effectively zero unless they have been chosen to do so (and so are not in fact sampled), so there is no way to solve this exactly.\n",
     "\n",
     "Instead, we want to do the best we can and find the line that passes most closely to each of the sampled points. There are a number of ways we could choose to do this. The most commonly used approach is to find the deviation from the line at each of our sampled points, and find the set of parameters that minimize the total of the squares of these deviations, hence the term least squares fitting. This particular approach was first used by Gauss in determining the orbits of comets, and he showed that the least squares estimate coincides with maximum likelihood estimates for independent normally distributed errors.\n",
     "\n",
@@ -52,7 +52,9 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {},
+   "metadata": {
+    "collapsed": true
+   },
    "outputs": [],
    "source": [
     "# Let's create some M random data points as we did for the interpolation\n",
@@ -76,7 +78,7 @@
     "$$ \\frac{\\partial D}{\\partial a_0} = 0 = 2 \\Sigma_{i=1}^M (a_0 + a_1 x_i - y_i) $$\n",
     "and\n",
     "$$ \\frac{\\partial D}{\\partial a_1} = 0 = 2 \\Sigma_{i=1}^M (a_0 + a_1 x_i - y_i) x_i. $$\n",
-    "When we fill in our x and y values in the expression for $D$, so we'll have two equations in two unknowns ($a_0$ and $a_1$). This is a linear system which we have already learned how to solve.\n",
+    "When we fill in our x and y values in the expression for $D$, we'll have two equations in two unknowns ($a_0$ and $a_1$). This is a linear system which we have already learned how to solve.\n",
     "\n",
     "Gathering the coefficients and simplifying (leaving out the summation limits to declutter) we get\n",
     "$$ M a_0 + a_1 \\Sigma x_i = \\Sigma y_i $$\n",
@@ -88,7 +90,9 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {},
+   "metadata": {
+    "collapsed": true
+   },
    "outputs": [],
    "source": [
     "# Recall we know how to solve the linear system Ax=b\n",
@@ -186,7 +190,9 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {},
+   "metadata": {
+    "collapsed": true
+   },
    "outputs": [],
    "source": [
     "# We need to first get our data in the form expected by our function.\n",
@@ -316,7 +322,9 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {},
+   "metadata": {
+    "collapsed": true
+   },
    "outputs": [],
    "source": [
     "# Let's reproduce our fit with np.polyfit.\n",
@@ -421,7 +429,9 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {},
+   "metadata": {
+    "collapsed": true
+   },
    "outputs": [],
    "source": [
     "# First let's load our data and plot it.\n",
@@ -452,7 +462,9 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {},
+   "metadata": {
+    "collapsed": true
+   },
    "outputs": [],
    "source": [
     "# We'll need some initial guesses for the fit.\n",
@@ -472,7 +484,9 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {},
+   "metadata": {
+    "collapsed": true
+   },
    "outputs": [],
    "source": [
     "xplt = np.linspace(si_EvV[:, 0].min(), si_EvV[:, 0].max(), 100)\n",
@@ -514,7 +528,9 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {},
+   "metadata": {
+    "collapsed": true
+   },
    "outputs": [],
    "source": [
     "def multi_f(x):\n",
...