......@@ -65,7 +65,7 @@ The project contains the following directories:
2. `Slides`: notebook source code for the lecture slides.
3. `Notebooks`: Exercises and notebooks.
#### How to create slides
#### How to create slides
To view the slides, run in a shell:
```bash
......@@ -75,4 +75,9 @@ The project contains the following directories:
To create a PDF of the slides, add `?print-pdf` to the URL of the served slides, e.g.: `http://localhost:8000/[SLIDES TITLE].slides.html?print-pdf`
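
As a small illustration of the URL convention above, the sketch below appends the `print-pdf` query string to a slides URL using only the Python standard library. The helper name `print_pdf_url` and the file name `Lecture-01.slides.html` are hypothetical, chosen for the example; substitute the title of the slides you are actually serving.

```python
from urllib.parse import urlsplit, urlunsplit


def print_pdf_url(slides_url: str) -> str:
    """Return slides_url with its query string set to `print-pdf`."""
    parts = urlsplit(slides_url)
    # Keep scheme, host and path; replace the query with `print-pdf`.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, "print-pdf", parts.fragment)
    )


# Hypothetical slides title, assuming the slides are served on localhost:8000.
print(print_pdf_url("http://localhost:8000/Lecture-01.slides.html"))
```

Opening the resulting URL in a browser and printing the page to a file is typically enough to obtain the PDF.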
#### Exercises
Exercises are in the `Notebooks` directory; solutions are included. Exercise 06 is optional: rather than a set of exercises, it is a step-by-step tutorial on neural networks.
\ No newline at end of file
......@@ -89,7 +89,7 @@
},
"source": [
"But where do the files of a package actually reside? \n",
"When an import statement is done there are several paths where the package is searched for (similarly to how `PATH` or `LD_LIBRARY_PATH` search paths work for binaries and libraries)."
"When an import statement is executed, the package is searched for in several paths (similar to how the `PATH` and `LD_LIBRARY_PATH` search paths work for binaries and libraries on Linux)."
]
},
{
......@@ -261,7 +261,7 @@
}
},
"source": [
"`pip` is indeed a more modern and flexible way to interact with PyPI. It usually comes with all python distributions and thus you do not need to install. The command line utility allows for installation/removal of packages, for example to install the package `numpy` for the whole system you can do:\n",
"`pip` is a more flexible way to interact with PyPI. It usually ships with Python distributions, so you do not need to install it separately. The command-line utility installs and removes packages; for example, to install the package `numpy` for the whole system you can run:\n",
"```bash\n",
"#Don't do this\n",
"sudo pip install numpy\n",
......@@ -366,8 +366,8 @@
"The [Anaconda distribution](https://anaconda.org/) is maintained by a private company (Anaconda Inc.) and provides a free and open-source distribution tailored to data science. \n",
"\n",
"Like pip/virtualenv, it provides a package and environment manager.\n",
" * Linux, MacOS and Windows are supported \n",
" * The supported is not limited to python, but also notably R and in general any package (e.g. Qt, GCC,...)\n",
" * Linux, MacOS and Windows are all supported \n",
" * Support is not limited to Python: it notably covers R and, in general, any binary package (e.g. Qt, GCC,...)\n",
"\n",
"<img src=\"anaconda.png\">"
]
......@@ -466,11 +466,11 @@
" * Improved command line navigation (similar to a shell/terminal)\n",
" * Syntax highlighting\n",
" * Auto completion: press `Tab`-key with an incomplete word/command to see suggestions\n",
" * Call system program from interpreter with `!` (e.g.: `!pwd`). Note the form `mydir = !pwd` is supported.\n",
" * Call system programs from the interpreter with `!` (e.g.: `!pwd`). Note the form `mydir = !pwd`, which captures the command output in a variable\n",
" * Improved history handling, including: type the first characters of an old command, then press the `Up` key to complete the line with the most recent match\n",
" * Retrieve the last computed result with `_` or with `_<N>` for output *N*\n",
" * *Magic* functions, extensions to IPython that can improve interactive sessions. Some examples:\n",
" * `%magic` help of magic subsystem itself\n",
" * `%magic` help on the magic subsystem itself\n",
" * `%timeit python-code-goes-here` will time the python line, repeating it a large number of times to improve precision\n",
" * `%bookmark` create *favorite* folders to easily cd into them\n",
" * `%cd` change the current directory\n",
......@@ -572,7 +572,7 @@
"Jupyter is very popular and several ways to share notebooks exist. Note that when a notebook is executed, the output of code cells is stored in the notebook file itself, so it can be rendered without re-running the code:\n",
" * GitLab and GitHub render a notebook as expected: [example](https://gitlab.com/andreadotti/pyalghero2019/blob/master/Slides/Exercise-01-Solution.ipynb)\n",
" * They are based on [nbviewer](https://nbviewer.jupyter.org/)\n",
" * Online services provide interactive execution of notebooks on premise/cloud resources ([MyBinder](https://mybinder.org/), [Microsoft Azure](https://notebooks.azure.com/))\n",
" * Online services provide interactive execution of notebooks using on-premises or cloud resources ([MyBinder](https://mybinder.org/), [Microsoft Azure](https://notebooks.azure.com/), [Google Colaboratory](https://colab.research.google.com/notebooks/welcome.ipynb))\n",
" \n",
" <div class=\"myrow\">\n",
" <div class=\"mycolumn\"><img src=\"jupyter_2.png\" style=\"width:100%\"></div>\n",
......@@ -591,10 +591,17 @@
"source": [
"<img src=\"docker.png\" style=\"width:20%\">\n",
"\n",
"Sharing of notebooks often requires writing using containers.\n",
"Check out [this project](https://jupyter-docker-stacks.readthedocs.io/en/latest/index.html) if you need\n",
"Sharing of notebooks often requires writing and using containers.\n",
"Check out [this project](https://jupyter-docker-stacks.readthedocs.io/en/latest/index.html) if you need them.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
......
......@@ -27,8 +27,8 @@
"</style>\n",
"\n",
"# Foreword\n",
"We have seen what a jupyter notebook is. \n",
"As a matter of fact these slides are a notebook converted to (HTML) slides via the `jupyter-notebook` utility. \n",
"We have seen the use of jupyter notebooks. \n",
"As a matter of fact these slides are a notebook converted to (HTML) slides via the `jupyter-nbconvert` utility. \n",
"Get the code from GitLab at https://gitlab.com/andreadotti/pyalghero2019 by running, in a new terminal:\n",
"```bash\n",
"git clone https://gitlab.com/andreadotti/pyalghero2019\n",
......@@ -87,7 +87,7 @@
" * Fourier Transforms ([`scipy.fftpack`](https://docs.scipy.org/doc/scipy/reference/tutorial/fftpack.html))\n",
" * Signal Processing ([`scipy.signal`](https://docs.scipy.org/doc/scipy/reference/tutorial/signal.html))\n",
" * Linear Algebra ([`scipy.linalg`](https://docs.scipy.org/doc/scipy/reference/tutorial/linalg.html))\n",
" * Sparse Eigenvalue Problems with [ARPACK](https://docs.scipy.org/doc/scipy/reference/tutorial/arpack.html))\n",
" * Sparse Eigenvalue Problems with [ARPACK](https://docs.scipy.org/doc/scipy/reference/tutorial/arpack.html)\n",
" * Compressed Sparse Graph Routines ([`scipy.sparse.csgraph`](https://docs.scipy.org/doc/scipy/reference/tutorial/csgraph.html))\n",
" * Spatial data structures and algorithms ([`scipy.spatial`](https://docs.scipy.org/doc/scipy/reference/tutorial/spatial.html))\n",
" * Statistics ([`scipy.stats`](https://docs.scipy.org/doc/scipy/reference/tutorial/stats.html))\n",
......@@ -116,7 +116,7 @@
"source": [
"## Numpy\n",
"It is the foundation of the Python scientific stack. \n",
"The basic building block is the `numpy.array` data structure. It can be often used as a python list of numbers, but it is a specialized efficient way of manipulating numbers in python."
"The basic building block is the `numpy.array` data structure. It can be used like a Python list of numbers, but it is a specialized, efficient structure for manipulating numbers in Python."
]
},
{
......@@ -1936,11 +1936,11 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 2,
"metadata": {
"ExecuteTime": {
"end_time": "2019-05-22T05:22:59.271351Z",
"start_time": "2019-05-22T05:22:58.329324Z"
"end_time": "2019-05-26T14:21:00.688791Z",
"start_time": "2019-05-26T14:21:00.671802Z"
},
"slideshow": {
"slide_type": "subslide"
......@@ -1950,15 +1950,15 @@
{
"data": {
"text/plain": [
"0 1.0\n",
"1 2.0\n",
"2 3.0\n",
"3 NaN\n",
"4 5.0\n",
"a 1.0\n",
"b 2.0\n",
"c 3.0\n",
"d NaN\n",
"e 5.0\n",
"dtype: float64"
]
},
"execution_count": 1,
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
......@@ -1967,7 +1967,7 @@
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"s = pd.Series( [1., 2., 3., np.nan, 5. ])\n",
"s = pd.Series( [1., 2., 3., np.nan, 5. ], index=[\"a\",\"b\",\"c\",\"d\",\"e\"])\n",
"s"
]
},
......@@ -2063,6 +2063,46 @@
"df"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Reading/Saving dataframes\n",
"Pandas supports reading from and writing to many data formats via specialized routines, because dataframes (under other names) are a common concept:"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"source": [
"| Format Type | Data Description | Reader | Writer |\n",
"|--------------|-------------------|------------------|-----------------|\n",
"| text | CSV | `read_csv` | `to_csv` |\n",
"| text | JSON | `read_json` | `to_json` |\n",
"| text | HTML | `read_html` | `to_html` |\n",
"| text | Local clipboard | `read_clipboard` | `to_clipboard` |\n",
"| binary | MS Excel | `read_excel` | `to_excel` |\n",
"| binary | HDF5 Format | `read_hdf` | `to_hdf` |\n",
"| binary | Feather Format | `read_feather` | `to_feather` |\n",
"| binary | Parquet Format | `read_parquet` | `to_parquet` |\n",
"| binary | Msgpack | `read_msgpack` | `to_msgpack` |\n",
"| binary | Stata | `read_stata` | `to_stata` |\n",
"| binary | SAS | `read_sas` | |\n",
"| binary | Pickle Format | `read_pickle` | `to_pickle` |\n",
"| SQL | SQL | `read_sql` | `to_sql` |\n",
"| SQL | Google Big Query | `read_gbq` | `to_gbq` |\n",
"\n",
"As you can see, the physicists' *ROOT* format is not natively supported. However, external packages for reading `TTree`s are available, for example [`root_numpy`](http://scikit-hep.org/root_numpy/), [`root_pandas`](https://github.com/scikit-hep/root_pandas), or [`uproot`](https://github.com/scikit-hep/uproot). ROOT usually comes with the `pyROOT` library pre-installed (the one on the provided VM works only with python2), which offers basic functionality."
]
},
{
"cell_type": "code",
"execution_count": 12,
......