---
hideToc: true
---

## Announcements:

* July 2019: The call for abstracts has opened for the **1st Berlin Conference on Data Science Education**. For more information, click [here](conference.md).
* April 2019: Workshop on the **23rd** of **May** **2019**. For more information, click [here](workshop.md).

## Purpose of this Website

The goal of [deep.TEACHING](https://www.htw-berlin.de/forschung/online-forschungskatalog/projekte/projekt/?eid=2481) is to improve the qualification of students at HTW Berlin - University of Applied Sciences, partnering companies, as well as external users and students in the machine learning domain. The project aims to provide suitable teaching materials for bachelor's and master's programs that impart both a theoretical foundation and practical experience.

The [project website](https://www.deep-teaching.org/) provides statically rendered HTML pages to preview the educational materials. To interact with the notebooks and run code, clone the [educational-materials](https://gitlab.com/deep.TEACHING/educational-materials) Git repository to your own computer (see [How to use deep.TEACHING Notebooks](#how-to-use-deepteaching-notebooks) for detailed instructions).

## Teaching Material

All teaching material is contextually grouped in one or more scenarios and / or courses and can be found at [Scenarios & Courses](scenarios-courses.md).

All code samples and exercises are written in Python, the most commonly used programming language in this domain. Code and documentation are embedded in [Jupyter](http://jupyter.org/) notebooks to provide an interactive and explorative environment. Jupyter notebooks store source code and Markdown-formatted documentation in executable *cells*. This approach allows us to create a narrative by splitting up complex algorithms into small, digestible pieces. We recommend [JupyterLab](http://jupyterlab.readthedocs.io/en/stable/getting_started/overview.html), a web-based data science IDE, to work with Jupyter notebooks.
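
As a rough illustration of the cell structure, the sketch below builds a tiny notebook in memory, using the top-level `cells` list and the per-cell `cell_type` and `source` fields of the Jupyter notebook JSON format (version 4), and then counts the cells by type. The notebook content and the helper name `count_cell_types` are made up for this example and are not taken from the deep.TEACHING materials.

```python
import json
from collections import Counter

# A hand-written, minimal notebook (illustrative content only).
minimal_notebook = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Linear regression\n", "A short explanation.\n"],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print('hello')\n"],
        },
    ],
}


def count_cell_types(notebook):
    """Return how many cells of each type a notebook dictionary contains."""
    return Counter(cell["cell_type"] for cell in notebook["cells"])


# Round-trip through JSON, which is how .ipynb files are stored on disk.
serialized = json.dumps(minimal_notebook)
counts = count_cell_types(json.loads(serialized))
print(dict(counts))
```

To inspect a real notebook from the repository, the same helper could be applied to the result of `json.load` on an opened `.ipynb` file.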

### Scenarios

We are developing three real-world scenarios to focus the educational materials on relevant applications: Medical Image Classification, Robotic / Autonomous Driving, and Text Information Extraction.

The following three pages provide a compact introduction to each of the scenarios. Get started with these introductions and choose a topic you are interested in.

* [Medical Image Classification](courses/medical-image-classification.md)
* [Robotic Autonomous Driving](courses/robotic-autonomous-driving.md)
* [Natural Language Processing](courses/natural-language-processing.md)

### Courses

A course is defined as a set of notebooks that have something in common, e.g.:

* The notebooks in one course develop your understanding of more complex algorithms step by step (e.g. from linear regression to logistic regression and then to deep neural networks).
* The notebooks in one course might be about a certain algorithm (e.g. logistic regression) and how to implement it using different libraries (e.g. NumPy, TensorFlow, PyTorch, PyMC).
* The notebooks in one course are about different algorithms, but all are to be implemented using the same library.



### Solutions

Some of the provided notebooks contain exercises without any solutions. If you are a teacher, you can request access to the solutions in a [private repository](https://gitlab.com/deep.TEACHING/educational-materials-solutions).

### Feedback for Self-Study

Although the sample solutions are only accessible to teachers on request, we try to give self-learners feedback in one of two ways: software tests that compare your final result with the true solution, or pictures showing what a visualization of your results should look like (e.g. a decreasing error during training or the final decision boundary of a classifier). In the case of software tests, your solution is rounded and hashed and then compared with the hash of the correct solution, so you cannot unintentionally spoil the answer for yourself by looking at the test. The following code snippet illustrates the hashing process:

```python  
# Exercise: your implementation
your_result = your_implemented_function(x, y)

# If your implementation is correct, the assert below does not raise an exception
assert round_and_hash(your_result) == '52ca17a7de673a7e78903f6a8ea91a0c'
```
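
To make the idea concrete, here is a minimal sketch of what such a helper could look like. It is not the project's actual `round_and_hash` implementation: the name `round_and_hash_sketch`, the `precision` parameter, the fixed-decimal string format and the choice of MD5 are all assumptions made for illustration, so the digests it produces will not necessarily match the ones used in the notebooks.

```python
import hashlib


def round_and_hash_sketch(value, precision=4):
    """Hypothetical stand-in: round a number and return a hex digest of it.

    Rounding first means tiny floating-point differences between two
    correct implementations do not change the resulting hash.
    """
    # Adding 0.0 turns a possible -0.0 into 0.0 so both format identically.
    rounded = round(float(value), precision) + 0.0
    # Fixed-decimal formatting gives one canonical string per rounded value.
    text = f"{rounded:.{precision}f}"
    # MD5 is an assumption here; any stable hash function would do.
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Two results that agree after rounding yield the same digest.
same = round_and_hash_sketch(0.1 + 0.2) == round_and_hash_sketch(0.3)
# A genuinely different result yields a different digest.
different = round_and_hash_sketch(0.31) != round_and_hash_sketch(0.3)
print(same, different)
```

Because only the digest is stored in the notebook, reading the test reveals whether your result is right without revealing the expected value itself.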


### Errors, Misspellings and Bugs

If you find any mistakes, including misspelled words, we would be very grateful if you could open an issue [here](https://gitlab.com/deep.TEACHING/educational-materials/issues).



## Questions?

If you have questions, please open a [GitLab issue](https://gitlab.com/deep.TEACHING/educational-materials/issues).



## Reviews

All notebooks go through a two-tier review process. The first stage is called *Minor Review* and is usually carried out by a research assistant in the deep.TEACHING team. The second stage is a *Major Review*, carried out by a more experienced machine learning expert or professor. The review state of each notebook can be inspected in the extensive [notebooks review list](./review-list.md), and a detailed description of the two-tier process can be found in the [Review Guide](review-guide.md).


## How to use deep.TEACHING Notebooks

This tutorial explains how to use the Jupyter notebook teaching materials on your own computer. We recommend using Python 3.6 on a Linux distribution like Fedora 27 or Ubuntu 18.04.


### System Packages

First install `python3`, `python3-pip`, `git` and `graphviz` on your computer.

On **Fedora** run:

```bash
sudo dnf install python3-pip git graphviz
```

On **Ubuntu** run:

```bash
sudo apt update
sudo apt install python3-pip git graphviz
```

If the software repositories of your preferred Linux distribution do not provide Python 3.6, consider using [pyenv](https://github.com/pyenv/pyenv). We use [pipenv](https://github.com/pypa/pipenv) for dependency resolution, which automatically makes use of `pyenv` if it is installed properly.

If you are on **Windows** follow the [Windows Setup](windows-setup.md) instructions.
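
If you want to confirm that the system packages are in place before continuing, a small check along the following lines may help. It is only a sketch: the names `REQUIRED_TOOLS`, `missing_tools` and `python_is_recent_enough` are hypothetical, and it assumes that Graphviz is detected through its `dot` executable and that pip is available as `pip3`, which can differ on some systems.

```python
import shutil
import sys

# Executables expected on PATH after installing the system packages.
# "dot" is the Graphviz command-line tool; "pip3" is an assumed command name.
REQUIRED_TOOLS = ["git", "dot", "pip3"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_is_recent_enough(minimum=(3, 6)):
    """Check that the running interpreter is at least the given version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    print("Python version OK:", python_is_recent_enough())
    print("Missing tools:", missing_tools() or "none")
```

Run it with `python3`; an empty list of missing tools suggests the installation step succeeded.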


### Python Packages

After installing the system packages, use the following commands to get up and running.

```bash
# Install pipenv and jupyterlab in user's home directory
pip3 install --user pipenv jupyterlab
# Clone repository via git
git clone https://gitlab.com/deep.TEACHING/educational-materials.git
cd educational-materials
# Create a new virtual environment and install dependencies from Pipfile.lock
pipenv install
# Create an ipython kernel for the virtual environment
pipenv run ipython kernel install --user --name deep_teaching_kernel
# Run Jupyter Lab to navigate through materials
jupyter lab
# When opening a notebook with Jupyter Lab, select deep_teaching_kernel (upper right corner)
```

Using `pipenv` (and `pyenv`) ensures a reproducible setup across platforms and protects us from breaking changes in our Python dependencies.
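
Once a notebook is open, one way to check that it really runs on the kernel created above is to execute a cell like the following. This is a sketch under the assumption that the kernel lives in a virtual environment created by `pipenv`; the function name `in_virtual_environment` is made up for this example.

```python
import sys


def in_virtual_environment():
    """Return True if the running interpreter belongs to a virtual environment.

    Older virtualenv releases set sys.real_prefix, while venv and newer
    virtualenv releases make sys.prefix differ from sys.base_prefix.
    """
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtual_environment())
```

If the interpreter path points to your system Python rather than to the environment, switching the notebook to `deep_teaching_kernel` should resolve it.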


## Developer Documentation

* [Developer Guide](developer-guide.md)
* Python Module: [Deep Teaching Commons](https://gitlab.com/deep.TEACHING/deep-teaching-commons)


### Contributing

We would like to build a community of contributors around this project, to collect and improve teaching materials in the machine learning domain. Pull requests are accepted via https://gitlab.com/deep.TEACHING. Please take a look at the [Developer Guide](developer-guide.md) to ensure a consistent quality of the materials.


## Licensing

All materials collected in this repository are meant to be easily accessible and can be used, shared and edited by everyone.

**Notebooks**: Each Jupyter notebook contains an individual *License* header to honor the authors of the corresponding notebook. All notebooks are distributed under a CC-BY-SA 4.0 license. This applies to an entire notebook, including code cells, but excluding any external media (e.g. images).

**Code**: Code cells included in the notebooks are dual-licensed as CC-BY-SA 4.0 (as stated in the *Notebooks* section) and the MIT license. The MIT license provides an easy way to reuse code in other software projects, where a Creative Commons license would not be suitable.

**Media**: The media directory contains subdirectories named after each content creator. A LICENSE.md is included in each of these subdirectories, to individually honor content creators by name.