# Dataset generation

Under the `dataset/` directory are the tools for generating the datasets used
to train and test the neural networks. All of the included commands assume you
are working from the `dataset/` directory.

## Build

```
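# Build the simulation image; the run commands below refer to it by the tag "gazebo".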
docker build -t gazebo .
```

## Run

### Run serial batch job (best method so far)

This runs the simulation serially with a simple `for` loop. It runs the
simulation as many times as specified in the `batch_size` environment variable.
Each simulation run produces around 30 datasets.

```
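# Number of times to run the simulation; each run yields around 30 datasets.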
batch_size=1000
dataset_output_dir=~/Downloads/dataset
for i in $(seq $batch_size); do docker run --rm -v $dataset_output_dir:/mnt/dataset gazebo; done
```

### Run headless

This runs the simulation once, producing around 30 datasets.

```
dataset_output_dir=~/Downloads/dataset
docker run --rm -v $dataset_output_dir:/mnt/dataset gazebo
```

### Run with GUI

This runs the simulation once, with a GUI. Ensure that `xhost` permissions are
set so that the docker container can access the host's X-server.

Note that this tends to be a little buggy if the simulation is running at full
speed. You can reduce the speed by reducing the value in the `<max_step_size>`
tag found in the `include/lit_world.world` file.
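
For reference, the relevant physics block might look something like this (a
hypothetical sketch, not the actual contents of `include/lit_world.world`;
`0.001` is only the usual Gazebo default):

```
<physics type="ode">
  <!-- Lowering this value slows the simulation down. -->
  <max_step_size>0.001</max_step_size>
  <real_time_update_rate>1000</real_time_update_rate>
</physics>
```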

```
xhost +

dataset_output_dir=~/Downloads/dataset

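# The first "gazebo" is the image name; the rest is the command run inside the container.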
docker run -it --rm \
    -e DISPLAY=$DISPLAY \
    -v $dataset_output_dir:/mnt/dataset \
    -v /tmp/.X11-unix:/tmp/.X11-unix:ro \
    gazebo gazebo --verbose lit_world.world
```

### Run parallel batch job

This runs the simulation in parallel, as many times as specified by the
`batch_size` environment variable. Strangely, despite running in parallel and
using more CPU than the serial batch run, this method is no faster.
Consequently, I do not recommend it over the serial method; I am leaving it
here in case whatever misuse I'm applying can be removed later.

This method requires the GNU Parallel application.

```
batch_size=1000
dataset_output_dir=~/Downloads/dataset
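# -j-2 tells GNU Parallel to run two fewer jobs than there are CPU cores.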
seq $batch_size | parallel -j-2 "docker run --rm -v $dataset_output_dir:/mnt/dataset --name=plank-drop-container-{} gazebo"
```

## Other

To avoid having to re-download every model for each Docker container, the
`download_models.sh` script is included. It downloads a copy of the public
Gazebo model database in such a way that the models are carried into each new
Docker container.

It will download the models specified in the `extra_models.txt` file.
Run it like this:

```
./download_models.sh
```
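
The script itself is not reproduced here, but the idea boils down to something
like the following sketch (hypothetical: the model-database URL layout, the
`models/` destination, and the one-name-per-line format of `extra_models.txt`
are assumptions, not confirmed details of the actual script):

```
#!/bin/sh
# Hypothetical sketch: fetch each model named in extra_models.txt (one name
# per line) from the public Gazebo model database and unpack it locally.
while read -r model; do
    mkdir -p "models/$model"
    wget -qO- "http://models.gazebosim.org/$model/model.tar.gz" \
        | tar -xz -C "models/$model"
done < extra_models.txt
```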