-*- mode: org; coding: utf-8; -*-

#+TITLE: GNU Guix containers


* Table of Contents                                                     :TOC:
 - [[#introduction][Introduction]]
 - [[#running-a-container][Running a container]]
   - [[#usage][Usage]]
   - [[#browser][Browser]]
   - [[#running-windows-tools-in-wine][Running Windows tools in Wine]]
 - [[#docker][Docker]]
   - [[#providing-a-usable-docker-container][Providing a usable Docker container]]
   - [[#building-docker-image-of-conda-with-guix][Building Docker Image of Conda with Guix]]
 - [[#common-workflow-language-cwl][Common Workflow Language (CWL)]]

* Introduction

GNU Guix provides an excellent implementation of Linux containers
and compares favourably to other container systems, such as Docker.

In addition to the advantages that Guix offers as a deployment system,
Guix containers share the same software repository as the host, which
makes them extremely light-weight. This is possible because Guix
software is immutable and versioned. And because it is Guix, every
installation is both build and binary reproducible.

See also the official GNU Guix [[https://www.gnu.org/software/guix/manual/html_node/Invoking-guix-environment.html#][documentation]].

* Running a container

Containers can be run as regular users, provided the kernel permits
it (unprivileged user namespaces must be enabled).
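
Whether the kernel allows this can be checked by looking at the
relevant sysctl files. The following is a minimal sketch and not part
of Guix: the function name ~userns_status~ is made up for illustration,
and it assumes that only the Debian/Ubuntu-specific
~kernel.unprivileged_userns_clone~ switch and the mainline
~user.max_user_namespaces~ limit matter on your system.

#+begin_src sh
# Report whether unprivileged user namespaces appear to be enabled.
# Optional argument: alternative root for /proc/sys (handy for testing).
userns_status() {
    sysroot="${1:-/proc/sys}"
    clone="$sysroot/kernel/unprivileged_userns_clone"  # Debian/Ubuntu knob
    max="$sysroot/user/max_user_namespaces"            # mainline kernels
    if [ -r "$clone" ] && [ "$(cat "$clone")" = "0" ]; then
        echo disabled
    elif [ -r "$max" ] && [ "$(cat "$max")" = "0" ]; then
        echo disabled
    elif [ -r "$clone" ] || [ -r "$max" ]; then
        echo enabled
    else
        echo unknown
    fi
}

userns_status
#+end_src

If this prints ~disabled~, the settings have to be changed by an
administrator before containers can be started as a regular user.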

** Usage

Give the names of the packages you want in the container, here emacs
and coreutils (for ls etc.). A Guix container is empty by default:

#+begin_src sh
    guix environment --container --network --ad-hoc emacs coreutils
#+end_src

You can run a command once:

#+begin_src sh
guix environment --ad-hoc --container coreutils -- df
#+end_src

This prints the mounted file systems, including the current working
directory and the store profile:

#+begin_src sh
Filesystem                  1K-blocks      Used Available Use% Mounted on
none                          3956820         0   3956820   0% /dev
udev                            10240         0     10240   0% /dev/tty
tmpfs                           65536         0     65536   0% /dev/shm
/dev/sda1                    38057472  19874684  16226540  56% /export2/izip
/dev/mapper/volume_group-vm 165008748 109556608  47047148  70% /gnu/store/ikkks8c56g56znb5jgl737wkq7w9847c-profile
#+end_src

Note that 'guix environment --ad-hoc --container' mounts your current
working directory (here /export2/izip). If you start from an empty
$HOME/tmp directory, that directory is mounted instead. Any files you
put there persist between container runs.

Note that you can point HOME to any path on startup from the shell:

#+begin_src sh
guix environment --ad-hoc coreutils --container bash -- env HOME=$HOME/tmp/newhome/ bash
#+end_src

which allows you to run specific startup scripts and keep
configurations between runs.
** Browser

Run icecat, a browser, in a container with

#+begin_src sh
    guix environment --container --network --share=/tmp/.X11-unix \
      --ad-hoc icecat
    export DISPLAY=":0.0"
    icecat
#+end_src

You only need to install the package once.
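
Before starting the container it can help to confirm that the X11
socket shared through ~--share=/tmp/.X11-unix~ exists for your display.
The following is a small sketch that assumes the conventional socket
naming ~/tmp/.X11-unix/X<display number>~; ~x11_socket~ is a
hypothetical helper, not a Guix command.

#+begin_src sh
# Print the socket path that belongs to a DISPLAY value such as ":0.0".
x11_socket() {
    d="${1:-${DISPLAY:-:0}}"
    d="${d#*:}"     # drop the optional host part and the colon
    d="${d%%.*}"    # drop the optional screen number
    printf '/tmp/.X11-unix/X%s\n' "$d"
}

sock=$(x11_socket)
if [ -S "$sock" ]; then
    echo "X11 socket found: $sock"
else
    echo "no X11 socket at $sock"
fi
#+end_src

If no socket is found, the value of DISPLAY on the host probably
differs from the ":0.0" used in the examples.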

** Running Windows tools in Wine

Wine can also be run in a container:

#+begin_src sh
    guix environment --container --network --share=/tmp/.X11-unix \
      --ad-hoc wine
    export DISPLAY=":0.0"
    wine explorer
#+end_src

which is great. I used to have to use VirtualBox and such to run the
occasional Windows tool. Now it runs in a container with access to
the local file system.

To run the tool in one go and set the HOME dir:

#+begin_src sh
guix environment --network --expose=/mnt/cdrom --share=/tmp/.X11-unix --container --ad-hoc wine vim bash coreutils -- env HOME=`pwd` DISPLAY=":0.0" wine explorer
#+end_src

* Docker

Guix has its own containers using native Linux support, but you can
also run Guix in Docker and distribute software that way. One
interesting thing you can do is run ~guix pack~, which can create a
Docker image of a package with all its dependencies; see this [[https://www.gnu.org/software/guix/news/creating-bundles-with-guix-pack.html][description]].

** Providing a usable Docker container

*** Install the package in the main /gnu/store

For a paper we made a compilation of bioinformatics software and put
it all in one GNU Guix [[https://gitlab.com/genenetwork/guix-bioinformatics/blob/master/gn/packages/book_evolutionary_genomics.scm#L113][package]] named book-evolutionary-genomics.  I
can install it using a local Guix checkout at commit
cc14a90fd3ce34a371175de610f9befcb2dad52b:

#+begin_src shell
env GUIX_PACKAGE_PATH=../guix-bioinformatics \
  ./pre-inst-env guix package -p ~/opt/book-evolutionary-genomics \
  --no-grafts -i book-evolutionary-genomics \
  --substitute-urls="http://guix.genenetwork.org https://berlin.guixsd.org https://mirror.hydra.gnu.org"
#+end_src

resulting in a totally reproducible package.

*** Try things in a Guix container

Now we want to isolate these tools in a container.  To run them
inside a Guix container, use ~guix environment~ as in the earlier
examples:

#+begin_src shell
env GUIX_PACKAGE_PATH=../guix-bioinformatics/ \
  ./pre-inst-env guix environment --no-grafts --ad-hoc \
  --substitute-urls="http://guix.genenetwork.org https://berlin.guixsd.org https://mirror.hydra.gnu.org" \
  coreutils book-evolutionary-genomics vim screen \
  --container bash -- bash
#+end_src

This starts a bash shell in a clean container. For the book we have created
some scripts in the profile, which can be found through the GUIX_ENVIRONMENT variable:

: cd $GUIX_ENVIRONMENT/share/book-evolutionary-genomics

The bin directory is on the PATH already, but for some scripts you may
want to create /usr/bin pointing to $GUIX_ENVIRONMENT/bin

: mkdir /usr
: ln -s $GUIX_ENVIRONMENT/bin /usr/bin

Note that /gnu/store is immutable and can therefore be shared with the
main system. This makes GNU Guix containers really small and fast.

*** Docker image

With GNU Guix you can create a Docker image without actually installing Docker(!)

#+begin_src shell
env GUIX_PACKAGE_PATH=../guix-bioinformatics/ \
  ./pre-inst-env guix pack -f docker --no-grafts \
  -S /usr/bin=/bin -S /etc/profile=/etc/profile \
  -S /book-evolutionary-genomics=/share/book-evolutionary-genomics \
  coreutils book-evolutionary-genomics bash vim
#+end_src

Note the -S switch, which creates symlinks into the profile: here
/usr/bin, /etc/profile and /book-evolutionary-genomics.
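
Each -S argument has the form LINK=TARGET, where LINK is the symlink
created in the image and TARGET is a path inside the profile. The
sketch below only prints how such a specification is split up. The
helper ~describe_symlink~ and the ~<profile>~ placeholder are made up
for illustration and are not part of ~guix pack~.

#+begin_src sh
# Show which link a -S specification creates and where it points.
describe_symlink() {
    spec="$1"
    link="${spec%%=*}"      # text before the first "="
    target="${spec#*=}"     # text after the first "="
    printf '%s -> <profile>/%s\n' "$link" "${target#/}"
}

describe_symlink /usr/bin=/bin
describe_symlink /etc/profile=/etc/profile
#+end_src

For the two specifications above this prints
~/usr/bin -> <profile>/bin~ and ~/etc/profile -> <profile>/etc/profile~.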

*** Run Docker

This produces a file which can be loaded into Docker.

Docker is part of Guix too:

#+BEGIN_SRC sh
guix package -i docker containerd docker-cli -p ~/opt/docker
source ~/opt/docker/etc/profile
#+END_SRC

Start the ~dockerd~ as ~root~ and make sure permissions are set

#+BEGIN_SRC sh
groupadd docker
usermod -aG docker ${USER}
#+END_SRC

: docker load --input /gnu/store/0p1ianjqqzbk1rr9rycaqcjdr2s13mcj-docker-pack.tar.gz
: docker images
:   REPOSITORY          TAG                                IMAGE ID            CREATED             SIZE
:   profile             425c1ignnjixxzwdwdr5anywnq9mg50m   121f9cca6c55        47 years ago        1.43 GB

Now you should see the image id and you can run

: docker run 121f9cca6c55 /usr/bin/ruby --version
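
Instead of copying the image ID by hand, it can be picked out of the
~docker images~ listing. A minimal sketch, assuming the column layout
shown above (repository first, image ID third); ~image_id_for~ is a
hypothetical helper name:

#+begin_src sh
# Print the image ID of the first image whose repository is $1.
# Reads the output of `docker images` on standard input.
image_id_for() {
    awk -v repo="$1" 'NR > 1 && $1 == repo { print $3; exit }'
}

# Usage (needs a running Docker daemon):
#   id=$(docker images | image_id_for profile)
#   docker run "$id" /usr/bin/ruby --version
#+end_src

The header line is skipped with ~NR > 1~, so a repository that happens
to be called REPOSITORY is not matched by accident.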

Find the profile

: docker run 121f9cca6c55 /usr/bin/ls /usr/bin -l

Read the profile settings

: docker run 121f9cca6c55 cat /gnu/store/425c1ignnjixxzwdwdr5anywnq9mg50m-profile/etc/profile

But there is an easier way because we created the symlink earlier

: docker run 121f9cca6c55 cat /etc/profile

Run bioruby

: docker run 121f9cca6c55 bash -c "env GEM_PATH=/gnu/store/425c1ignnjixxzwdwdr5anywnq9mg50m-profile//lib/ruby/gems/2.4.0 /gnu/store/425c1ignnjixxzwdwdr5anywnq9mg50m-profile/share/book-evolutionary-genomics/src/bioruby/DNAtranslate.rb"

with input file

: time docker run 121f9cca6c55 bash -c "env GEM_PATH=/gnu/store/425c1ignnjixxzwdwdr5anywnq9mg50m-profile//lib/ruby/gems/2.4.0 /gnu/store/425c1ignnjixxzwdwdr5anywnq9mg50m-profile/share/book-evolutionary-genomics/src/bioruby/DNAtranslate.rb /gnu/store/425c1ignnjixxzwdwdr5anywnq9mg50m-profile/share/book-evolutionary-genomics/test/data/test-dna.fa"

or the easy way since we created the links

: time docker run 121f9cca6c55 \
:   bash -c "source /etc/profile ; cd /book-evolutionary-genomics ; src/bioruby/DNAtranslate.rb test/data/test-dna.fa"

** Building Docker Image of Conda with Guix

*** Build the conda Archive

To build the pack from guix, the following command was run:

#+begin_src sh
./pre-inst-env guix pack -S /opt/gnu/bin=/bin conda
#+end_src

This builds an archive with `conda`. The package will be named something like
`/gnu/store/y2gylr1nz7qrj0p1xwfcg4n8pm0p4wgl-tarball-pack.tar.gz`

The `./pre-inst-env` portion can be dropped if you have a newer version of guix
that comes with conda in its list of packages. You can find out by running the
following command:

#+begin_src sh
guix package --search=conda
#+end_src

and looking through the list to see if there is a package named conda.
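
The search matches descriptions as well as names, so the list can be
long. The check can be scripted by looking for an exact ~name:~ field
in the search output, which is printed in recutils format. This is a
sketch only; ~has_package~ is a hypothetical helper, not a Guix command.

#+begin_src sh
# Succeed if the search output on standard input contains a package
# whose name is exactly $1.
has_package() {
    grep -qxF "name: $1"
}

# Usage (needs guix on the PATH):
#   guix package --search=conda | has_package conda && echo "conda is packaged"
#+end_src

Matching the whole line avoids false hits on packages such as
python-conda that merely contain the word.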

*** Bootstrapping the Images

From this step on, new images are bootstrapped from a base image.
The base image chosen was the ubuntu image. You can get it with:

#+begin_src sh
docker pull ubuntu
#+end_src

The steps that follow will be somewhat similar, with each image building upon
the image before it.

The files created here can be found
[[https://github.com/fredmanglis/guix-conda-docker/][in this repository]].

The first image to be built only contains conda, and it was initialised with a
new environment called `default-env`. This was done by writing a Dockerfile with
the following content:

#+begin_src dockerfile
FROM ubuntu:latest
COPY /gnu/store/y2gylr1nz7qrj0p1xwfcg4n8pm0p4wgl-tarball-pack.tar.gz /tmp/conda-pack.tar.gz
RUN tar -xzf /tmp/conda-pack.tar.gz && rm -f /tmp/conda-pack.tar.gz
RUN /opt/gnu/bin/conda create --name default-env
#+end_src

This file was saved as `Dockerfile.conda` and then the image was built by
running

#+begin_src sh
docker build -t fredmanglis/guix-conda-plain:latest -f Dockerfile.conda .
#+end_src

Be careful not to miss the dot at the end of the command. This command creates a
new image from the base image named in the FROM line of the Dockerfile and tags
the new image with the name fredmanglis/guix-conda-plain:latest.
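
Docker resolves COPY sources relative to the build context (the
directory given by the final dot), so a tarball in /gnu/store is
normally not reachable from there. One way around this is to copy the
pack into the build directory first. The sketch below assumes the COPY
line is changed to ~COPY conda-pack.tar.gz /tmp/conda-pack.tar.gz~;
the helper ~stage_pack~ and the file name ~conda-pack.tar.gz~ inside
the context are choices made for this example.

#+begin_src sh
# Copy a pack tarball into the Docker build context under a fixed name.
#   $1 = path of the tarball produced by `guix pack`
#   $2 = build context directory (defaults to the current directory)
stage_pack() {
    pack="$1"
    context="${2:-.}"
    mkdir -p "$context" &&
        cp "$pack" "$context/conda-pack.tar.gz"
}

# Usage:
#   stage_pack /gnu/store/y2gylr1nz7qrj0p1xwfcg4n8pm0p4wgl-tarball-pack.tar.gz .
#   docker build -t fredmanglis/guix-conda-plain:latest -f Dockerfile.conda .
#+end_src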

This new image is then used to bootstrap the next, by first creating a file
`Dockerfile.bioconda` and entering the following content into it:

#+begin_src dockerfile
FROM fredmanglis/guix-conda-plain:latest

RUN conda config --add channels r
RUN conda config --add channels defaults
RUN conda config --add channels conda-forge
RUN conda config --add channels bioconda
#+end_src

This file instructs docker to bootstrap the new image from the image named
fredmanglis/guix-conda-plain:latest and then run the commands to add the
channels required to access the bioconda packages.

The new image, with bioconda initialised, is then created by running

#+begin_src sh
docker build -t fredmanglis/guix-bioconda:latest -f Dockerfile.bioconda .
#+end_src

Be careful not to miss the dot at the end of the command.

The next image to build contains the sambamba package from the bioconda channel.
We start by defining the image in a file, `Dockerfile.sambamba` which contains:

#+begin_src dockerfile
FROM fredmanglis/guix-bioconda:latest
RUN /opt/gnu/bin/conda install --yes --name default-env sambamba
#+end_src

As can be seen, the package is installed in the environment `default-env`
defined while bootstrapping the image with conda only. This new image is
built with the command:

#+begin_src sh
docker build -t fredmanglis/guix-sambamba:latest -f Dockerfile.sambamba .
#+end_src

Do not miss the dot at the end of the command.
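
Since each image builds on the previous one, the three builds have to
run in order. They can be wrapped in one function, sketched here under
the assumption that the three Dockerfiles sit in the current directory
under the names used above. ~build_chain~ and the ~DOCKER~ variable are
conveniences introduced for this example; setting ~DOCKER=echo~ gives a
dry run that only prints the arguments.

#+begin_src sh
# Program used to run the builds; override with DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# Build the conda, bioconda and sambamba images in dependency order,
# stopping at the first build that fails.
build_chain() {
    for pair in conda:guix-conda-plain \
                bioconda:guix-bioconda \
                sambamba:guix-sambamba; do
        suffix="${pair%%:*}"   # Dockerfile suffix
        image="${pair#*:}"     # image name
        "$DOCKER" build -t "fredmanglis/$image:latest" \
            -f "Dockerfile.$suffix" . || return 1
    done
}

# Usage:
#   build_chain
#+end_src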

*** Publishing the Images

The images built in the processes above are all available at
https://hub.docker.com/r/fredmanglis/

To publish them, docker's push command was used, as follows:

#+begin_src sh
docker push fredmanglis/guix-conda-plain:latest && \
docker push fredmanglis/guix-bioconda:latest  && \
docker push fredmanglis/guix-sambamba:latest
#+end_src

These are really three separate commands, chained so that the later
commands only run if the ones before them ran successfully. This ensures that the derived
images are only uploaded after the images they are based on have been
successfully uploaded.
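
The behaviour of such a chain can be tried without Docker. In this
sketch ~fake_push~ is a stand-in for ~docker push~ that fails for an
image called "broken"; both function names are invented for the
demonstration.

#+begin_src sh
# Stand-in for `docker push`: succeeds unless the image is "broken".
fake_push() {
    echo "pushing $1"
    [ "$1" != "broken" ]
}

# Chain three pushes; the third is skipped because the second fails.
demo() {
    if fake_push base && fake_push broken && fake_push derived; then
        echo "all pushed"
    else
        echo "stopped after the first failure"
    fi
}

demo
#+end_src

The output lists "pushing base" and "pushing broken" followed by
"stopped after the first failure"; the derived image is never pushed.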

*** Get the Images

To get any of the images, use a command of the form:

#+begin_src sh
docker pull fredmanglis/<img-name>:<img-tag>
#+end_src

replacing <img-name> and <img-tag> with the actual image name and tag. For
example, to get the image with bioconda already set up, do:

#+begin_src sh
docker pull fredmanglis/guix-bioconda:latest
#+end_src

*** Run Installed Applications

To run the applications installed, we need to set up the path correctly. To do
this, we make use of docker's --env-file option, in something similar to the
following:

#+begin_src bash
docker run --env-file=<file-with-env-vars> img-to-run:img-tag <command-to-run>
#+end_src

The <file-with-env-vars> can be found [[https://github.com/fredmanglis/guix-conda-docker/][here]].
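
An env file is a plain list of VAR=value lines. Docker takes the values
literally and does not expand shell variables in them. The sketch below
writes a minimal file that only puts /opt/gnu/bin (the symlink created
with -S earlier) in front of the usual system directories. This content
is an assumption for illustration; the file in the repository may set
more, for example the bin directory of `default-env`. ~write_env_file~
is a hypothetical helper.

#+begin_src sh
# Write a minimal env file for `docker run --env-file`.
#   $1 = output file name (defaults to environment_variables)
write_env_file() {
    out="${1:-environment_variables}"
    cat > "$out" <<'EOF'
# One VAR=value pair per line; values are used literally.
PATH=/opt/gnu/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
EOF
}

# Usage:
#   write_env_file environment_variables
#   docker run --env-file=environment_variables fredmanglis/guix-conda-plain conda --version
#+end_src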

Now you can proceed to run a command, for example:

#+begin_src sh
docker run --env-file=environment_variables --volume /tmp/sample:/data \
fredmanglis/guix-sambamba bash -c "sambamba view /data/test.bam"
#+end_src

The `--volume` option mounts a specific directory into the docker
container that is created, so that the data is available to the running
commands.

* Common Workflow Language (CWL)

CWL can use Docker images to pull containers, for example for [[https://github.com/common-workflow-library/bio-cwl-tools/blob/61ffac1862822f08dc20b6f8e2f22634b986b0bc/odgi/odgi_build.cwl][ODGI]]. CWL is
agnostic to how these containers are sourced.

For [[http://covid19.genenetwork.org/][COVID-19 PubSeq]], [[https://github.com/vgteam/odgi][ODGI]] was required in a CWL [[https://github.com/arvados/bh20-seq-resource/blob/master/workflows/pangenome-generate/odgi_to_rdf.cwl][module]] to [[https://github.com/arvados/bh20-seq-resource/commit/618f956eb03c6a6ad1cc16efc931f55b0dce83e1][build]] a graph
and generate RDF. The CWL to build the graph is [[https://github.com/arvados/bh20-seq-resource/blob/master/workflows/pangenome-generate/odgi-build.cwl][here]]. The quickest way
to get an up-to-date working Docker container was by using GNU
Guix. ODGI is currently maintained and packaged in an external
[[https://github.com/ekg/guix-genomics/blob/16b272722013a101067117739f8c4de91390f49a/odgi.scm#L1][guix-genomics]] repo by Erik Garrison. It is simply a matter of adding a
channel or using ~GUIX_PACKAGE_PATH~. After a git clone of
guix-genomics we build odgi in a [[./PROFILE.org][profile]]:

#+BEGIN_SRC sh
env GUIX_PACKAGE_PATH=~/guix-genomics ~/.config/guix/current/bin/guix package -i odgi -p ~/opt/vgtools
#+END_SRC

and a quick test shows

#+BEGIN_SRC sh
tux01:~$ ~/opt/vgtools/bin/odgi
odgi: dynamic succinct variation graph tool, version #<procedure version ()>

usage: /home/pjotr/opt/vgtools/bin/odgi <command> [options]

main mapping and calling pipeline:
  -- build         build dynamic succinct variation graph
  -- stats         describe the graph and its path relationships
  -- sort          sort a variation graph
  -- view          projection of graphs into other formats
  -- kmers         process and dump the kmers of the graph
  -- unitig        emit the unitigs of the graph
  -- viz           visualize the graph
  -- paths         interrogation and manipulation of paths
  -- prune         prune the graph based on coverage or topological complexity
  -- unchop        merge unitigs into single nodes
  -- normalize     compact unitigs and simplify redundant furcations
  -- subset        extract subsets of the graph as defined by query criteria
  -- bin           bin path information across the graph
  -- matrix        graph topology in sparse matrix form
  -- chop          chop long nodes into short ones while preserving topology
  -- groom         resolve spurious inverting links
  -- layout        use SGD to make 2D layouts of the graph
  -- flatten       project the graph sequence and paths into FASTA and BED
  -- break         break cycles in the graph
  -- pathindex     create a path index for a given graph
  -- panpos        get the pangenome position for a given path and nucleotide position (1-based)
  -- server        start a HTTP server with a given index file to query a pangenome position
  -- version       get the git version of odgi
  -- test          run unit tests

For more commands, type `odgi help`.
#+END_SRC

Now we can try running it in a Guix container with

#+BEGIN_SRC sh
env GUIX_PACKAGE_PATH=~/guix-genomics ~/.config/guix/current/bin/guix environment -C --ad-hoc odgi
odgi
#+END_SRC

Yes, that works too. Next we package a Docker image:

#+BEGIN_SRC sh
env GUIX_PACKAGE_PATH=~/guix-genomics ~/.config/guix/current/bin/guix pack -f docker odgi
#+END_SRC

which created a container in
~/gnu/store/d68qyyvqchlgq3lzh3qgmlg9k42c9yas-docker-pack.tar.gz~ of
size 30MB. Tiny!

After installing docker (part of GNU Guix) you can test

#+BEGIN_SRC sh
docker load --input d68qyyvqchlgq3lzh3qgmlg9k42c9yas-docker-pack.tar.gz
docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
odgi                latest              5351dc5d4fc8        50 years ago        102MB

docker run 5351dc5d4fc8 odgi
  odgi: dynamic succinct variation graph tool, version #<procedure version ()>
  etc.
#+END_SRC

It works! The only request that came back was to add bash and coreutils. So I made
a slightly larger image, also putting all binaries in the /bin path so that
/bin/sh and /bin/odgi work:

#+BEGIN_SRC sh
env GUIX_PACKAGE_PATH=~/guix-genomics ~/.config/guix/current/bin/guix pack -f docker odgi bash coreutils binutils --substitute-urls="http://guix.genenetwork.org https://berlin.guixsd.org https://ci.guix.gnu.org https://mirror.hydra.gnu.org"  -S /bin=bin
#+END_SRC

It runs, for example

: docker run 0dcb42977ec2 odgi
: docker run 0dcb42977ec2 sh
: docker run 0dcb42977ec2 /bin/sh
: docker run 0dcb42977ec2 /bin/bash -c ls

Next we make it available for general use. I pushed it to IPFS
for [[http://ipfs.genenetwork.org/ipfs/QmZmjG6Yc5tKwMATetZsnqReTxMtQ75RcsqEc3vYVAPLDk/odgi][sharing]].