2017-11-02-automating-boring-git-operations-gitlab-ci.html.md 17.4 KB
Newer Older
Rebecca Dodd's avatar
Rebecca Dodd committed
1 2 3 4 5
---
title: "GitBot  automating boring Git operations with CI"
author: Kristian Larsson
author_gitlab:
author_twitter: plajjan
Rebecca Dodd's avatar
Rebecca Dodd committed
6
categories: engineering
Rebecca Dodd's avatar
Rebecca Dodd committed
7 8 9
guest: true
image_title: '/images/blogimages/gitbot-automate-git-operations.jpg'
description: "Guest author Kristian Larsson shares how he automates some common Git operations, like rebase, using GitLab CI."
William Chia's avatar
William Chia committed
10
tags: CI/CD, user stories, git
Rebecca Dodd's avatar
Rebecca Dodd committed
11 12 13
---

Git is super useful for anyone doing a bit of development work or just trying to
Rebecca Dodd's avatar
Rebecca Dodd committed
14 15
keep track of a bunch of text files. However, as your project grows you might
find yourself doing lots of boring repetitive work just around Git itself. At
Rebecca Dodd's avatar
Rebecca Dodd committed
16
least that’s what happened to me and so I automated some boring Git stuff using our
Rebecca Dodd's avatar
Rebecca Dodd committed
17 18 19 20
CI system.

<!-- more -->

Rebecca Dodd's avatar
Rebecca Dodd committed
21
There are probably all sorts of use cases for automating various Git operations
Rebecca Dodd's avatar
Rebecca Dodd committed
22
but I’ll talk about a few that I’ve encountered. We’re using GitLab and [GitLab
Rebecca Dodd's avatar
Rebecca Dodd committed
23
CI](/features/gitlab-ci-cd/) so that’s what my examples
Rebecca Dodd's avatar
Rebecca Dodd committed
24
will include, but most of the concepts should apply to other systems as well.
Rebecca Dodd's avatar
Rebecca Dodd committed
25 26 27

## Automatic rebase

Rebecca Dodd's avatar
Rebecca Dodd committed
28 29 30 31 32 33 34 35
We have some Git repos with code that we receive from vendors, who we can think
of as our `upstream`. We don’t actually share a Git repo with the vendor but
rather we get a tar ball every now and then. The tar ball is extracted into a
Git repository, on the `master` branch which thus tracks the software as it is
received from upstream. In a perfect world the software we receive would be
feature complete and bug free and so we would be done, but that’s usually not
the case. We do find bugs and if they are blocking we might decide to implement
a patch to fix them ourselves. The same is true for new features where we might
Rebecca Dodd's avatar
Rebecca Dodd committed
36 37
not want to wait for the vendor to implement it.

Rebecca Dodd's avatar
Rebecca Dodd committed
38 39
The result is that we have some local patches to apply. We commit such patches
to a separate branch, commonly named `ts` (for TeraStream), to keep them
Rebecca Dodd's avatar
Rebecca Dodd committed
40
separate from the official software. Whenever a new software version is released,
Rebecca Dodd's avatar
Rebecca Dodd committed
41
we extract its content to `master` and then rebase our `ts` branch onto `master`
Rebecca Dodd's avatar
Rebecca Dodd committed
42 43
so we get all the new official features together with our patches. Once we’ve
implemented something we usually send it upstream to the vendor for inclusion.
Rebecca Dodd's avatar
Rebecca Dodd committed
44 45 46
Sometimes they include our patches verbatim so that the next version of the code
will include our exact patch, in which case a rebase will simply skip our patch.
Other times there are slight or major (it might be a completely different design)
Rebecca Dodd's avatar
Rebecca Dodd committed
47
changes to the patch and then someone typically needs to sort out the patches
Rebecca Dodd's avatar
Rebecca Dodd committed
48 49 50
manually. Mostly though, rebasing works just fine and we don’t end up with conflicts.

Now, this whole rebasing process gets a tad boring and repetitive after a while,
Rebecca Dodd's avatar
Rebecca Dodd committed
51
especially considering we have a dozen of repositories with the setup described
Rebecca Dodd's avatar
Rebecca Dodd committed
52 53 54 55 56 57 58 59
above. What I recently did was to automate this using our CI system.

The workflow thus looks like:

- human extracts zip file, git add + git commit on master + git push
- CI runs for `master` branch
   - clones a copy of itself into a new working directory
   - checks out `ts` branch (the one with our patches) in working directory
Rebecca Dodd's avatar
Rebecca Dodd committed
60 61
   - rebases `ts` onto `master`
   - push `ts` back to `origin`
Rebecca Dodd's avatar
Rebecca Dodd committed
62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105
- this will now trigger a CI build for the `ts` branch
- when CI runs for the `ts` branch, it will compile, test and save the binary output as “build artifacts”, which can be included in other repositories
- GitLab CI, which is what we use, has a CI_PIPELINE_ID that we use to version built container images or artifacts

To do this, all you need is a few lines in a .gitlab-ci.yml file, essentially;

```
stages:
  - build
  - git-robot

... build jobs ...

git-rebase-ts:
  stage: git-robot
  only:
    - master
  allow_failure: true
  before_script:
    - 'which ssh-agent || ( apt-get update -y && apt-get install openssh-client -y )'
    - eval $(ssh-agent -s)
    - ssh-add <(echo "$GIT_SSH_PRIV_KEY")
    - git config --global user.email "kll@dev.terastrm.net"
    - git config --global user.name "Mr. Robot"
    - mkdir -p ~/.ssh
    - cat gitlab-known-hosts >> ~/.ssh/known_hosts
  script:
    - git clone git@gitlab.dev.terastrm.net:${CI_PROJECT_PATH}.git
    - cd ${CI_PROJECT_NAME}
    - git checkout ts
    - git rebase master
    - git push --force origin ts
  ```

We’ll go through it a few lines at a time. Some basic knowledge about GitLab CI is assumed.

This first part lists the stages of our pipeline.

```
stages:
  - build
  - git-robot
  ```

Rebecca Dodd's avatar
Rebecca Dodd committed
106
We have two stages, first the `build` stage, which does whatever you want it to
Rebecca Dodd's avatar
Rebecca Dodd committed
107 108 109 110 111 112 113 114 115 116 117 118 119
do (ours compiles stuff, runs a few unit tests and packages it all up), then the
git-robot stage which is where we perform the rebase.

Then there’s:

```
git-rebase-ts:
  stage: git-robot
  only:
    - master
  allow_failure: true
  ```

Rebecca Dodd's avatar
Rebecca Dodd committed
120
We define the stage in which we run followed by the only statement which limits
Rebecca Dodd's avatar
Rebecca Dodd committed
121 122 123 124
CI jobs to run only on the specified branch(es), in this case `master`.

`allow_failure` simply allows the CI job to fail but still passing the pipeline.

Rebecca Dodd's avatar
Rebecca Dodd committed
125
Since we are going to clone a copy of ourselves (the repository checked out in
126
CI) we need SSH and SSH keys set up. We’ll use ssh-agent with a password-less key
Rebecca Dodd's avatar
Rebecca Dodd committed
127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154
to authenticate. Generate a key using ssh-keygen, for example:

```
ssh-keygen

kll@machine ~ $ ssh-keygen -f foo
Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in foo.
Your public key has been saved in foo.pub.
The key fingerprint is:
SHA256:6s15MZJ1/kUsDU/PF2WwRGA963m6ZSwHvEJJdsRzmaA kll@machine
The key's randomart image is:
+---[RSA 2048]----+
|            o**.*|
|           ..o**o|
|           Eo o%o|
|          .o.+o O|
|        So oo.o+.|
|       .o o.. o+o|
|      .  . o..o+=|
|     . o ..  .o= |
|      . +.    .. |
+----[SHA256]-----+
kll@machine ~ $
```

Rebecca Dodd's avatar
Rebecca Dodd committed
155
Add the public key as a deploy key under Project Settings
156
<i class="fas fa-arrow-right" aria-hidden="true"></i> Repository <i class="fas fa-arrow-right" aria-hidden="true"></i>
Rebecca Dodd's avatar
Rebecca Dodd committed
157 158
Deploy Keys. Make sure you enable write access or you won’t be able to have your
Git robot push commits. We then need to hand over the private key so that it can
Rebecca Dodd's avatar
Rebecca Dodd committed
159 160
be accessed from within the CI job. We’ll use a secret environment variable for
that, which you can define under Project Settings
161
<i class="fas fa-arrow-right" aria-hidden="true"></i> Pipelines <i class="fas fa-arrow-right" aria-hidden="true"></i>
Rebecca Dodd's avatar
Rebecca Dodd committed
162
Environment variables). I’ll use the environment variable GIT_SSH_PRIV_KEY for this.
Rebecca Dodd's avatar
Rebecca Dodd committed
163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178

Next part is the before_script:

```
  before_script:
    - 'which ssh-agent || ( apt-get update -y && apt-get install openssh-client -y )'
    - eval $(ssh-agent -s)
    - ssh-add <(echo "$GIT_SSH_PRIV_KEY")
    - git config --global user.email "kll@dev.terastrm.net"
    - git config --global user.name "Mr. Robot"
    - mkdir -p ~/.ssh
    - cat gitlab-known-hosts >> ~/.ssh/known_hosts
  ```

First ssh-agent is installed if it isn’t already. We then start up ssh-agent and
add the key stored in the environment variable GIT_SSH_PRIV_KEY (which we set up
Rebecca Dodd's avatar
Rebecca Dodd committed
179 180
previously). The Git user information is set and we finally create .ssh and add
the known host information about our GitLab server to our known_hosts file. You
Rebecca Dodd's avatar
Rebecca Dodd committed
181 182 183 184 185 186
can generate the gitlab-known-hosts file using the following command:

```
ssh-keyscan my-gitlab-machine >> gitlab-known-hosts
```

Rebecca Dodd's avatar
Rebecca Dodd committed
187 188 189 190
As the name implies, the before_script is run before the main `script` part and
the ssh-agent we started in the before_script will also continue to run for the
duration of the job. The ssh-agent information is stored in some environment
variables which are carried across from the before_script into the main script,
Rebecca Dodd's avatar
Rebecca Dodd committed
191 192 193
enabling it to work. It’s also possible to put this SSH setup in the main script,
I just thought it looked cleaner splitting it up between before_script and script.
Note however that it appears that after_script behaves differently so while it’s
Rebecca Dodd's avatar
Rebecca Dodd committed
194 195
possible to pass environment vars from before_script to script, they do not
appear to be passed to after_script. Thus, if you want to do Git magic in the
Rebecca Dodd's avatar
Rebecca Dodd committed
196 197
after_script you also need to perform the SSH setup in the after_script.

Rebecca Dodd's avatar
Rebecca Dodd committed
198 199
This brings us to the main script. In GitLab CI we already have a checked-out
clone of our project but that was automatically checked out by the CI system
Rebecca Dodd's avatar
Rebecca Dodd committed
200
through the use of magic (it actually happens in a container previous to the one
Rebecca Dodd's avatar
Rebecca Dodd committed
201 202 203
we are operating in, that has some special credentials) so we can’t really use
it, besides, checking out other branches and stuff would be really weird as it
disrupts the code we are using to do this, since that’s available in the Git
Rebecca Dodd's avatar
Rebecca Dodd committed
204 205
repository that’s checked out. It’s all rather meta.

Rebecca Dodd's avatar
Rebecca Dodd committed
206
Anyway, we’ll be checking out a new Git repository where we’ll do our work, then
Rebecca Dodd's avatar
Rebecca Dodd committed
207
change the current directory to the newly checked-out repository, after which
Rebecca Dodd's avatar
Rebecca Dodd committed
208
we’ll check out the `ts` branch, do the rebase and push it back to the origin remote.
Rebecca Dodd's avatar
Rebecca Dodd committed
209 210 211 212 213 214 215 216 217

```
    - git clone git@gitlab.dev.terastrm.net:${CI_PROJECT_PATH}.git
    - cd ${CI_PROJECT_NAME}
    - git checkout ts
    - git rebase master
    - git push --force origin ts
  ```

Rebecca Dodd's avatar
Rebecca Dodd committed
218 219 220
… and that’s it. We’ve now automated the rebasing of a branch. Occasionally it
will fail due to problems rebasing (most commonly merge conflicts) but then you
can just step in and do the above steps manually and be interactively prompted
Rebecca Dodd's avatar
Rebecca Dodd committed
221 222 223 224
on how to handle conflicts.

## Automatic merge requests

Rebecca Dodd's avatar
Rebecca Dodd committed
225 226
All the repositories I mentioned in the previous section are NEDs, a form of
driver for how to communicate with a certain type of device, for Cisco NSO (a
Rebecca Dodd's avatar
Rebecca Dodd committed
227
network orchestration system). We package up Cisco NSO, together with these NEDs
Rebecca Dodd's avatar
Rebecca Dodd committed
228
and our own service code, in a container image. The build of that image is
Rebecca Dodd's avatar
Rebecca Dodd committed
229
performed in CI and we use a repository called `nso-ts` to control that work.
Rebecca Dodd's avatar
Rebecca Dodd committed
230 231

The NEDs are compiled in CI from their own repository and the binaries are saved
Rebecca Dodd's avatar
Rebecca Dodd committed
232
as build artifacts. Those artifacts can then be pulled in the CI build of `nso-ts`.
Rebecca Dodd's avatar
Rebecca Dodd committed
233
The reference to which artifact to include is the name of the NED as well as the
Rebecca Dodd's avatar
Rebecca Dodd committed
234 235
build version. The version number of the NED is nothing more than the pipeline
id (which you’ll access in CI as ${CI_PIPELINE_ID}) and by including a specific
Rebecca Dodd's avatar
Rebecca Dodd committed
236 237 238 239 240
version of the NED, rather than just use “latest” we gain a much more consistent
and reproducible build.

Whenever a NED is updated a new build is run that produces new binary artifacts.
We probably want to use the new version but not before we test it out in CI. The
Rebecca Dodd's avatar
Rebecca Dodd committed
241
actual versions of NEDs to use is stored in a file in the `nso-ts` repository and
Rebecca Dodd's avatar
Rebecca Dodd committed
242 243 244 245 246 247 248 249 250 251
follows a simple format, like this:

```
ned-iosxr-yang=1234
ned-junos-yang=4567
...
```

Thus, updating the version to use is a simple job to just rewrite this text file
and replace the version number with a given CI_PIPELINE_ID version number. Again,
Rebecca Dodd's avatar
Rebecca Dodd committed
252
while NED updates are more seldom than updates to `nso-ts`, they do occur and
Rebecca Dodd's avatar
Rebecca Dodd committed
253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282
handling it is bloody boring. Enter automation!

```
git-open-mr:
  image: gitlab.dev.terastrm.net:4567/terastream/cisco-nso/ci-cisco-nso:4.2.3
  stage: git-robot
  only:
    - ts
  tags:
    - no-docker
  allow_failure: true
  before_script:
    - 'which ssh-agent || ( apt-get update -y && apt-get install openssh-client -y )'
    - eval $(ssh-agent -s)
    - ssh-add <(echo "$GIT_SSH_PRIV_KEY")
    - git config --global user.email "kll@dev.terastrm.net"
    - git config --global user.name "Mr. Robot"
    - mkdir -p ~/.ssh
    - cat gitlab-known-hosts >> ~/.ssh/known_hosts
  script:
    - git clone git@gitlab.dev.terastrm.net:TeraStream/nso-ts.git
    - cd nso-ts
    - git checkout -b robot-update-${CI_PROJECT_NAME}-${CI_PIPELINE_ID}
    - for LIST_FILE in $(ls ../ned-package-list.* | xargs -n1 basename); do NED_BUILD=$(cat ../${LIST_FILE}); sed -i packages/${LIST_FILE} -e "s/^${CI_PROJECT_NAME}.*/${CI_PROJECT_NAME}=${NED_BUILD}/"; done
    - git diff
    - git commit -a -m "Use ${CI_PROJECT_NAME} artifacts from pipeline ${CI_PIPELINE_ID}"
    - git push origin robot-update-${CI_PROJECT_NAME}-${CI_PIPELINE_ID}
    - HOST=${CI_PROJECT_URL} CI_COMMIT_REF_NAME=robot-update-${CI_PROJECT_NAME}-${CI_PIPELINE_ID} CI_PROJECT_NAME=TeraStream/nso-ts GITLAB_USER_ID=${GITLAB_USER_ID} PRIVATE_TOKEN=${PRIVATE_TOKEN} ../open-mr.sh
```

Rebecca Dodd's avatar
Rebecca Dodd committed
283 284 285 286
So this time around we check out a Git repository into a separate working
directory again, it’s just that it’s not the same Git repository as we are
running on simply because we are trying to do changes to a repository that is
using the output of the repository we are running on. It doesn’t make much of a
Rebecca Dodd's avatar
Rebecca Dodd committed
287
difference in terms of our process. At the end, once we’ve modified the files we
Rebecca Dodd's avatar
Rebecca Dodd committed
288 289
are interested in, we also open up a merge request on the target repository.
Here we can see the MR (which is merged already) to use a new version of the
Rebecca Dodd's avatar
Rebecca Dodd committed
290 291
NED `ned-snabbaftr-yang`.

Rebecca Dodd's avatar
Rebecca Dodd committed
292 293 294 295 296 297 298 299
<img src="/images/blogimages/gitbot-ned-update-mr.png" alt="MR using new version of NED" style="width: 700px;"/>{: .shadow}

What we end up with is that whenever there is a new version of a NED, a merge
request is opened on our `nso-ts` repository to start using the new NED. That
merge request is using changes on a new branch and CI will obviously run for
`nso-ts` on this new branch, which will then test all of our code using the new
version of the NED. We get a form of version pinning, with the form of explicit
changes that it entails, yet it’s a rather convenient and non-cumbersome
Rebecca Dodd's avatar
Rebecca Dodd committed
300 301 302 303
environment to work with thanks to all the automation.

## Getting fancy

Rebecca Dodd's avatar
Rebecca Dodd committed
304 305 306 307 308
While automatically opening an MR is sweet… we can do ~~better~~fancier. Our `nso-ts`
repository is based on Cisco NSO (Tail-F NCS), or actually the `nso-ts` Docker
image is based on a `cisco-nso` Docker image that we build in a separate
repository. We put the version of NSO as the tag of the `cisco-nso` Docker
image, so `cisco-nso:4.2.3` means Cisco NSO 4.2.3. This is what the `nso-ts`
Rebecca Dodd's avatar
Rebecca Dodd committed
309 310
Dockerfile will use in its `FROM` line.

Rebecca Dodd's avatar
Rebecca Dodd committed
311 312 313
Upgrading to a new version of NCS is thus just a matter of rewriting the tag…
but what version of NCS should we use? There’s 4.2.4, 4.3.3, 4.4.2 and 4.4.3
available and I’m sure there’s some other version that will pop up its evil
Rebecca Dodd's avatar
Rebecca Dodd committed
314 315 316 317
head soon enough. How do I know which version to pick? And will our current code
work with the new version?

To help myself in the choice of NCS version I implemented a script that gets the
Rebecca Dodd's avatar
Rebecca Dodd committed
318 319 320 321
README file of a new NCS version and cross references the list of fixed issues
with the issues that we currently have open in the Tail-F issue tracker. The
output of this is included in the merge request description so when I look at
the merge request I immediately know what bugs are fixed or new features are
Rebecca Dodd's avatar
Rebecca Dodd committed
322
implemented by moving to a specific version. Having this automatically generated
Rebecca Dodd's avatar
Rebecca Dodd committed
323
for us is… well, it’s just damn convenient. Together with actually testing our
Rebecca Dodd's avatar
Rebecca Dodd committed
324 325 326 327
code with the new version of NCS gives us confidence that an upgrade will be smooth.

Here are the merge requests currently opened by our GitBot:

Rebecca Dodd's avatar
Rebecca Dodd committed
328
<img src="/images/blogimages/automate-git-merge-requests.png" alt="Merge requests automated by Git bot" style="width: 700px;"/>{: .shadow}
Rebecca Dodd's avatar
Rebecca Dodd committed
329

Rebecca Dodd's avatar
Rebecca Dodd committed
330 331 332 333
We can see how the system have generated MRs to move to all the different
versions of NSO currently available. As we are currently on NSO v4.2.3 there’s
no underlying branch for that one leading to an errored build. For the other
versions though, there is a branch per version that executes the CI pipeline to
Rebecca Dodd's avatar
Rebecca Dodd committed
334 335
make sure all our code runs with this version of NSO.

Rebecca Dodd's avatar
Rebecca Dodd committed
336 337
As there have been a few commits today, these branches are behind by six commits
but will be rebased this night so we get an up-to-date picture if they work or
Rebecca Dodd's avatar
Rebecca Dodd committed
338 339
not with our latest code.

Rebecca Dodd's avatar
Rebecca Dodd committed
340
<img src="/images/blogimages/automate-git-commits.png" alt="Commits" style="width: 700px;"/>{: .shadow}
Rebecca Dodd's avatar
Rebecca Dodd committed
341

Rebecca Dodd's avatar
Rebecca Dodd committed
342 343
If we go back and look at one of these merge requests, we can see how the
description includes information about what issues that we currently have open
Rebecca Dodd's avatar
Rebecca Dodd committed
344 345
with Cisco / Tail-F would be solved by moving to this version.

Rebecca Dodd's avatar
Rebecca Dodd committed
346
<img src="/images/blogimages/automate-git-mr-description.png" alt="Merge request descriptions" style="width: 700px;"/>{: .shadow}
Rebecca Dodd's avatar
Rebecca Dodd committed
347

Rebecca Dodd's avatar
Rebecca Dodd committed
348
This is from v4.2.4 and as we are currently on v4.2.3 we can see that there are
Rebecca Dodd's avatar
Rebecca Dodd committed
349 350 351 352
only a few fixed issues.

If we instead look at v4.4.3 we can see that the list is significantly longer.

Rebecca Dodd's avatar
Rebecca Dodd committed
353
<img src="/images/blogimages/automate-git-mr-description-list.png" alt="Merge request descriptions" style="width: 700px;"/>{: .shadow}
Rebecca Dodd's avatar
Rebecca Dodd committed
354 355 356 357 358 359 360

Pretty sweet, huh? :)

As this involves a bit more code I’ve put the relevant files in a [GitHub gist](https://gist.github.com/plajjan/42592665afd5ae045ee36220e19919aa).

# This is the end

Rebecca Dodd's avatar
Rebecca Dodd committed
361 362
If you are reading this, chances are you already have your reasons for why you
want to automate some Git operations. Hopefully I’ve provided some inspiration
Rebecca Dodd's avatar
Rebecca Dodd committed
363 364 365 366 367
for how to do it.

If not or if you just want to discuss the topic in general or have more specific
questions about our setup, please do reach out to me on [Twitter](https://twitter.com/plajjan).

Rebecca Dodd's avatar
Rebecca Dodd committed
368 369
_[This post](http://plajjan.github.io/automating-git/) was originally published on [plajjan.github.io](http://plajjan.github.io/)._

Rebecca Dodd's avatar
Rebecca Dodd committed
370 371
## About the Guest Author

Rebecca Dodd's avatar
Rebecca Dodd committed
372 373 374 375 376 377
Kristian Larsson is a network automation systems architect at Deutsche Telekom.
He is working on automating virtually all aspects of running TeraStream, the
design for Deutsche Telekom's next generation fixed network, using robust and
fault tolerant software. He is active in the IETF as well as being a
representing member in OpenConfig. Previous to joining Deutsche Telekom,
Kristian was the IP & opto network architect for Tele2's international backbone
Rebecca Dodd's avatar
Rebecca Dodd committed
378 379 380 381
network.

"[BB-8 in action](https://unsplash.com/photos/C8VWyZhcIIU) by [Joseph Chan](https://unsplash.com/@yulokchan) on Unsplash
{: .note}