Draft: Add Prophet to Sidekiq

Sean McGivern requested to merge add-prophet into master

What does this MR do?

Do not merge! This was just an exploration for https://gitlab.com/gitlab-org/gitlab/-/issues/403630 to see how much work it would be to add Prophet to our existing Ruby images, which already include a Python install for reStructuredText (ReST) support.


This installs the Python Prophet library (https://facebook.github.io/prophet/), which depends on NumPy, into the Python install for our Sidekiq images.

To get this to work, we need BZ2 support in the Python image. With that added, this works:

# In gitlab-sidekiq
$ docker build .
Sending build context to Docker daemon  10.75kB
Step 1/20 : ARG CI_REGISTRY_IMAGE="registry.gitlab.com/gitlab-org/build/cng"
# ... snip ...
Installing collected packages: pytz, pymeeus, korean-lunar-calendar, ephem, zipp, tqdm, six, pyparsing, pillow, packaging, numpy, kiwisolver, hijri-converter, fonttools, cycler, convertdate, backports.zoneinfo, python-dateutil, importlib-resources, contourpy, pandas, matplotlib, LunarCalendar, holidays, cmdstanpy, prophet
# ... snip ...
Successfully built 5735bf7e6d7f
$ docker run -it 5735bf7e6d7f python3
Begin parsing .erb templates from /srv/gitlab/config
Begin parsing .tpl templates from /srv/gitlab/config
Python 3.8.16 (default, Apr  3 2023, 12:48:42) 
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import prophet
Importing plotly failed. Interactive plots will not work.
>>> prophet.Prophet
<class 'prophet.forecaster.Prophet'>
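
The interactive check above could also be scripted as a smoke test for the image. The following is only a minimal sketch and is not part of this MR: `failed_imports` is a hypothetical helper name, and the choice of modules to check (`bz2` and `prophet`) is an assumption based on the requirements described above.

```python
import importlib


def failed_imports(names):
    """Try to import each module and return {name: error message} for failures.

    Importing (rather than only locating) the module matters for bz2: the
    pure-Python wrapper can be present even when the compiled extension
    it depends on is missing.
    """
    failed = {}
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            failed[name] = str(exc)
    return failed
```

Run inside the built image, `failed_imports(["bz2", "prophet"])` would be expected to return an empty dict if both BZ2 support and Prophet are available.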

It does make the image noticeably larger: about 290MB (1.74GB to 2.03GB). Here, c872833209fd is the result of docker build . in gitlab-sidekiq on the current master branch, and 5735bf7e6d7f is the same but on this branch:

$ docker images
REPOSITORY                                                   TAG               IMAGE ID       CREATED          SIZE
<none>                                                       <none>            c872833209fd   24 seconds ago   1.74GB
<none>                                                       <none>            5735bf7e6d7f   41 minutes ago   2.03GB

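Most of the difference seems to be in the site-packages directory, which grows from 20M to 256M. That makes sense, as Prophet has a lot of dependencies. The docker history output shows the same growth concentrated in a single layer (10.2MB on master, 299MB on this branch).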

# Current image
git@c99e7a9db3a9:/$ du -sh /usr/local/lib/python3.8/site-packages/
20M     /usr/local/lib/python3.8/site-packages/

# With Prophet
git@536e64ef724e:/$ du -sh /usr/local/lib/python3.8/site-packages/
256M    /usr/local/lib/python3.8/site-packages/
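
To see which packages account for the growth, the directory could be broken down per top-level entry. This is a rough sketch rather than something that was run for this MR: `tree_size` and `largest_entries` are hypothetical helper names. It sums apparent file sizes, so the totals will not match `du` exactly, since `du` reports disk usage.

```python
import os


def tree_size(path):
    """Total size in bytes of the files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def largest_entries(site_packages, limit=10):
    """Return (name, bytes) for the largest top-level entries, biggest first."""
    sizes = []
    for entry in os.scandir(site_packages):
        if entry.is_dir(follow_symlinks=False):
            size = tree_size(entry.path)
        elif entry.is_file(follow_symlinks=False):
            size = entry.stat(follow_symlinks=False).st_size
        else:
            continue  # skip symlinks and special files
        sizes.append((entry.name, size))
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:limit]
```

For example, `largest_entries("/usr/local/lib/python3.8/site-packages/")` in each container would list the biggest packages for comparison.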

# Current image
$ docker history c872833209fd
IMAGE          CREATED          CREATED BY                                      SIZE      COMMENT
c872833209fd   7 minutes ago    /bin/sh -c #(nop)  CMD ["/scripts/process-wr…   0B        
0e9ea0175592   7 minutes ago    /bin/sh -c #(nop)  USER git:git                 0B        
038c74c8203b   7 minutes ago    |2 DOCUTILS_VERSION=0.19 GITLAB_USER=git /bi…   15.4kB    
e1938830f1ee   7 minutes ago    /bin/sh -c #(nop) COPY dir:aa6cc28eeda61ae68…   1.38kB    
e9783df03a57   7 minutes ago    |2 DOCUTILS_VERSION=0.19 GITLAB_USER=git /bi…   23.4MB    
a169c7f87ea9   7 minutes ago    /bin/sh -c #(nop)  ENV SIDEKIQ_TIMEOUT=25       0B        
95c150301558   7 minutes ago    /bin/sh -c #(nop)  ENV SIDEKIQ_CONCURRENCY=25   0B        
a514ed48312f   7 minutes ago    |2 DOCUTILS_VERSION=0.19 GITLAB_USER=git /bi…   10.2MB    
02dde68653a4   7 minutes ago    /bin/sh -c #(nop)  ENV PYTHONPATH=/usr/local…   0B        
feb89ac0cdb7   7 minutes ago    /bin/sh -c #(nop) COPY dir:7aed1634d6e99dc61…   96.7MB    
e13436ed2544   7 minutes ago    /bin/sh -c #(nop) COPY dir:4810d640c93b39579…   22.5kB    
a76c7ec0f8ce   57 minutes ago   /bin/sh -c #(nop)  ARG DOCUTILS_VERSION=0.19    0B        
1cc600a2a082   57 minutes ago   /bin/sh -c #(nop)  ARG GITLAB_USER=git          0B        
8a96a14f4e2b   4 hours ago      VOLUME [/var/opt/gitlab /var/log]               0B        buildkit.dockerfile.v0
<missing>      4 hours ago      ENV RAILS_ENV=production BOOTSNAP_CACHE_DIR=…   0B        buildkit.dockerfile.v0
<missing>      4 hours ago      RUN |3 DATADIR=/var/opt/gitlab CONFIG=/srv/g…   14kB      buildkit.dockerfile.v0
<missing>      4 hours ago      COPY scripts/ /scripts # buildkit               12kB      buildkit.dockerfile.v0
<missing>      4 hours ago      RUN |3 DATADIR=/var/opt/gitlab CONFIG=/srv/g…   12.1kB    buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /var/opt/gitlab /var/opt/gitlab # build…   0B        buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /srv/gitlab /srv/gitlab # buildkit         1.02GB    buildkit.dockerfile.v0
<missing>      4 hours ago      RUN |3 DATADIR=/var/opt/gitlab CONFIG=/srv/g…   334kB     buildkit.dockerfile.v0
<missing>      4 hours ago      RUN |3 DATADIR=/var/opt/gitlab CONFIG=/srv/g…   58.9MB    buildkit.dockerfile.v0
<missing>      4 hours ago      COPY / / # buildkit                             14.8MB    buildkit.dockerfile.v0
<missing>      4 hours ago      RUN |3 DATADIR=/var/opt/gitlab CONFIG=/srv/g…   0B        buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /usr/local /usr/local # buildkit           9.35MB    buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /usr/local/psql/lib/libpq.so* /usr/lib/…   1.08MB    buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /usr/local/psql/bin/psql /usr/bin/ # bu…   808kB     buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /usr/local/psql/bin/pg_* /usr/bin/ # bu…   2.48MB    buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /usr/local/bin/gitlab-metrics-exporter …   13.9MB    buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /usr/local/bin/gitlab-elasticsearch-ind…   37.4MB    buildkit.dockerfile.v0
<missing>      4 hours ago      RUN |3 DATADIR=/var/opt/gitlab CONFIG=/srv/g…   87.1MB    buildkit.dockerfile.v0
<missing>      4 hours ago      ARG GITLAB_USER=git                             0B        buildkit.dockerfile.v0
<missing>      4 hours ago      ARG CONFIG=/srv/gitlab/config                   0B        buildkit.dockerfile.v0
<missing>      4 hours ago      ARG DATADIR=/var/opt/gitlab                     0B        buildkit.dockerfile.v0
<missing>      4 hours ago      CMD ["irb"]                                     0B        buildkit.dockerfile.v0
<missing>      4 hours ago      RUN |3 RUBYGEMS_VERSION=3.4.10 BUNDLER_VERSI…   25.5MB    buildkit.dockerfile.v0
<missing>      4 hours ago      ARG RBREADLINE_VERSION=0.5.5                    0B        buildkit.dockerfile.v0
<missing>      4 hours ago      ARG BUNDLER_VERSION=2.4.6                       0B        buildkit.dockerfile.v0
<missing>      4 hours ago      ARG RUBYGEMS_VERSION=3.4.10                     0B        buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /usr/etc/gemrc /usr/etc/gemrc # buildkit   45B       buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /assets / # buildkit                       160MB     buildkit.dockerfile.v0
<missing>      4 hours ago      COPY shared/build-scripts/ /build-scripts # …   5.38kB    buildkit.dockerfile.v0
<missing>      4 hours ago      RUN /bin/sh -c apt-get update   && apt-get i…   38.6MB    buildkit.dockerfile.v0
<missing>      5 hours ago      ENTRYPOINT ["/scripts/entrypoint.sh"]           0B        buildkit.dockerfile.v0
<missing>      5 hours ago      ENV CONFIG_TEMPLATE_DIRECTORY=/etc              0B        buildkit.dockerfile.v0
<missing>      5 hours ago      COPY /gitlab-logger /usr/local/bin/gitlab-lo…   2.48MB    buildkit.dockerfile.v0
<missing>      5 hours ago      COPY /gomplate /usr/local/bin/gomplate # bui…   48.9MB    buildkit.dockerfile.v0
<missing>      5 hours ago      RUN /bin/sh -c chown -R 0:0 /scripts/ # buil…   2.01kB    buildkit.dockerfile.v0
<missing>      5 hours ago      COPY scripts/ /scripts # buildkit               2.01kB    buildkit.dockerfile.v0
<missing>      5 hours ago      RUN /bin/sh -c apt-get update   && apt-get i…   8.55MB    buildkit.dockerfile.v0
<missing>      5 hours ago      ENV LANG=C.UTF-8                                0B        buildkit.dockerfile.v0
<missing>      11 days ago      /bin/sh -c #(nop)  CMD ["bash"]                 0B        
<missing>      11 days ago      /bin/sh -c #(nop) ADD file:60911afdacfdc216e…   80.5MB    

# With Prophet
$ docker history 5735bf7e6d7f
IMAGE          CREATED          CREATED BY                                      SIZE      COMMENT
5735bf7e6d7f   49 minutes ago   /bin/sh -c #(nop)  CMD ["/scripts/process-wr…   0B        
baa82054d1e5   49 minutes ago   /bin/sh -c #(nop)  USER git:git                 0B        
f27b7807039a   49 minutes ago   |3 DOCUTILS_VERSION=0.19 GITLAB_USER=git PRO…   15.4kB    
363f875f6c44   49 minutes ago   /bin/sh -c #(nop) COPY dir:aa6cc28eeda61ae68…   1.38kB    
2757b337cea7   49 minutes ago   |3 DOCUTILS_VERSION=0.19 GITLAB_USER=git PRO…   23.4MB    
14016ac6ec0b   49 minutes ago   /bin/sh -c #(nop)  ENV SIDEKIQ_TIMEOUT=25       0B        
488cab5b6ff2   49 minutes ago   /bin/sh -c #(nop)  ENV SIDEKIQ_CONCURRENCY=25   0B        
ec27677d2b08   49 minutes ago   |3 DOCUTILS_VERSION=0.19 GITLAB_USER=git PRO…   299MB     
5bea57e348af   50 minutes ago   /bin/sh -c #(nop)  ENV PYTHONPATH=/usr/local…   0B        
a4eaaf7e63bb   50 minutes ago   /bin/sh -c #(nop) COPY dir:a00e0a49992a1ce8c…   96.8MB    
babe646e6c83   57 minutes ago   /bin/sh -c #(nop) COPY dir:4810d640c93b39579…   22.5kB    
87015728fa3b   57 minutes ago   /bin/sh -c #(nop)  ARG PROPHET_VERSION=1.1.2    0B        
a76c7ec0f8ce   57 minutes ago   /bin/sh -c #(nop)  ARG DOCUTILS_VERSION=0.19    0B        
1cc600a2a082   57 minutes ago   /bin/sh -c #(nop)  ARG GITLAB_USER=git          0B        
8a96a14f4e2b   4 hours ago      VOLUME [/var/opt/gitlab /var/log]               0B        buildkit.dockerfile.v0
<missing>      4 hours ago      ENV RAILS_ENV=production BOOTSNAP_CACHE_DIR=…   0B        buildkit.dockerfile.v0
<missing>      4 hours ago      RUN |3 DATADIR=/var/opt/gitlab CONFIG=/srv/g…   14kB      buildkit.dockerfile.v0
<missing>      4 hours ago      COPY scripts/ /scripts # buildkit               12kB      buildkit.dockerfile.v0
<missing>      4 hours ago      RUN |3 DATADIR=/var/opt/gitlab CONFIG=/srv/g…   12.1kB    buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /var/opt/gitlab /var/opt/gitlab # build…   0B        buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /srv/gitlab /srv/gitlab # buildkit         1.02GB    buildkit.dockerfile.v0
<missing>      4 hours ago      RUN |3 DATADIR=/var/opt/gitlab CONFIG=/srv/g…   334kB     buildkit.dockerfile.v0
<missing>      4 hours ago      RUN |3 DATADIR=/var/opt/gitlab CONFIG=/srv/g…   58.9MB    buildkit.dockerfile.v0
<missing>      4 hours ago      COPY / / # buildkit                             14.8MB    buildkit.dockerfile.v0
<missing>      4 hours ago      RUN |3 DATADIR=/var/opt/gitlab CONFIG=/srv/g…   0B        buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /usr/local /usr/local # buildkit           9.35MB    buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /usr/local/psql/lib/libpq.so* /usr/lib/…   1.08MB    buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /usr/local/psql/bin/psql /usr/bin/ # bu…   808kB     buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /usr/local/psql/bin/pg_* /usr/bin/ # bu…   2.48MB    buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /usr/local/bin/gitlab-metrics-exporter …   13.9MB    buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /usr/local/bin/gitlab-elasticsearch-ind…   37.4MB    buildkit.dockerfile.v0
<missing>      4 hours ago      RUN |3 DATADIR=/var/opt/gitlab CONFIG=/srv/g…   87.1MB    buildkit.dockerfile.v0
<missing>      4 hours ago      ARG GITLAB_USER=git                             0B        buildkit.dockerfile.v0
<missing>      4 hours ago      ARG CONFIG=/srv/gitlab/config                   0B        buildkit.dockerfile.v0
<missing>      4 hours ago      ARG DATADIR=/var/opt/gitlab                     0B        buildkit.dockerfile.v0
<missing>      4 hours ago      CMD ["irb"]                                     0B        buildkit.dockerfile.v0
<missing>      4 hours ago      RUN |3 RUBYGEMS_VERSION=3.4.10 BUNDLER_VERSI…   25.5MB    buildkit.dockerfile.v0
<missing>      4 hours ago      ARG RBREADLINE_VERSION=0.5.5                    0B        buildkit.dockerfile.v0
<missing>      4 hours ago      ARG BUNDLER_VERSION=2.4.6                       0B        buildkit.dockerfile.v0
<missing>      4 hours ago      ARG RUBYGEMS_VERSION=3.4.10                     0B        buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /usr/etc/gemrc /usr/etc/gemrc # buildkit   45B       buildkit.dockerfile.v0
<missing>      4 hours ago      COPY /assets / # buildkit                       160MB     buildkit.dockerfile.v0
<missing>      4 hours ago      COPY shared/build-scripts/ /build-scripts # …   5.38kB    buildkit.dockerfile.v0
<missing>      4 hours ago      RUN /bin/sh -c apt-get update   && apt-get i…   38.6MB    buildkit.dockerfile.v0
<missing>      5 hours ago      ENTRYPOINT ["/scripts/entrypoint.sh"]           0B        buildkit.dockerfile.v0
<missing>      5 hours ago      ENV CONFIG_TEMPLATE_DIRECTORY=/etc              0B        buildkit.dockerfile.v0
<missing>      5 hours ago      COPY /gitlab-logger /usr/local/bin/gitlab-lo…   2.48MB    buildkit.dockerfile.v0
<missing>      5 hours ago      COPY /gomplate /usr/local/bin/gomplate # bui…   48.9MB    buildkit.dockerfile.v0
<missing>      5 hours ago      RUN /bin/sh -c chown -R 0:0 /scripts/ # buil…   2.01kB    buildkit.dockerfile.v0
<missing>      5 hours ago      COPY scripts/ /scripts # buildkit               2.01kB    buildkit.dockerfile.v0
<missing>      5 hours ago      RUN /bin/sh -c apt-get update   && apt-get i…   8.55MB    buildkit.dockerfile.v0
<missing>      5 hours ago      ENV LANG=C.UTF-8                                0B        buildkit.dockerfile.v0
<missing>      11 days ago      /bin/sh -c #(nop)  CMD ["bash"]                 0B        
<missing>      11 days ago      /bin/sh -c #(nop) ADD file:60911afdacfdc216e…   80.5MB    
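
Comparing the two listings, only two layers differ in size: the RUN layer after the PYTHONPATH line (10.2MB on master, 299MB here) and the COPY dir layer before it (96.7MB, 96.8MB). As a sanity check that these account for the overall image growth, the arithmetic can be sketched as below. `parse_size` is a hypothetical helper, and the decimal unit multipliers are an assumption about how docker formats sizes.

```python
from decimal import Decimal

# Assumption: docker reports sizes in decimal units (powers of 1000).
UNITS = {
    "kB": Decimal(10) ** 3,
    "MB": Decimal(10) ** 6,
    "GB": Decimal(10) ** 9,
    "B": Decimal(1),
}


def parse_size(text):
    """Convert a size string such as '10.2MB' into bytes as a Decimal."""
    # Check the bare 'B' suffix last, because every other unit ends in 'B'.
    for suffix in ("kB", "MB", "GB", "B"):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * UNITS[suffix]
    raise ValueError(f"unrecognized size: {text!r}")


# (size on master, size on this branch) for the layers that differ above.
changed_layers = [("10.2MB", "299MB"), ("96.7MB", "96.8MB")]

layer_growth = sum(
    parse_size(after) - parse_size(before) for before, after in changed_layers
)
image_growth = parse_size("2.03GB") - parse_size("1.74GB")

print(f"layer growth: {layer_growth / UNITS['MB']} MB")
print(f"image growth: {image_growth / UNITS['MB']} MB")
```

The layer growth comes to 288.9MB against an image growth of 290MB, which agrees to within the rounding of the sizes reported by docker images.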

Related issues

https://gitlab.com/gitlab-org/gitlab/-/issues/404416

Checklist

See Definition of done.

For anything in this list which will not be completed, please provide a reason in the MR discussion.

Required

  • Merge Request Title and Description are up to date, accurate, and descriptive
  • MR targeting the appropriate branch
  • MR has a green pipeline on GitLab.com

Expected (please provide an explanation if not completing)

  • Test plan indicating conditions for success has been posted and passes
  • Documentation created/updated
  • Integration tests added to GitLab QA
  • The impact of any change in container size should be evaluated
Edited by Sean McGivern