"integer divide by zero" panic during prometheus-upgrade after upgrading GitLab from 11.2.3 to 11.5.0

Hello,

I upgraded my GitLab installation from 11.2.3 to 11.5.0 with the following command:

docker-compose down && docker-compose pull && docker-compose up -d

After that, I ran the prometheus-upgrade command, but it failed with the errors below. Could you please help me resolve this so I can complete the migration to the latest Prometheus version?

docker exec -ti $(docker ps --filter name=gitlab --format '{{.Names}}') gitlab-ctl prometheus-upgrade
Converting existing data to new format is a time consuming process and can take hours.
If you prefer not to migrate existing data, press Ctrl-C now and re-run the command with --skip-data-migration flag.
Waiting for 30 seconds for input.

Please hit Ctrl-C now if you want to cancel the operation.
..............................
Stopping prometheus for upgrade
ok: down: prometheus: 0s, normally up
Migrating data
time="2018-11-26T11:05:43+01:00" level=info msg="Loading series map and head chunks..." source="storage.go:428"
time="2018-11-26T11:05:43+01:00" level=info msg="25885 series loaded." source="storage.go:439"
time="2018-11-26T11:05:45+01:00" level=warning msg="Storage has entered rushed mode." chunksToPersist=5665 memoryChunks=17422 source="storage.go:1879" urgencyScore=1
time="2018-11-26T11:05:46+01:00" level=info msg="Completed initial partial maintenance sweep through 1008 in-memory fingerprints in 1.991043459s." source="storage.go:1408"
time="2018-11-26T11:06:27+01:00" level=info msg="Completed full maintenance sweep through 25331 in-memory fingerprints in 40.350651975s." source="storage.go:1408"
time="2018-11-26T11:06:34+01:00" level=info msg="Stopping local storage..." source="storage.go:465"
time="2018-11-26T11:06:34+01:00" level=info msg="Stopping maintenance loop..." source="storage.go:467"
time="2018-11-26T11:06:34+01:00" level=info msg="Completed full maintenance sweep through 11303 in-memory fingerprints in 4.899398942s." source="storage.go:1408"
time="2018-11-26T11:06:34+01:00" level=info msg="Maintenance loop stopped." source="storage.go:1471"
time="2018-11-26T11:06:34+01:00" level=info msg="Stopping series quarantining..." source="storage.go:471"
time="2018-11-26T11:06:34+01:00" level=info msg="Series quarantining stopped." source="storage.go:1919"
time="2018-11-26T11:06:34+01:00" level=info msg="Stopping chunk eviction..." source="storage.go:475"
time="2018-11-26T11:06:34+01:00" level=info msg="Chunk eviction stopped." source="storage.go:1166"
time="2018-11-26T11:06:34+01:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
time="2018-11-26T11:06:34+01:00" level=info msg="Done checkpointing in-memory metrics and chunks in 3.496029ms." source="persistence.go:665"
time="2018-11-26T11:06:34+01:00" level=info msg="Checkpointing fingerprint mappings..." source="persistence.go:1526"
time="2018-11-26T11:06:34+01:00" level=info msg="Done checkpointing fingerprint mappings in 8.309322ms." source="persistence.go:1549"
time="2018-11-26T11:06:34+01:00" level=info msg="Local storage stopped." source="storage.go:492"
 1 / 1440    0.07% 04m46spanic: runtime error: integer divide by zero

goroutine 215 [running]:
gitlab.com/gitlab-org/prometheus-storage-migrator/vendor/github.com/prometheus/prometheus/storage/local/chunk.deltaEncodedChunk.Len(...)
        /var/cache/omnibus/src/prometheus-storage-migrator/src/gitlab.com/gitlab-org/prometheus-storage-migrator/vendor/github.com/prometheus/prometheus/storage/local/chunk/delta.go:313
gitlab.com/gitlab-org/prometheus-storage-migrator/vendor/github.com/prometheus/prometheus/storage/local/chunk.(*deltaEncodedChunk).NewIterator(0xc422dd5140, 0xc422de6011, 0x400)
        /var/cache/omnibus/src/prometheus-storage-migrator/src/gitlab.com/gitlab-org/prometheus-storage-migrator/vendor/github.com/prometheus/prometheus/storage/local/chunk/delta.go:194 +0x258
main.(*seriesFileTracker).migrate(0xc424079da0, 0x167023cc5c0, 0x167024a815f, 0x7ffe488b87e4, 0x1f, 0x9ccb40, 0xc422dd5120, 0x0, 0x0)
        /var/cache/omnibus/src/prometheus-storage-migrator/src/gitlab.com/gitlab-org/prometheus-storage-migrator/series_migrator.go:99 +0x887
main.(*storageMigrator).migrateStep.func2(0x8, 0x987fe0)
        /var/cache/omnibus/src/prometheus-storage-migrator/src/gitlab.com/gitlab-org/prometheus-storage-migrator/series_migrator.go:335 +0x148
gitlab.com/gitlab-org/prometheus-storage-migrator/vendor/golang.org/x/sync/errgroup.(*Group).Go.func1(0xc42326b040, 0xc4201974a0)
        /var/cache/omnibus/src/prometheus-storage-migrator/src/gitlab.com/gitlab-org/prometheus-storage-migrator/vendor/golang.org/x/sync/errgroup/errgroup.go:58 +0x57
created by gitlab.com/gitlab-org/prometheus-storage-migrator/vendor/golang.org/x/sync/errgroup.(*Group).Go
        /var/cache/omnibus/src/prometheus-storage-migrator/src/gitlab.com/gitlab-org/prometheus-storage-migrator/vendor/golang.org/x/sync/errgroup/errgroup.go:55 +0x66
Error running command: /opt/gitlab/embedded/bin/prometheus-storage-migrator -v1-path=/var/opt/gitlab/prometheus/data -v2-path=/var/opt/gitlab/prometheus/data2
ERROR: time="2018-11-26T11:05:43+01:00" level=info msg="Loading series map and head chunks..." source="storage.go:428"
time="2018-11-26T11:05:43+01:00" level=info msg="25885 series loaded." source="storage.go:439"
time="2018-11-26T11:05:45+01:00" level=warning msg="Storage has entered rushed mode." chunksToPersist=5665 memoryChunks=17422 source="storage.go:1879" urgencyScore=1
time="2018-11-26T11:05:46+01:00" level=info msg="Completed initial partial maintenance sweep through 1008 in-memory fingerprints in 1.991043459s." source="storage.go:1408"
time="2018-11-26T11:06:27+01:00" level=info msg="Completed full maintenance sweep through 25331 in-memory fingerprints in 40.350651975s." source="storage.go:1408"
time="2018-11-26T11:06:34+01:00" level=info msg="Stopping local storage..." source="storage.go:465"
time="2018-11-26T11:06:34+01:00" level=info msg="Stopping maintenance loop..." source="storage.go:467"
time="2018-11-26T11:06:34+01:00" level=info msg="Completed full maintenance sweep through 11303 in-memory fingerprints in 4.899398942s." source="storage.go:1408"
time="2018-11-26T11:06:34+01:00" level=info msg="Maintenance loop stopped." source="storage.go:1471"
time="2018-11-26T11:06:34+01:00" level=info msg="Stopping series quarantining..." source="storage.go:471"
time="2018-11-26T11:06:34+01:00" level=info msg="Series quarantining stopped." source="storage.go:1919"
time="2018-11-26T11:06:34+01:00" level=info msg="Stopping chunk eviction..." source="storage.go:475"
time="2018-11-26T11:06:34+01:00" level=info msg="Chunk eviction stopped." source="storage.go:1166"
time="2018-11-26T11:06:34+01:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
time="2018-11-26T11:06:34+01:00" level=info msg="Done checkpointing in-memory metrics and chunks in 3.496029ms." source="persistence.go:665"
time="2018-11-26T11:06:34+01:00" level=info msg="Checkpointing fingerprint mappings..." source="persistence.go:1526"
time="2018-11-26T11:06:34+01:00" level=info msg="Done checkpointing fingerprint mappings in 8.309322ms." source="persistence.go:1549"
time="2018-11-26T11:06:34+01:00" level=info msg="Local storage stopped." source="storage.go:492"
panic: runtime error: integer divide by zero

goroutine 215 [running]:
gitlab.com/gitlab-org/prometheus-storage-migrator/vendor/github.com/prometheus/prometheus/storage/local/chunk.deltaEncodedChunk.Len(...)
        /var/cache/omnibus/src/prometheus-storage-migrator/src/gitlab.com/gitlab-org/prometheus-storage-migrator/vendor/github.com/prometheus/prometheus/storage/local/chunk/delta.go:313
gitlab.com/gitlab-org/prometheus-storage-migrator/vendor/github.com/prometheus/prometheus/storage/local/chunk.(*deltaEncodedChunk).NewIterator(0xc422dd5140, 0xc422de6011, 0x400)
        /var/cache/omnibus/src/prometheus-storage-migrator/src/gitlab.com/gitlab-org/prometheus-storage-migrator/vendor/github.com/prometheus/prometheus/storage/local/chunk/delta.go:194 +0x258
main.(*seriesFileTracker).migrate(0xc424079da0, 0x167023cc5c0, 0x167024a815f, 0x7ffe488b87e4, 0x1f, 0x9ccb40, 0xc422dd5120, 0x0, 0x0)
        /var/cache/omnibus/src/prometheus-storage-migrator/src/gitlab.com/gitlab-org/prometheus-storage-migrator/series_migrator.go:99 +0x887
main.(*storageMigrator).migrateStep.func2(0x8, 0x987fe0)
        /var/cache/omnibus/src/prometheus-storage-migrator/src/gitlab.com/gitlab-org/prometheus-storage-migrator/series_migrator.go:335 +0x148
gitlab.com/gitlab-org/prometheus-storage-migrator/vendor/golang.org/x/sync/errgroup.(*Group).Go.func1(0xc42326b040, 0xc4201974a0)
        /var/cache/omnibus/src/prometheus-storage-migrator/src/gitlab.com/gitlab-org/prometheus-storage-migrator/vendor/golang.org/x/sync/errgroup/errgroup.go:58 +0x57
created by gitlab.com/gitlab-org/prometheus-storage-migrator/vendor/golang.org/x/sync/errgroup.(*Group).Go
        /var/cache/omnibus/src/prometheus-storage-migrator/src/gitlab.com/gitlab-org/prometheus-storage-migrator/vendor/golang.org/x/sync/errgroup/errgroup.go:55 +0x66
Migration failed. Restoring data and restarting prometheus.
ok: run: prometheus: (pid 3363) 0s
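For reference, the prompt above does offer a fallback that skips the data conversion, but as I understand it that would start the new Prometheus with an empty store, which I want to avoid:

```shell
# Fallback suggested by the migrator's own prompt: skip migrating the old
# v1 data. This abandons the existing time series, so it is not an option
# for me.
docker exec -ti $(docker ps --filter name=gitlab --format '{{.Names}}') \
  gitlab-ctl prometheus-upgrade --skip-data-migration
```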

I also see messages like this in the Prometheus logs:

2018-11-26_09:56:27.31397 time="2018-11-26T09:56:27Z" level=warning msg="Recovered metric gitlab_sql_duration_seconds_bucket{action="stage.json", controller="Projects::PipelinesController", instance="localhost:8080", job="gitlab-unicorn", le="2.5"}, fingerprint 2ff9b1c3742f16df: all 13 chunks recovered from series file." source="crashrecovery.go:354"
2018-11-26_09:56:27.31480 time="2018-11-26T09:56:27Z" level=warning msg="Recovered metric gitlab_transaction_new_redis_connections_total{action="update", controller="Projects::RunnersController", instance="localhost:8080", job="gitlab-unicorn"}, fingerprint 2f48dc0e342c9796: all 10 chunks recovered from series file." source="crashrecovery.go:354"
2018-11-26_09:56:27.31543 time="2018-11-26T09:56:27Z" level=warning msg="Recovered metric gitlab_cache_operation_duration_seconds_bucket{action="index", controller="Projects::EnvironmentsController", instance="localhost:8080", job="gitlab-unicorn", le="0.01", operation="read"}, fingerprint 2f6f0267a55ba60b: all 12 chunks recovered from series file." source="crashrecovery.go:354"
2018-11-26_09:56:27.31613 time="2018-11-26T09:56:27Z" level=warning msg="Recovered metric gitlab_transaction_duration_seconds_bucket{action="projects", controller="GroupsController", instance="localhost:8080", job="gitlab-unicorn", le="0.5"}, fingerprint 2f790f09b0575cf8: all 13 chunks recovered from series file." source="crashrecovery.go:354"
2018-11-26_09:56:27.31670 time="2018-11-26T09:56:27Z" level=warning msg="Recovered metric gitlab_transaction_duration_seconds_bucket{action="projects", controller="GroupsController", instance="localhost:8080", job="gitlab-unicorn", le="0.1"}, fingerprint 2f791309b057a7c4: all 13 chunks recovered from series file." source="crashrecovery.go:354"
2018-11-26_09:56:27.31727 time="2018-11-26T09:56:27Z" level=warning msg="Recovered metric gitlab_cache_operation_duration_seconds_bucket{action="new", controller="Projects::PipelinesController", instance="localhost:8080", job="gitlab-unicorn", le="1", operation="read"}, fingerprint 2f70047680c0eaf8: all 12 chunks recovered from series file." source="crashrecovery.go:354"
2018-11-26_09:56:27.31800 time="2018-11-26T09:56:27Z" level=warning msg="Recovered metric gitlab_cache_operation_duration_seconds_bucket{action="builds", controller="Projects::PipelinesController", instance="localhost:8080", job="gitlab-unicorn", le="0.01", operation="read"}, fingerprint 2f747f997e93261b: all 12 chunks recovered from series file." source="crashrecovery.go:354"
2018-11-26_09:56:27.31851 time="2018-11-26T09:56:27Z" level=warning msg="Recovered metric gitlab_cache_operation_duration_seconds_bucket{action="update.json", controller="Boards::IssuesController", instance="localhost:8080", job="gitlab-unicorn", le="0.1", operation="read"}, fingerprint 2f42dcbf9f9f5f5f: all 12 chunks recovered from series file." source="crashrecovery.go:354"
2018-11-26_09:56:27.31885 time="2018-11-26T09:56:27Z" level=warning msg="Recovered metric gitlab_transaction_duration_seconds_bucket{action="new", controller="Projects::BranchesController", instance="localhost:8080", job="gitlab-unicorn", le="2.5"}, fingerprint 2f560dd02911aa58: all 8 chunks recovered from series file." source="crashrecovery.go:354"
2018-11-26_09:56:27.31933 time="2018-11-26T09:56:27Z" level=warning msg="Recovered metric gitlab_cache_operation_duration_seconds_count{action="activity", controller="ProjectsController", instance="localhost:8080", job="gitlab-unicorn", operation="write"}, fingerprint 2f5e3fcd9ccf28c4: all 8 chunks recovered from series file." source="crashrecovery.go:354"
2018-11-26_09:56:27.31976 time="2018-11-26T09:56:27Z" level=warning msg="Recovered metric gitlab_cache_operation_duration_seconds_bucket{action="pipelines.json", controller="Projects::MergeRequestsController", instance="localhost:8080", job="gitlab-unicorn", le="0.01", operation="read"}, fingerprint 2f1839ec5aa4685a: all 8 chunks recovered from series file." source="crashrecovery.go:354"

I can't lose my previous Prometheus data...
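Before any further attempts, I intend to back up the old data directory so a failed run can't damage the only copy. A rough sketch (paths taken from the failing command above; adjust if yours differ):

```shell
# Back up the v1 Prometheus data directory before retrying the migration,
# so a failed run cannot corrupt the original data.
SRC=/var/opt/gitlab/prometheus/data
DEST=/var/opt/gitlab/prometheus/data.bak-$(date +%Y%m%d)
cp -a "$SRC" "$DEST"
```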

Thank you very much in advance for your help solving this issue.

Edited Nov 26, 2018 by craph