Panic and invalid memory access, followed by excessive logs/retries
Sometimes, when DLE performs a physical restore, it fails with a panic caused by a nil pointer dereference (logs below). The same invalid memory access then recurs every couple of seconds. As a result, the logs keep growing until the host machine runs out of disk space, no matter how much disk space we add.
A few things would be worth looking into here:
- Fixing the invalid memory address or nil pointer dereference
- Adding an option to cap log files by size, not just age
- Greater resilience for physical snapshot failures
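To illustrate the size-based cap, here is a minimal sketch in Go using only the standard library. The `cappedWriter` type, its fields, and the single `.1` backup file are assumptions made for illustration; this is not DLE's actual logging code or an existing configuration option.

```go
package main

import (
	"log"
	"os"
	"path/filepath"
	"sync"
)

// cappedWriter is a hypothetical io.Writer that keeps a log file under
// maxBytes by moving it to "<path>.1" before a write would exceed the cap.
// Only one backup is kept, so total disk usage stays near 2 * maxBytes.
type cappedWriter struct {
	mu       sync.Mutex
	path     string
	maxBytes int64
	size     int64
	file     *os.File
}

func newCappedWriter(path string, maxBytes int64) (*cappedWriter, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return nil, err
	}
	info, err := f.Stat()
	if err != nil {
		f.Close()
		return nil, err
	}
	return &cappedWriter{path: path, maxBytes: maxBytes, size: info.Size(), file: f}, nil
}

// Write rotates first when the incoming data would push the file past the cap.
// A single write larger than maxBytes into an empty file is still written as is.
func (w *cappedWriter) Write(p []byte) (int, error) {
	w.mu.Lock()
	defer w.mu.Unlock()
	if w.size > 0 && w.size+int64(len(p)) > w.maxBytes {
		if err := w.rotate(); err != nil {
			return 0, err
		}
	}
	n, err := w.file.Write(p)
	w.size += int64(n)
	return n, err
}

// rotate replaces any previous backup with the current file and starts a new one.
func (w *cappedWriter) rotate() error {
	if err := w.file.Close(); err != nil {
		return err
	}
	if err := os.Rename(w.path, w.path+".1"); err != nil {
		return err
	}
	f, err := os.OpenFile(w.path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return err
	}
	w.file, w.size = f, 0
	return nil
}

func main() {
	// Example path and 1 MiB cap; both values are arbitrary.
	path := filepath.Join(os.TempDir(), "dblab-example.log")
	w, err := newCappedWriter(path, 1<<20)
	if err != nil {
		log.Fatal(err)
	}
	logger := log.New(w, "", log.LstdFlags)
	logger.Println("physical restore failed; pool not found")
}
```

With a cap like this, a panic that repeats every few seconds would overwrite old log data instead of consuming the whole disk, which is why a size limit helps where an age limit alone does not.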
dblab_server | 2023/07/30 19:56:10 util.go:37: [DEBUG] Response: {physical failed <nil> <nil> map[refresh_failed:{error Pool to perform data refresh not found 2023-07-30 16:58:28.881529049 +0000 UTC m=+1.913180664 1}] <nil>}
dblab_server | 2023/07/30 19:56:10 util.go:37: [DEBUG] Response: []
dblab_server | 2023/07/30 19:56:10 http: panic serving xxx.xxx.xxx.xxx:xxxxx: runtime error: invalid memory address or nil pointer dereference
dblab_server | goroutine 333 [running]:
dblab_server | net/http.(*conn).serve.func1()
dblab_server | /usr/local/go/src/net/http/server.go:1825 +0xbf
dblab_server | panic({0xe8a7e0, 0x17ba8f0})
dblab_server | /usr/local/go/src/runtime/panic.go:844 +0x258
dblab_server | gitlab.com/postgres-ai/database-lab/v3/internal/retrieval.(*Retrieval).reportContainerSyncStatus(0xc00020a0d0, {0x119d248, 0xc000094000}, {0xc00026c3a0?, 0x2?})
dblab_server | /builds/postgres-ai/database-lab/engine/internal/retrieval/retrieval.go:802 +0x205
dblab_server | gitlab.com/postgres-ai/database-lab/v3/internal/retrieval.(*Retrieval).ReportSyncStatus(0xc00020a0d0, {0x119d248, 0xc000094000})
dblab_server | /builds/postgres-ai/database-lab/engine/internal/retrieval/retrieval.go:768 +0x490
dblab_server | gitlab.com/postgres-ai/database-lab/v3/internal/srv.(*Server).instanceStatus(0xc000328000)
dblab_server | /builds/postgres-ai/database-lab/engine/internal/srv/server.go:135 +0x638
dblab_server | gitlab.com/postgres-ai/database-lab/v3/internal/srv.(*Server).getInstanceStatus(0xf26dc0?, {0x119c948, 0xc0002522a0}, 0xc0002b6340?)
dblab_server | /builds/postgres-ai/database-lab/engine/internal/srv/routes.go:33 +0x28
dblab_server | gitlab.com/postgres-ai/database-lab/v3/internal/srv/mw.(*Auth).Authorized.func1({0x119c948, 0xc0002522a0}, 0xc00026bc00)
dblab_server | /builds/postgres-ai/database-lab/engine/internal/srv/mw/auth.go:44 +0xae
dblab_server | net/http.HandlerFunc.ServeHTTP(0xc00026bb00?, {0x119c948?, 0xc0002522a0?}, 0x17be2e0?)
dblab_server | /usr/local/go/src/net/http/server.go:2084 +0x2f
dblab_server | github.com/gorilla/mux.(*Router).ServeHTTP(0xc0002e0300, {0x119c948, 0xc0002522a0}, 0xc00026ba00)
dblab_server | /go/pkg/mod/github.com/gorilla/mux@v1.8.0/mux.go:210 +0x1cf
dblab_server | gitlab.com/postgres-ai/database-lab/v3/internal/srv/mw.Logging.func1({0x119c948, 0xc0002522a0}, 0xc00026ba00)
dblab_server | /builds/postgres-ai/database-lab/engine/internal/srv/mw/logging.go:17 +0xed
dblab_server | net/http.HandlerFunc.ServeHTTP(0x0?, {0x119c948?, 0xc0002522a0?}, 0xc0000eb400?)
dblab_server | /usr/local/go/src/net/http/server.go:2084 +0x2f
dblab_server | net/http.serverHandler.ServeHTTP({0xc000244bd0?}, {0x119c948, 0xc0002522a0}, 0xc00026ba00)
dblab_server | /usr/local/go/src/net/http/server.go:2916 +0x43b
dblab_server | net/http.(*conn).serve(0xc0001daaa0, {0x119d2b8, 0xc0002d1d40})
dblab_server | /usr/local/go/src/net/http/server.go:1966 +0x5d7
dblab_server | created by net/http.(*Server).Serve
dblab_server | /usr/local/go/src/net/http/server.go:3071 +0x4db
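The trace above points at `reportContainerSyncStatus` (retrieval.go:802) but does not show which value was nil. The following is a hedged sketch of the kind of guard that would turn the panic into an ordinary error. It assumes the nil value is a pool pointer that is unset when no pool is available for the refresh. All types and names here (`retrievalSketch`, `poolSketch`, `syncStatusSketch`) are hypothetical stand-ins, not DLE's real code.

```go
package main

import (
	"errors"
	"fmt"
)

// poolSketch and syncStatusSketch are hypothetical stand-ins for DLE's types.
type poolSketch struct{ Name string }

type syncStatusSketch struct {
	Code    string
	Message string
}

// Same wording as the refresh_failed message in the log above.
var errNoPool = errors.New("pool to perform data refresh not found")

type retrievalSketch struct {
	pool *poolSketch // assumed to be nil when no pool is available
}

// reportSyncStatus checks for nil before dereferencing, so a missing pool
// produces an error status instead of a panic in the HTTP handler.
func (r *retrievalSketch) reportSyncStatus() (syncStatusSketch, error) {
	if r == nil || r.pool == nil {
		return syncStatusSketch{Code: "unavailable", Message: errNoPool.Error()}, errNoPool
	}
	return syncStatusSketch{Code: "ok", Message: "pool " + r.pool.Name}, nil
}

func main() {
	r := &retrievalSketch{} // no pool configured
	status, err := r.reportSyncStatus()
	fmt.Println(status.Code, err)
}
```

Returning an error here would let the instance status endpoint report the failure once per request without a stack trace, which would also reduce the log volume described above.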