DMS cannot connect to oracle
### Summary
DMS fails to initiate an ML on GPU job when triggered by the service provider dashboard (SPD) webapp.
The browser web console shows a 503 error: DMS cannot connect to the oracle.
### Steps to reproduce
1. Use a machine running DMS 0.4.69 and SPD 0.1.1.
2. Ensure the machine has DHT peers that have an onboarded GPU.
3. Open the service provider dashboard and run a GPU ML job.
4. Select the default cpu job (https://gitlab.com/nunet/ml-on-gpu/ml-on-gpu-service/-/raw/develop/examples/pytorch/cifar-10_checkpointed.py).
5. Specify a low job with a 120 minute runtime.
6. Connect the wallet and click Next.
7. Specify the max tokens amount as 10.
8. Check the reCAPTCHA.
9. Click Submit.

NOTE: The webapp does not ask to sign the transaction, and a 503 error appears in the web console.
### What is the current bug behavior?
When running the compute provider webapp to initiate a GPU ML job, the compute provider fails to execute the job.
The webapp just stays busy (shows its busy animation on the submit button) with no error message.
### What is the expected correct behavior?
The job should be triggered, and the webapp should confirm receipt of the job and report its status.
### Relevant logs and/or screenshots
Please see the screenshots of the job config and the dev console output from the browser.
The process was run twice; both logs are attached.
May 03 15:27:58 sams-asus nunet-dms[3446]: * [/ip4/220.78.153.167/tcp/65306] dial backoff --- peer: QmSFgAXu7vsdhCihZGv5hVbUmJtsSQPu9jJ1LCi9TywxBK {"package": "internal"}
May 03 15:28:54 sams-asus nunet-dms[3446]: 2023-05-03T15:28:54.033+0700 ERROR machines/handlers.go:196 websocket: close 1001 (going away) {"package": "internal"}
May 03 15:28:54 sams-asus nunet-dms[3446]: gitlab.com/nunet/device-management-service/libp2p/machines.listenForDeploymentStatus
May 03 15:28:54 sams-asus nunet-dms[3446]: /tmp/builds/uUvV2XC1/0/nunet/device-management-service/libp2p/machines/handlers.go:196
May 03 15:28:55 sams-asus nunet-dms[3446]: [GIN] 2023/05/03 - 15:28:55 | 200 | 18.018599ms | ::1 | GET "/api/v1/run/deploy"
May 03 15:29:17 sams-asus nunet-dms[3446]: * [/ip4/91.112.68.174/udp/9000/quic] dial backoff
May 03 15:29:17 sams-asus nunet-dms[3446]: * [/ip4/91.112.68.174/udp/1024/quic] dial backoff
May 03 15:29:17 sams-asus nunet-dms[3446]: * [/ip4/91.112.68.174/tcp/1024] dial backoff --- peer: QmVoV75iKrJtuU5LujbKQUdbF91HQEGnQZDqnjzt5gSyBp {"package": "internal"}
May 03 15:29:51 sams-asus nunet-dms[3446]: * [/ip4/65.109.11.100/udp/9000/quic/p2p/QmTuuV1DG2F94LS1qTg47oVRhU7ptLAfnfzsjvuE3xNhif/p2p-circuit] dial backoff
May 03 15:29:51 sams-asus nunet-dms[3446]: * [/ip4/65.109.11.100/tcp/9000/p2p/QmTuuV1DG2F94LS1qTg47oVRhU7ptLAfnfzsjvuE3xNhif/p2p-circuit] dial backoff --- peer: QmWrzsqxjqrxJ65zZ93hMFFhkjro9v2Fqpj1dovhRBUkwS {"package": "internal"}
May 03 15:30:10 sams-asus nunet-dms[3446]: 2023-05-03T15:30:10.517+0700 ERROR machines/handlers.go:196 websocket: close 1001 (going away) {"package": "internal"}
May 03 15:30:10 sams-asus nunet-dms[3446]: gitlab.com/nunet/device-management-service/libp2p/machines.listenForDeploymentStatus
May 03 15:30:10 sams-asus nunet-dms[3446]: /tmp/builds/uUvV2XC1/0/nunet/device-management-service/libp2p/machines/handlers.go:196
May 03 15:30:11 sams-asus nunet-dms[3446]: [GIN] 2023/05/03 - 15:30:11 | 200 | 202.2µs | ::1 | GET "/api/v1/run/deploy"
May 03 15:31:02 sams-asus nunet-dms[3446]: 2023-05-03T15:31:02.636+0700 INFO machines/handlers.go:65 estimated ntx price 3.2736 {"package": "internal"}
May 03 15:31:18 sams-asus nunet-dms[3446]: 2023-05-03T15:31:18.946+0700 ERROR libp2p/network.go:132 failed to read pong message: %!w(*errors.errorString=&{stream reset}) {"package": "internal"}
May 03 15:31:18 sams-asus nunet-dms[3446]: gitlab.com/nunet/device-management-service/libp2p.PingPeer
May 03 15:31:18 sams-asus nunet-dms[3446]: /tmp/builds/uUvV2XC1/0/nunet/device-management-service/libp2p/network.go:132
May 03 15:31:18 sams-asus nunet-dms[3446]: gitlab.com/nunet/device-management-service/libp2p/machines.HandleRequestService
May 03 15:31:18 sams-asus nunet-dms[3446]: /tmp/builds/uUvV2XC1/0/nunet/device-management-service/libp2p/machines/handlers.go:82
May 03 15:31:18 sams-asus nunet-dms[3446]: github.com/gin-gonic/gin.(*Context).Next
May 03 15:31:18 sams-asus nunet-dms[3446]: /root/go/pkg/mod/github.com/gin-gonic/gin@v1.8.2/context.go:173
May 03 15:31:18 sams-asus nunet-dms[3446]: go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin.Middleware.func1
May 03 15:31:18 sams-asus nunet-dms[3446]: /root/go/pkg/mod/go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin@v0.39.0/gintrace.go:90
May 03 15:31:18 sams-asus nunet-dms[3446]: github.com/gin-gonic/gin.(*Context).Next
May 03 15:31:18 sams-asus nunet-dms[3446]: /root/go/pkg/mod/github.com/gin-gonic/gin@v1.8.2/context.go:173
May 03 15:31:18 sams-asus nunet-dms[3446]: github.com/gin-gonic/gin.CustomRecoveryWithWriter.func1
May 03 15:31:18 sams-asus nunet-dms[3446]: /root/go/pkg/mod/github.com/gin-gonic/gin@v1.8.2/recovery.go:101
May 03 15:31:18 sams-asus nunet-dms[3446]: github.com/gin-gonic/gin.(*Context).Next
May 03 15:31:18 sams-asus nunet-dms[3446]: /root/go/pkg/mod/github.com/gin-gonic/gin@v1.8.2/context.go:173
May 03 15:31:18 sams-asus nunet-dms[3446]: github.com/gin-gonic/gin.LoggerWithConfig.func1
May 03 15:31:18 sams-asus nunet-dms[3446]: /root/go/pkg/mod/github.com/gin-gonic/gin@v1.8.2/logger.go:240
May 03 15:31:18 sams-asus nunet-dms[3446]: github.com/gin-gonic/gin.(*Context).Next
May 03 15:31:18 sams-asus nunet-dms[3446]: /root/go/pkg/mod/github.com/gin-gonic/gin@v1.8.2/context.go:173
May 03 15:31:18 sams-asus nunet-dms[3446]: github.com/gin-gonic/gin.(*Engine).handleHTTPRequest
May 03 15:31:18 sams-asus nunet-dms[3446]: /root/go/pkg/mod/github.com/gin-gonic/gin@v1.8.2/gin.go:616
May 03 15:31:18 sams-asus nunet-dms[3446]: github.com/gin-gonic/gin.(*Engine).ServeHTTP
May 03 15:31:18 sams-asus nunet-dms[3446]: /root/go/pkg/mod/github.com/gin-gonic/gin@v1.8.2/gin.go:572
May 03 15:31:18 sams-asus nunet-dms[3446]: net/http.serverHandler.ServeHTTP
May 03 15:31:18 sams-asus nunet-dms[3446]: /usr/local/go/src/net/http/server.go:2947
May 03 15:31:18 sams-asus nunet-dms[3446]: net/http.(*conn).serve
May 03 15:31:18 sams-asus nunet-dms[3446]: /usr/local/go/src/net/http/server.go:1991
May 03 15:31:20 sams-asus nunet-dms[3446]: 2023-05-03T15:31:20.950+0700 DEBUG machines/handlers.go:106 compute provider: %!(EXTRA models.PeerData={Qmehd5L8gQDRjZp4EnxLzJ1m9uDRGCUk7ac1Z24UQMsN9K true false [{unknown 0 0}] addr_test1qp2fs6lrtqrmsdvuyvqkktyk75n4k0v0g9fvptr4rswy26lpvvsd4j77vfl0m8qge95d0fc6ajc2wmfctf3rtg7y8t9splatvm {1 17000 0 15000 0 3 0 0} [] 1683102495}) {"package": "internal"}
May 03 15:31:20 sams-asus nunet-dms[3446]: 2023-05-03T15:31:20.950+0700 INFO machines/handlers.go:109 sending fund contract request to oracle {"package": "internal"}
May 03 15:31:20 sams-asus nunet-dms[3446]: 2023-05-03T15:31:20.955+0700 INFO oracle/oracle.go:69 sending funding request to oracle {"package": "oracle"}
May 03 15:31:21 sams-asus nunet-dms[3446]: 2023-05-03T15:31:21.955+0700 INFO oracle/oracle.go:72 funding request failed %vrpc error: code = DeadlineExceeded desc = context deadline exceeded {"package": "oracle"}
May 03 15:31:21 sams-asus nunet-dms[3446]: 2023-05-03T15:31:21.955+0700 INFO machines/handlers.go:112 sending fund contract request to oracle failed {"package": "internal"}
May 03 15:31:21 sams-asus nunet-dms[3446]: [GIN] 2023/05/03 - 15:31:21 | 503 | 19.347991966s | ::1 | POST "/api/v1/run/request-service"
May 03 15:31:57 sams-asus nunet-dms[3446]: * [/ip4/65.109.11.100/tcp/9000/p2p/QmTuuV1DG2F94LS1qTg47oVRhU7ptLAfnfzsjvuE3xNhif/p2p-circuit] error opening relay circuit: NO_RESERVATION (204)
May 03 15:31:57 sams-asus nunet-dms[3446]: * [/ip4/65.109.11.100/udp/9000/quic/p2p/QmTuuV1DG2F94LS1qTg47oVRhU7ptLAfnfzsjvuE3xNhif/p2p-circuit] concurrent active dial through the same relay failed with a protocol error --- peer: QmQyGQT1vQAAwGEXocp58gkqD6Tbp2FH4qDEPgbpA9YWL4 {"package": "internal"}
May 03 15:32:02 sams-asus nunet-dms[3446]: * [/ip4/65.109.11.100/udp/9000/quic-v1/p2p/QmTuuV1DG2F94LS1qTg47oVRhU7ptLAfnfzsjvuE3xNhif/p2p-circuit] dial backoff
May 03 15:32:02 sams-asus nunet-dms[3446]: * [/ip4/65.109.11.100/tcp/9000/p2p/QmTuuV1DG2F94LS1qTg47oVRhU7ptLAfnfzsjvuE3xNhif/p2p-circuit] dial backoff --- peer: QmRFJEtatFH8H3UeLD9d5ps9S3PsEuk14icCp8VkDtYN74 {"package": "internal"}
May 03 15:32:21 sams-asus nunet-dms[3446]: * [/ip4/200.155.164.246/tcp/23864] dial backoff
May 03 15:32:21 sams-asus nunet-dms[3446]: * [/ip4/200.170.240.174/tcp/18393] dial backoff
May 03 15:32:21 sams-asus nunet-dms[3446]: * [/ip4/200.170.240.174/tcp/23864] dial backoff
May 03 15:32:21 sams-asus nunet-dms[3446]: * [/ip4/200.170.240.174/tcp/61486] dial backoff
May 03 15:32:21 sams-asus nunet-dms[3446]: * [/ip4/200.155.164.246/tcp/61486] dial backoff
May 03 15:32:21 sams-asus nunet-dms[3446]: * [/ip4/200.155.164.246/tcp/18393] dial backoff
May 03 15:32:21 sams-asus nunet-dms[3446]: * [/ip4/65.109.11.100/udp/9000/quic-v1/p2p/QmTuuV1DG2F94LS1qTg47oVRhU7ptLAfnfzsjvuE3xNhif/p2p-circuit] dial backoff
May 03 15:32:21 sams-asus nunet-dms[3446]: * [/ip4/65.109.11.100/tcp/9000/p2p/QmTuuV1DG2F94LS1qTg47oVRhU7ptLAfnfzsjvuE3xNhif/p2p-circuit] dial backoff --- peer: QmRLmeBzhaHhP6DtwsYz1bwA9ikBeMjGmu66y9wA6yZ5Z1 {"package": "internal"}
May 03 15:32:26 sams-asus nunet-dms[3446]: * [/ip4/220.78.153.167/udp/51878/quic] dial backoff
May 03 15:32:26 sams-asus nunet-dms[3446]: * [/ip4/220.78.153.167/tcp/65306] dial backoff --- peer: QmSFgAXu7vsdhCihZGv5hVbUmJtsSQPu9jJ1LCi9TywxBK {"package": "internal"}
### Version number of components and OS (if applicable)
DMS 0.4.69
SPD 0.1.1
### Possible fixes
Unsure
