Running inference test takes too long

Hello,

I am just trying to run an inference test, but it takes a really long time to load the model and the dynamic libraries before any inference actually happens. Is this a known issue? Am I missing anything in my config?

I have attached the code I am running (note that I had to add a delay between loading the model and running inference, because loading the model took too long).
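To show where the time goes, the load and inference phases can be timed separately. This is only a minimal sketch: `timed`, `load_model_stub`, and `infer_stub` are hypothetical names used for illustration and are not part of poly-yolo or the attached script; the stubs would need to be replaced by the real calls.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


# Hypothetical placeholders: swap in the real model-loading and
# detection calls from inference.py.
def load_model_stub():
    time.sleep(0.05)  # stands in for model and library loading
    return object()


def infer_stub(model):
    time.sleep(0.01)  # stands in for a single inference call
    return []


timings = {}
with timed("load", timings):
    model = load_model_stub()
with timed("inference", timings):
    detections = infer_stub(model)

for label, seconds in timings.items():
    print(f"{label}: {seconds:.3f} s")
```

Timing the two phases on their own makes it clear whether the wait comes from start-up or from the inference call itself.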

HW: RTX 3070 Ti.

Output:

` Using TensorFlow backend.

2022-03-06 07:35:01.444229: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll

2022-03-06 07:35:02.405969: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll

2022-03-06 07:35:02.422482: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:

name: NVIDIA GeForce RTX 3070 Ti major: 8 minor: 6 memoryClockRate(GHz): 1.785 pciBusID: 0000:09:00.0

2022-03-06 07:35:02.422579: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll

2022-03-06 07:35:02.424077: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll

2022-03-06 07:35:02.425574: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll

2022-03-06 07:35:02.426198: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll

2022-03-06 07:35:02.428579: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll

2022-03-06 07:35:02.430030: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll

2022-03-06 07:35:02.434150: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll

2022-03-06 07:35:02.434221: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0

2022-03-06 07:35:02.434827: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2

2022-03-06 07:35:02.437029: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:

name: NVIDIA GeForce RTX 3070 Ti major: 8 minor: 6 memoryClockRate(GHz): 1.785 pciBusID: 0000:09:00.0

2022-03-06 07:35:02.437126: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll

2022-03-06 07:35:02.437188: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll

2022-03-06 07:35:02.437247: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll

2022-03-06 07:35:02.437313: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll

2022-03-06 07:35:02.437374: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll

2022-03-06 07:35:02.437434: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll

2022-03-06 07:35:02.437495: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll

2022-03-06 07:35:02.437568: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0

2022-03-06 07:35:51.772187: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:

2022-03-06 07:35:51.772255: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0

2022-03-06 07:35:51.772291: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N

2022-03-06 07:35:51.772409: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6668 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3070 Ti, pci bus id: 0000:09:00.0, compute capability: 8.6)

WARNING:tensorflow:From C:\src\poly-yolo\venv\lib\site-packages\tensorflow_core\python\ops\resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version. Instructions for updating: If using Keras pass *_constraint arguments to layers.

models/poly_yolo.h5 model, anchors, and classes loaded.

WARNING:tensorflow:From C:\src\poly-yolo\venv\lib\site-packages\tensorflow_core\python\ops\array_ops.py:1475: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.where in 2.0, which has the same broadcast rule as np.where `

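You can see that between the two bolded lines (the last `Adding visible gpu devices: 0` entry at 07:35:02 and the `Device interconnect StreamExecutor` entry at 07:35:51) it takes around 50 seconds to continue.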
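The size of the gap can be read directly off the log timestamps. A minimal sketch, using only the two timestamps copied from the output above (the helper name `gap_seconds` is made up for illustration):

```python
from datetime import datetime

# Timestamps copied from the log: the last "Adding visible gpu devices: 0"
# entry and the first "Device interconnect StreamExecutor" entry.
BEFORE = "2022-03-06 07:35:02.437568"
AFTER = "2022-03-06 07:35:51.772187"


def gap_seconds(start: str, end: str) -> float:
    """Return the elapsed seconds between two TensorFlow log timestamps."""
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


print(f"{gap_seconds(BEFORE, AFTER):.2f} s")  # prints "49.33 s"
```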

Thank you very much in advance!

Best regards,

Antonio N.

inference.py

Edited Mar 06, 2022 by Antonio Núñez