Intraprocess multimachine sync'd with PTP still gives >5s latency

Hi!

Thanks for this great project! We've been experimenting in our lab with PTP and ROS 2, and wanted to run some benchmarks for various packet sizes. I've just started playing around with this project and wanted to see whether the intraprocess example you give might also work between two machines if the time sync is accurate enough.

With PTP, I can confirm that both PCs' clocks are synchronized to within 1 microsecond.

However, when I run the test, it reports 5+ seconds of latency between the two machines.
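
For context, my working assumption is that a one-way latency measured across two machines is the receive time on the subscriber's clock minus the send time on the publisher's clock, so any offset between the two clocks that are actually read for timestamping lands directly in the reported number. Here is a minimal sketch of that arithmetic only. The function names and values are hypothetical and are not taken from performance_test:

```python
def measured_latency_ns(send_ns_pub_clock: int, recv_ns_sub_clock: int) -> int:
    """One-way latency as seen when each side stamps with its own clock."""
    return recv_ns_sub_clock - send_ns_pub_clock


def simulate(true_latency_ns: int, sub_clock_offset_ns: int,
             send_ns_pub_clock: int = 1_000_000_000) -> int:
    """Model the subscriber's clock as the publisher's clock plus a fixed offset.

    All values are illustrative; a real measurement would use the
    timestamps recorded by the benchmark itself.
    """
    # Arrival time expressed on the publisher's clock.
    recv_ns_pub_clock = send_ns_pub_clock + true_latency_ns
    # The same instant as read from the (offset) subscriber clock.
    recv_ns_sub_clock = recv_ns_pub_clock + sub_clock_offset_ns
    return measured_latency_ns(send_ns_pub_clock, recv_ns_sub_clock)


if __name__ == "__main__":
    # Perfectly aligned clocks: the measurement equals the true latency.
    print(simulate(true_latency_ns=200_000, sub_clock_offset_ns=0))
    # A constant offset shifts every sample by the same amount,
    # regardless of message size or publish rate.
    print(simulate(true_latency_ns=200_000, sub_clock_offset_ns=5_000_000_000))
```

If that model is right, a constant offset would shift every sample by the same amount, independent of message size and rate.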

Machine 1: ./install/performance_test/lib/performance_test/perf_test -c rclcpp-single-threaded-executor --msg Array128 -r 1 -s 1 -p 0 --max-runtime 120 -l log_Array128_1hz_120s_multimachine_sub.json

Machine 2: ./install/performance_test/lib/performance_test/perf_test -c rclcpp-single-threaded-executor --msg Array128 -p 1 -s 0 --max-runtime 120 -r 1 -l log_Array128_1hz_120s_multimachine_pub.json

Using the notebook to plot the latencies (minus the spurious beginning and end), I get: [plot: newplot_2_]

No matter what array size or rate I give it, the latency remains about the same. Could you shed some light on what might be happening?