Possible Geo node status performance regression

Noted on staging.gitlab.com / gstg.gitlab.com

Since the upgrade from 10.8.3 to 11.0.0rc2, the Geo node status check has started timing out. It runs for roughly 90 seconds (97 seconds in the session below) before the lfs_objects count query is cancelled by the statement timeout.

I don't yet know how long the lfs_objects count query itself takes, only that the legacy query is the one running when the timeout hits, even though Gitlab::Geo::Fdw.enabled? returns true:

irb(main):001:0> t1 = Time.now ;s =  begin ; GeoNodeStatus.current_node_status ; rescue => err ;err ;  end ; t2 = Time.now ; t2 - t1
=> 97.33607666
irb(main):002:0> s
=> #<ActiveRecord::StatementInvalid: PG::QueryCanceled: ERROR:  canceling statement due to statement timeout
: SELECT COUNT(*) FROM "lfs_objects" INNER JOIN
(VALUES (1),...(n))
registry(id)
ON lfs_objects.id = registry.id WHERE ("lfs_objects"."file_store" = 1 OR "lfs_objects"."file_store" IS NULL)>

irb(main):003:0> Gitlab::Geo::Fdw.enabled?
=> true
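
To find out how long an individual call takes, rather than the whole status run, the timing one-liner above could be wrapped in a small helper. This is only a sketch: `timed` is a hypothetical name, not something that exists in GitLab, and it relies on nothing beyond Ruby's standard `Benchmark` module. It returns the elapsed seconds together with either the block's result or the exception it raised.

```ruby
require 'benchmark'

# Hypothetical helper (not part of GitLab): runs the block, captures any
# StandardError instead of letting it propagate, and returns
# [elapsed_seconds, result_or_error].
def timed
  result = nil
  seconds = Benchmark.realtime do
    result = begin
      yield
    rescue StandardError => err
      err
    end
  end
  [seconds, result]
end

# Usage in the same Rails console as the session above:
#   seconds, status = timed { GeoNodeStatus.current_node_status }
```

Any narrower call can be passed in the block the same way, which should help separate the time spent in the lfs_objects count from the rest of the status collection.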

This completely prevents status updates from being received. It may not be a regression at all, merely organic growth in how long this one query takes on staging, but staging sees little activity, so I suspect a regression.
