Improve NTP detection using system defined one and fallback to the predefined otherwise in gitlab:geo:check
Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.
Background
GitLab Geo requires synchronized clocks across all nodes to function properly. The gitlab:geo:check
rake task includes a health check that verifies NTP connectivity by attempting to reach pool.ntp.org
.
Problem
The current implementation only checks the hardcoded pool.ntp.org
server (or a user-defined NTP_HOST
environment variable), which causes false warnings in:
- Firewalled environments that block external NTP traffic
- Air-gapped installations with no internet access
- Corporate environments using internal NTP servers
These environments may have properly synchronized clocks via internal NTP servers defined in system configuration files, but the health check fails because it doesn't detect or use these configured servers.
Proposal
Implement a cascading NTP server detection that checks multiple sources before reporting a warning:
-
User-defined: Check
NTP_HOST
environment variable (already implemented) - System-defined: Parse system NTP configuration files to find locally configured servers (new)
-
Fallback: Use
pool.ntp.org
(already implemented) - Improved warning: If all checks fail, display an enhanced message. Look at this example.
This approach will reduce false positives while maintaining the ability to detect genuine clock synchronization issues.
Implementation Guide
-
Modify
ee/lib/geo/system_check/clocks_synchronization_check.rb
to check for system-defined NTP implementations. -
Full flow should be:
-
User-defined: Check
NTP_HOST
environment variable if set - System-defined: Check system NTP configs (provided below)
-
Fallback: Use
pool.ntp.org
- If all fail: Show improved warning message with JWT explanation
-
User-defined: Check
-
The 4 major implementations are ntpd, Chrony, systemd-timesyncd, and OpenNTPD; they cover practically all Unix-based operating systems.
-
The configuration files that should be looked for are:
configs = [ # Chrony configs (most common on modern Linux) ['/etc/chrony.conf', :chrony], ['/etc/chrony/chrony.conf', :chrony], ['/usr/local/etc/chrony.conf', :chrony], # systemd-timesyncd configs ['/etc/systemd/timesyncd.conf', :timesyncd], ['/etc/systemd/timesyncd.conf.d/*.conf', :timesyncd], ['/run/systemd/timesyncd.conf.d/*.conf', :timesyncd], # ntpd configs ['/etc/ntp.conf', :ntpd], ['/etc/inet/ntp.conf', :ntpd], ['/private/etc/ntp.conf', :ntpd], ['/usr/local/etc/ntp.conf', :ntpd], # OpenNTPD configs ['/etc/openntpd/ntpd.conf', :openntpd], ['/etc/ntpd.conf', :openntpd] ]
-
Each NTP implementation has a different structure so the server address will be in different fields.
- For Chrony and ntpd you're looking for the fields
server
andpool
. - For timesyncd,
NTP
andFallbackNTP
in the[Time]
section. - For OpenNTPD, the key is
servers
- For Chrony and ntpd you're looking for the fields
-
Here's a starter code to help you extract the right fields:
def parse_ntp_config(file_path, parser_type) content = File.read(file_path) servers = [] case parser_type when :chrony # Extract from: server <hostname/ip> and pool <hostname/ip> content.each_line do |line| line = line.sub(/#.*$/, '').strip if line =~ /^(?:server|pool)\s+([^\s]+)/ servers << $1 end end when :timesyncd # Extract from: NTP= and FallbackNTP= (space-separated, in [Time] section) in_time_section = false content.each_line do |line| line = line.strip # Track if we're in [Time] section if line.start_with?('[') in_time_section = (line == '[Time]') next end next unless in_time_section # Extract space-separated servers if line =~ /^(?:NTP|FallbackNTP)\s*=\s*(.+)$/ servers.concat($1.split(/\s+/)) end end when :ntpd # Extract from: server <hostname/ip> and pool <hostname/ip> content.each_line do |line| line = line.sub(/#.*$/, '').strip if line =~ /^(?:server|pool)\s+([^\s]+)/ server = $1 # Skip hardware reference clocks next if server.start_with?('127.127.') servers << server end end when :openntpd # Extract from: server <hostname/ip> and servers <hostname/ip> content.each_line do |line| line = line.sub(/#.*$/, '').strip if line =~ /^servers?\s+([^\s]+)/ servers << $1 end end end # Filter out localhost and empty values servers.reject { |s| s.empty? || s == '127.0.0.1' || s == '::1' || s == 'localhost' }.uniq end
-
Parsing rules:
- Strip comments (everything after
#
) - Skip localhost (
127.0.0.1
,::1
) - Extract just the hostname/IP, ignore options (
iburst
,prefer
, etc.) - Stop at the first config file containing valid servers
- Strip comments (everything after