Switch to UrlValidator for Identifier URL validation
What does this MR do and why?
In !127352 (merged), we started using AddressableUrlValidator to validate Identifier URLs.
AddressableUrlValidator makes a getaddrinfo syscall when validating the URL. This is a problem for the Vulnerabilities::Identifier model because we create records in bulk, and for non-local URLs the syscall makes a DNS query over the network. As a result, a bulk insert might have to wait for hundreds of DNS lookups to finish!
To fix this, switch to using UrlValidator instead, which validates URLs using regex only. It is safe to validate these URLs with regex because we do not make server-side requests to them. They are displayed in the vulnerability details as clickable links.
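To illustrate the difference between the two approaches, here is a minimal sketch in plain Ruby. It is not the code of either GitLab validator: the pattern `ILLUSTRATIVE_URL_PATTERN` and the helpers `syntactically_valid_url?` and `resolvable_host?` are hypothetical names introduced only for this example. The first helper is a pure string check, while the second calls `Addrinfo.getaddrinfo` from the standard library, which can block on a DNS query for non-local hosts.

```ruby
require "socket"

# Hypothetical, deliberately simple pattern (NOT the one UrlValidator uses):
# an http(s) scheme, a non-empty host, and an optional path/query/fragment.
ILLUSTRATIVE_URL_PATTERN = %r{\Ahttps?://[^\s/?#]+(?:[/?#]\S*)?\z}i

# Regex-only check: no I/O, so the cost is independent of network latency.
def syntactically_valid_url?(url)
  url.match?(ILLUSTRATIVE_URL_PATTERN)
end

# Resolution-based check: goes through getaddrinfo, which may issue a DNS
# query over the network and block until it answers or times out.
def resolvable_host?(host)
  Addrinfo.getaddrinfo(host, nil).any?
rescue SocketError
  false
end

puts syntactically_valid_url?("https://security1.example.com")
puts syntactically_valid_url?("not a url")
```

Calling only `syntactically_valid_url?` in a loop of several hundred URLs stays fast, whereas calling `resolvable_host?` in the same loop would pay one lookup per distinct host.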
Relates to:
How to set up and validate locally
1. Start rails console:
   bundle exec rails c
2. Run this code:
   puts "start: #{Time.zone.now}"
   300.times { |i| Vulnerabilities::Identifier.new(url: "https://security#{i}.example.com").valid? }
   puts "end: #{Time.zone.now}"
Before: ~40 seconds (timing varies based on network latency)
start: 2023-11-02 19:07:41 UTC
end: 2023-11-02 19:08:20 UTC
After: < 1 second
start: 2023-11-02 19:10:10 UTC
end: 2023-11-02 19:10:10 UTC
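The timestamps above only have one-second resolution. As an optional alternative, the same measurement could be taken with `Benchmark.realtime` from the Ruby standard library. This is a sketch, not part of the MR: `time_validations` is a hypothetical helper, and the block passed to it stands in for whatever validation is being measured.

```ruby
require "benchmark"

# Hypothetical helper: yields `count` generated URLs to the caller's block
# and returns the elapsed wall-clock time in seconds as a Float.
def time_validations(count)
  Benchmark.realtime do
    count.times { |i| yield "https://security#{i}.example.com" }
  end
end

# In a Rails console, the block would wrap the model check from the steps above:
#   time_validations(300) { |url| Vulnerabilities::Identifier.new(url: url).valid? }
```

Running it once on the base branch and once on this branch gives two sub-second-precision numbers to compare.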
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.