Validate vulnerability_reads schema when vulnerabilities and vulnerability_occurrences schema are updated
Based on the discussion #335326 (comment 780709997) we need to implement a validator/checker to check the schema of vulnerability_reads
every time vulnerabilities
and vulnerability_occurrences
schema are updated.
Proposal: #335326 (comment 784116408)
/cc @minac
Why are we doing this work
To improve the performance of the "Vulnerability Report" page, we've decided to implement "Command Query Responsibility Segregation"(CQRS) by creating a "read table" with the denormalized columns of the "vulnerabilities" and "vulnerability_occurrences" tables which are subject to filtering and sorting.
Since we will have a new table, we need to find a way to keep the schema of this new table up-to-date with its source tables(vulnerabilities
and vulnerability_occurrences
).
Implementation Plan
-
documentation Create a new doc page to explain the CQRS approach -
backend Implement the "schema check" mechanism as follows to inform the feature developers whenever the schema of the vulnerabilities
andvulnerability_occurrences
tables change;-
Implement a module to generate a checksum value for the Active Record model schemas(1). -
Include this new module in the Vulnerability
andVulnerabilities::Finding
models and setschema_version
s(2). -
Write a new spec to check if the expected schema version matches with the actual schema on the database.
-
Technical Details/Pseudo Implementation
Here you can find the non-complete implementations for the above plan to have a sense of what needs to be done.
(1) The module to calculate and lock the schema version for a model
module LockedSchema
attr_accessor :locked_schema_version
def self.included(base)
Registry.register(base)
end
def validate_schema_version!
return true if locked_schema_version == current_schema_version
raise "Seems like you've added a new column or index to #{table_name}. \n" \
"Did you consider updating the `vulnerability_reads` table as well?\n" \
"If the changes you've introduced is not relevant with the `vulnerability_reads` table," \
"please update the `locked_schema_version` with `#{current_schema_version}`. \n" \
"If the changes are also important for the `vulnerability_reads` table, \n" \
"please update the schema and triggers and set the `locked_schema_version`."
end
def current_schema_version
(_column_names + _index_names).join('-')
.then { |s| Digest::SHA256.hexdigest(s) }
end
private
def _column_names
connection.columns(table_name).map(&:name)
end
def _index_names
connection.indexes(table_name).map(&:name)
end
class Registry
def self.models
@models ||= []
end
def self.register(model)
models << model
end
end
end
(2) Usage of this module in a model
class Vulnerability
include LockedSchema
self.locked_schema_version = "976edf9df7c67f945572d0758b42d8e07ac41dca02defb4a261d96f2e4507e20"
end
(3) An example spec
# in a new spec file
RSpec.describe "Locked model schemas" do
LockedSchema::Registry.models.each do |model|
it "checks the schema of the #{model.name}" do
expect(model.validate_schema_version!).to be_truthy
end
end
end
Note: The above one is just a pseudo implementation that is far away from being complete. Even the syntax hasn't been checked therefore please take it just as a narrative which gives the reader the sense of what needs to be done.