Discussion around Packages' Single Table Inheritance structure.
Background
Per the GitLab development docs, polymorphic relationships are preferred to single table inheritance (STI). Currently The main package model https://gitlab.com/gitlab-org/gitlab/blob/master/ee/app/models/packages/package.rb Packages::Package
uses a package_type
column to identify npm
vs maven
. Continuing down this path, this column may identify up to a dozen package managers over the next few years. As I've been implementing the Conan package manager, new behavior has presented itself in a way that warrants extracting the subclass. I found myself writing many methods in Packages::Package
that use return unless conan?
. This led to specs that included context "is conan package" and "is not conan package".
Goal of this issue
To determine existing intentions in the code design and decide any other paths to consider as we expand our range into multiple package managers. My main concern is the docs specifically warn against STI, yet it has been chosen for this area of the code. I'd like to understand if there was a specific reason behind this decision that differs from my own thoughts below.
My initial thoughts
Stick with STI and use self.inheritance_column = 'package_type'
in Packages::Package
, and create subclasses for specific package managers only when the package specific behavior identifies itself.
- Pros:
- This requires no database changes
- If we moved to polymorphic associations, it would be a 'messy' and risky refactor.
- Inheritance will make sense with both Package and PackageFile models
- Cons:
- Some confusion is created when there is also a class for the metadata. So there could be a
ConanMetadatum
class that has it's own table for storing conan specific data, but there is also aConan
package subclass, which does not refer to any table, but contains specific behavior to conan packages. They will have a one-to-one relationship.
- Some confusion is created when there is also a class for the metadata. So there could be a
I considered the idea of switching to a polymorphic structure, but the refactor is a bit messy, and then when we want to fetch groups of packages from all package managers in an aggregated way (all packages for a given group), this structure complicates that query and starts to necessitate a special service or abstract class to handle this aggregation logic.