In order to support integrations in Cells architecture we need to split existing integrations into group/project integrations (existing integrations table) and instance wide integrations. Instance wide integrations need a separate table, which will be done in #474808 (closed)
However, to support the new table we also need to define a new InstanceIntegration model, which is identical to Integration model, with the exception of group/project association.
Proposed solution
Create a new InstanceIntegration model which is identical to InstanceIntegration model.
Perhaps this and #474808 (closed) issues need to be combined in one in order to ensure nothing breaks.
Additionally, create corresponding instance models for existing integrations (e.g. Integrations::Telegram should have a corresponding Integrations::Instance::Telegram). This is needed because instance_integrations has type STI column.
@.luke may I ask for your suggestions on this issue
As you know, we now have instance_integrations table, which is a stripped down version of integrations (a number of columns is not included).
I picked this issue up and wondering what the best approach for adding new models is. I currently see two approaches:
New instance integration models inherit from non-instance ones.
Tricky, since non-instance ones inherit from Integration model which has project/group validations, it's own set of STI rules, and other underwater rocks I am yet to discover.
New instance integration models inherit from new InstanceIntegration model
This is a more straightforward approach, but has it's own issues. Specifically:
We'd have to duplicate logic to the new models. E.g. InstanceIntegration model should have everything except validations around project/group.
One model duplication might be ok, but it seems like all other itegration models have to have duplicated code as well. Since if we don't we need to inhering from non-instance models, which inherit from Integration. E.g. Integrations::Instance::Telegram < Integrations::Telegram is not compatible, since Integrations::Telegram inherits from Integrations.
Duplicating logic across 2 types of integrations will bring us maintenance challenges, where we have to add new behavior or fix bugs in two places.
WDYT? Do you have suggestions on how to approach it best? Perhaps there's another approach I am missing.
@georgekoltsov Could we have a shared module for all the integration behaviours common to project, group and instance integrations? And then project/group-specific behaviours can be in Integration, with instance-specific behaviours in Integrations::InstanceIntegration.
@.luke We still have a problem with all of the existing integrations inheriting from Integration, though.
Lets say we:
Create BaseIntegration class or module to mixin that includes all the base functionality of an integration
We inherit from BaseIntegration or include it in Integration/InstanceIntegration classes
We still have all the integrations inheriting from Integration, so we need to split them to have both Instance and non instance integrations
In which case we need to have a new model of each integration (e.g. Integrations::Telegram < Integration and Integrations::Instance::Telegram < InstanceIntegration)
We also have a bunch of base classes (like Integrations::BaseCi) that inherit from Integration. I am guessing such classes would have to inherit from BaseIntegration now as well. Not sure if we'd need separate classes for instance integrations, e.g. Integrations::Instance::BaseCi
In this case we still have a situation where if we were to update an integrations, we'd have to update it in both models (e.g. Integrations::Telegram & Integrations::Instance::Telegram), unless we define base models for each legit integration, e.g. Integrations::Base::Telegram.
@georgekoltsov Hmm. One method might be to try to make minimal changes, and have instance integrations inherit from their non-instance counterparts. Which now I see is what Kras suggested here too &14638 (comment 2034090042).
In reality it's less "out-of-the-box" as it would normally be because of the schema differences between integrations and instance_integrations, which might make a case for making their schemas more identical after all .
With the below diff (which involves some "hacks" like switching a validation off, and a property alias that I'm unsure if we actually need - which are required due to the schema differences between the two tables), we get something that appears like it could work.
We can avoid the "hacks" if we need to, with a more similar schema between the tables (have a type_new instead of type and have a redundant boolean instance which would always be true for these records).
In this approach, we'd continue to use instance_level? as a check to switch off non-instance behaviours. It's perhaps not ideal, but might be the path of least resistance?
It appears on a basic level at least to work:
Integrations::Instance::Discord.count# => SELECT COUNT(*) FROM "instance_integrations" WHERE "instance_integrations"."type" = 'Integrations::Instance::Discord'Integrations::Instance::Discord.new.type# => "Integrations::Instance::Discord"integration=Integrations::Instance::Discord.newintegration.valid?# => true
Appreciate your feedback @.luke I agree this approach looks much cleaner. I made changes as per your suggestions in !168953 (closed) and it looks promising.
@krasio@georgekoltsov instance-level integrations are not permitted on GitLab.com, and integrations has 0 integrations where instance=truehttps://console.postgres.ai/gitlab/gitlab-production-main/sessions/32371/commands/99975. This will be the case in the future too. Cells architecture is only for GitLab.com, is that correct? The work to split the integrations into integrations and instance_integrations is quite a lot of work, and quite exact and risky. Refactoring the model code itself will be tricky. Do we have any ability to say that instance integrations will never be part of Cells architecture, and so skip needing to split the table up? Is the requirement driving the split just the sharding key on the integrations table? Can we mitigate that somehow? It seems like we're undertaking a huge amount of work to essentially make integrations table compatible with data it will never have on GitLab.com.
I am not opposed given there are zero rows where instance is TRUE
@tkuah What is the problem if there are rows with instance IS TRUE and now value for the sharding key? I feel like I knew this but can't recall right now.
@tkuah@krasio@georgekoltsov Ah, okay. If we need to support the possibility of Cells for self-managed, then we would expect self-managed to have instance integration data in integrations alongside data specific to groups and projects, which I think means we need to continue with this work. Thanks for fact-checking this
@wortschi finding a suitable solution took longer than expected to begin with, since I need to do more db changes to the new table, so the initial weight should probably be upped to 3. That said, in !168953 (closed) we need to fix one outstanding issue and go through review process. So in terms of weight I would say finding the fix for the issue is weight 1 (although I am currently struggling to find a suitable solution) and then implement any reviewer feedback is also 1. So this issue definitely requires a bit of effort in %17.6
@wortschi JFYI, we believe !168953 (closed) is not the way to go because of this issue and I opened another approach draft MR to get some feedback from Luke !169768 (closed) it looks promising, but more changes (likely multiple MRs since we have ~40+ integrations) will be required to move the code to be shared between instance/non-instance integrations.
@georgekoltsov Thank you for the update. Given the new insights, do you think all the issues in &14638 need to be updated, i.e., do we expect all of them to be needed and is there additional work required?
Also, do we know if the new approach will also be applicable to webhooks tables?
Thank you for the update. Given the new insights, do you think all the issues in &14638 need to be updated, i.e., do we expect all of them to be needed and is there additional work required?
I think the same tasks need to be done on a highlevel, just the implementation is different to reduce the amount of potential future problems. The additional work required as it currently stands is making adjustments to the existing table and splitting models in a new way (if we don't find new road blocks along the way).
Also, do we know if the new approach will also be applicable to webhooks tables?
I believe the new approach is going to work for webhooks as well