Skip to content
Snippets Groups Projects

Remove updated_at column from audit_events table

Merged Tan Le requested to merge 217941-drop-updated-at-on-audit-events into master
1 unresolved thread

What does this MR do?

Remove updated_at column from audit_events table.

Followed development guideline on dropping column.

  • Step 1: Ignoring the column (release 13.1) (!34156 (merged))
  • Step 2: Dropping the column (release 13.2) (this MR)

Relates to #217941 (closed)

Database log

== 20200724100421 RemoveUpdatedAtFromAuditEvents: reverting ===================
-- add_column("audit_events", :updated_at, :datetime)
   -> 0.0012s
-- add_column("audit_events_part_5fc467ac26", :updated_at, :datetime)
   -> 0.0016s
-- execute("CREATE OR REPLACE FUNCTION table_sync_function_2be879775d()\nRETURNS TRIGGER AS\n$$\nBEGIN\nIF (TG_OP = 'DELETE') THEN\n  DELETE FROM audit_events_part_5fc467ac26 where id = OLD.id;\nELSIF (TG_OP = 'UPDATE') THEN\n  UPDATE audit_events_part_5fc467ac26\n  SET author_id = NEW.author_id,\n    type = NEW.type,\n    entity_id = NEW.entity_id,\n    entity_type = NEW.entity_type,\n    details = NEW.details,\n    updated_at = NEW.updated_at,\n    ip_address = NEW.ip_address,\n    author_name = NEW.author_name,\n    entity_path = NEW.entity_path,\n    target_details = NEW.target_details,\n    created_at = NEW.created_at\n  WHERE audit_events_part_5fc467ac26.id = NEW.id;\nELSIF (TG_OP = 'INSERT') THEN\n  INSERT INTO audit_events_part_5fc467ac26 (id,\n    author_id,\n    type,\n    entity_id,\n    entity_type,\n    details,\n    updated_at,\n    ip_address,\n    author_name,\n    entity_path,\n    target_details,\n    created_at)\n  VALUES (NEW.id,\n    NEW.author_id,\n    NEW.type,\n    NEW.entity_id,\n    NEW.entity_type,\n    NEW.details,\n    NEW.updated_at,\n    NEW.ip_address,\n    NEW.author_name,\n    NEW.entity_path,\n    NEW.target_details,\n    NEW.created_at);\nEND IF;\nRETURN NULL;\n\nEND\n$$ LANGUAGE PLPGSQL\n")
   -> 0.0022s
== 20200724100421 RemoveUpdatedAtFromAuditEvents: reverted (0.0120s) ==========

== 20200724100421 RemoveUpdatedAtFromAuditEvents: migrating ===================
-- remove_column("audit_events", :updated_at)
   -> 0.0011s
-- execute("CREATE OR REPLACE FUNCTION table_sync_function_2be879775d()\nRETURNS TRIGGER AS\n$$\nBEGIN\nIF (TG_OP = 'DELETE') THEN\n  DELETE FROM audit_events_part_5fc467ac26 where id = OLD.id;\nELSIF (TG_OP = 'UPDATE') THEN\n  UPDATE audit_events_part_5fc467ac26\n  SET author_id = NEW.author_id,\n    type = NEW.type,\n    entity_id = NEW.entity_id,\n    entity_type = NEW.entity_type,\n    details = NEW.details,\n    ip_address = NEW.ip_address,\n    author_name = NEW.author_name,\n    entity_path = NEW.entity_path,\n    target_details = NEW.target_details,\n    created_at = NEW.created_at\n  WHERE audit_events_part_5fc467ac26.id = NEW.id;\nELSIF (TG_OP = 'INSERT') THEN\n  INSERT INTO audit_events_part_5fc467ac26 (id,\n    author_id,\n    type,\n    entity_id,\n    entity_type,\n    details,\n    ip_address,\n    author_name,\n    entity_path,\n    target_details,\n    created_at)\n  VALUES (NEW.id,\n    NEW.author_id,\n    NEW.type,\n    NEW.entity_id,\n    NEW.entity_type,\n    NEW.details,\n    NEW.ip_address,\n    NEW.author_name,\n    NEW.entity_path,\n    NEW.target_details,\n    NEW.created_at);\nEND IF;\nRETURN NULL;\n\nEND\n$$ LANGUAGE PLPGSQL\n")
   -> 0.0027s
-- remove_column("audit_events_part_5fc467ac26", :updated_at)
   -> 0.0030s
== 20200724100421 RemoveUpdatedAtFromAuditEvents: migrated (0.0101s) ==========

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team
Edited by Tan Le

Merge request reports

Merged results pipeline #172946995 passed

Merged results pipeline passed for c932b4a9

Test coverage 56.62% from 2 jobs
Approved by

Merged by Sean McGivernSean McGivern 4 years ago (Jul 31, 2020 12:48pm UTC)

Merge details

  • Changes merged into master with 5006578a.
  • Deleted the source branch.
  • Auto-merge enabled

Pipeline #172966553 passed

Pipeline passed for 5006578a on master

Test coverage 56.62% from 2 jobs
5 environments impacted.

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Tan Le added 458 commits

    added 458 commits

    Compare with previous version

  • Tan Le added 10 commits

    added 10 commits

    Compare with previous version

  • Tan Le resolved all threads

    resolved all threads

  • Tan Le mentioned in epic &2765

    mentioned in epic &2765

  • Tan Le changed the description

    changed the description

  • added typemaintenance label and removed backstage [DEPRECATED] label

  • Tan Le removed backend label

    removed backend label

  • Tan Le added 1400 commits

    added 1400 commits

    Compare with previous version

  • Andreas Brandl approved this merge request

    approved this merge request

  • Andreas Brandl unapproved this merge request

    unapproved this merge request

  • Tan Le added 2504 commits

    added 2504 commits

    Compare with previous version

  • Tan Le added 1 commit

    added 1 commit

    • 3bc77e08 - Remove updated_at column from audit_events table

    Compare with previous version

  • Tan Le added 203 commits

    added 203 commits

    Compare with previous version

  • Tan Le marked the checklist item Changelog entry as completed

    marked the checklist item Changelog entry as completed

  • Tan Le added 1 commit

    added 1 commit

    • a91c7f99 - Remove updated_at column from audit_events table

    Compare with previous version

    • Author Contributor

      :wave_tone2: @iroussos there might be still failures on the pipeline but I'd like to seek early feedback from you in regards to the new approach with audit_events partition setup. Could you please kindly have a look? :pray_tone2:

    • Thank you @tancnle, I have left two small comments.

      As @abrandl wrote, that's the first time we are trying to do this, so I am trying to be as careful as possible and cover all possible cases. Check my comments and tell me if you disagree!

      I also hope that we will not have any issues with acquiring the required ACCESS EXCLUSIVE lock on audit_events on a timely fashion, but let's make sure that we'll notify the release manager once this is ready to be deployed. I think that having @mayra-cabrera double checking this as the database maintainer is a nice idea (she is already picked by Danger either way).

      With respect to the failing tests: I think that the three audit_event.update fail cause ActiveRecord has the old schema information loaded before the columns are dropped. There is a fourth audit_event.update in the same spec that I think that it does not fail due to audit_event.reload being called.

    • Author Contributor

      Thanks for your due diligence and care @iroussos. Really appreciated :pray_tone2: I have updated the migration order and also added some specs on AuditEventPartitioned.

      These tests will make sure that we have the same set of columns on both tables as well as insert trigger behaves as expected. We do not typically do delete and update for audit events so I have skipped those. Please let me know what you think.

      Edited by Tan Le
    • Thank you @tancnle the migrations and the spec look good to me!

      You can re-assign the MR back to me when the final rebase is done and I'll move it for database maintainer review after a final check.

    • Author Contributor

      @iroussos The pipeline is :white_check_mark: Could you have a look? :pray_tone2:

      Edited by Tan Le
    • Thank you @tancnle, everything looks good to me!

      @mayra-cabrera do you mind doing the database maintainer review?

    • Thanks @tancnle! I've left a couple of questions from database.

    • Author Contributor

      Thanks for reviewing @mayra-cabrera. I have responded to your questions. :ping_pong:

      Could you help reviewing the specs from backend perspective as well? :pray_tone2:

    • Thanks @tancnle! LGTM from database side.

      Could you help reviewing the specs from backend perspective as well?

      It may be helpful to have another maintainer eyes before merging, I'm assigning the reviewer roulette suggestion. @smcgivern Could you please take a look?

    • Sure! My understanding is that this is tricky because the existing trigger includes updated_at, which is why this MR removes the column from the source table, then updates the trigger to not use updated_at, then removes the column from the partitioned table.

      I checked the current value of table_sync_function_2be879775d in production and that all looks good. I do have a couple of questions for @iroussos and @tancnle:

      1. Could we update the trigger function first in the up step?
      2. If we could, could we split this into two separate migrations? Updating the trigger should be safe but dropping the columns is the scary part, right?

      I'm not saying we should do those things, I just want to understand this, so it might be that no code changes are needed.

    • Could we update the trigger function first in the up step?

      @smcgivern yes in this case we are (1) dropping a column that (2) has no constraints on it (e.g. a NOT NULL)

      So we could safely drop the trigger first, even in a separate migration (i.e. not the same transaction) --> it will just stop copying data that we will either way drop afterwards. And there is no constraint to complaint about the missing data.

      But this is not the case for other operations, like for example when adding a new column (check !36198 (merged) for that case): in that case we can not update the trigger out of the transaction that makes the schema change as, between transactions a new insert may come and the trigger will try to access a column that does not exist already.

      The operation in this MR is the simplest possible one, while the MR does not introduce any other application updates.

      Our plan is to run it first and use it to test that our process is correct. That's why we are doing everything in the same transaction, like we would do for most other operations, so that we can test that our approach has no flaws.

      Updating the trigger should be safe but dropping the columns is the scary part, right?

      Yes, the part that makes the schema updates could cause issues, mainly during the EXCLUSIVE LOCK acquisition phase: updates keep coming in the main table while it is also accessed by our background jobs that try to sync the main table with the partitions.

      I don't think that we will we have significant issues with audit_events being mildly popular and the sidekiq jobs having a delay between them, but it is better to be safe than sorry.

      Edited by Yannis Roussos
    • @iroussos thanks for the explanation! Just so I understand:

      1. In this specific case it would be OK to do what I suggested - update the trigger in one migration (or even one deploy), and drop the columns in another.
      2. For columns with contraints, or new columns, this isn't possible. For a new column obviously we can't update the trigger until the column is there, but they also can't be in separate transactions because then we'd lose some data before the trigger was updated.
      3. We're doing it this way in this MR because that covers the most general case, as a test run for some more complicated situations in future.

      Is that correct?

    • @smcgivern What you wrote is 100% correct (or at least on par with what my understanding also is :-) ).

      For a new column obviously we can't update the trigger until the column is there, but they also can't be in separate transactions because then we'd lose some data before the trigger was updated.

      Exactly, I forgot to also mention that last scenario! The second part explains why we don't also want the trigger to be updated after the schema update is done and the reason we want to have everything on the same transaction.

    • Thanks @iroussos! In that case I'm OK with this MR and will set MWPS.

      cc @mayra-cabrera @yorickpeterse as release managers.

    • This was successfully deployed to prod :rocket:

      cc @tancnle, @iroussos

    • Author Contributor

      Thanks a lot for rolling this out to prod @mayra-cabrera. :bow_tone1: :dancers:

    • I confirm that the update was successfully deployed to production with no issues encountered:

      1. The column was dropped from audit_events, the partitioned table and all the partitions
      2. The trigger was updated successfully
      3. All new audit events records for today (August 1st - 100K records at the time of writing) were properly synced to the partition for August (i.e. the trigger works without issues after the update)
      4. The background jobs for syncing the two tables keep on getting executed every 2 minutes and the partitions are filling up with data at the proper rate. (we are at 92% completion rate, up from 80.9% yesterday) (i.e. we did not brake the sidekiq jobs)
    • Please register or sign in to reply
  • assigned to @iroussos

  • Yannis Roussos
  • Yannis Roussos mentioned in merge request !36198 (merged)

    mentioned in merge request !36198 (merged)

  • Tan Le changed the description

    changed the description

  • Tan Le changed the description

    changed the description

  • Tan Le added 1 commit

    added 1 commit

    • ec4d966a - Remove updated_at column from audit_events table

    Compare with previous version

  • Tan Le added 1 commit

    added 1 commit

    • e6716085 - Remove updated_at column from audit_events table

    Compare with previous version

  • Tan Le added 1 commit

    added 1 commit

    • bb62df64 - Remove updated_at column from audit_events table

    Compare with previous version

  • assigned to @iroussos

  • Tan Le mentioned in merge request !38064 (merged)

    mentioned in merge request !38064 (merged)

  • Author Contributor

    @iroussos Heads up. I have move the factory extraction to FOSS to another MR (!38064 (merged)). Once it is merged, I will rebase with this branch and hopefully we can get a green pipeline. I will un-assign you for now.

  • unassigned @iroussos

  • mentioned in issue #231371 (closed)

  • Tan Le added 332 commits

    added 332 commits

    Compare with previous version

  • Tan Le added 1 commit

    added 1 commit

    • 93cfd52d - Remove updated_at column from audit_events table

    Compare with previous version

  • Tan Le added 13 commits

    added 13 commits

    Compare with previous version

  • Tan Le
  • assigned to @iroussos

  • Yannis Roussos approved this merge request

    approved this merge request

  • Mayra Cabrera
  • mentioned in issue #233046 (closed)

  • Mayra Cabrera approved this merge request

    approved this merge request

  • added databaseapproved label and removed databasereviewed label

  • assigned to @smcgivern and unassigned @mayra-cabrera

  • Tan Le added workflowin review label and removed workflowin dev label

    added workflowin review label and removed workflowin dev label

  • Sean McGivern resolved all threads

    resolved all threads

  • Sean McGivern approved this merge request

    approved this merge request

  • Sean McGivern assigned to @smcgivern and unassigned @tancnle

    assigned to @smcgivern and unassigned @tancnle

  • Sean McGivern enabled an automatic merge when the pipeline for c932b4a9 succeeds

    enabled an automatic merge when the pipeline for c932b4a9 succeeds

  • Max Woolf resolved all threads

    resolved all threads

  • merged

  • Sean McGivern mentioned in commit 5006578a

    mentioned in commit 5006578a

  • added workflowstaging label and removed workflowin review label

  • added workflowcanary label and removed workflowstaging label

  • added workflowproduction label and removed workflowcanary label

  • mentioned in issue #231314 (closed)

  • João Pereira mentioned in merge request !43703 (merged)

    mentioned in merge request !43703 (merged)

  • Yannis Roussos mentioned in issue #263135

    mentioned in issue #263135

  • Please register or sign in to reply
    Loading