Set up cronjob to clean up expired placeholder references

Background

As part of the data retention policy for user contribution mapping tables (&17248), group owners have a 1-year window to perform user reassignments. If no action is taken before this period expires, contributions remain assigned to placeholder users.

Description

We need to implement an automated cleanup process that removes placeholder references and placeholder memberships once they have passed their expiration date.

Technical Details

  • Create a new worker: CleanupExpiredPlaceholderReferencesWorker
  • Schedule it to run every 6 hours using the cron expression 0 */6 * * * (minute 0 of hours 0, 6, 12, and 18)
  • Worker should:
    • Remove expired records from import_source_user_placeholder_references table
    • Remove expired records from import_placeholder_memberships table
    • Implement loop-based batching with 5,000 records per batch
    • Delete records where expires_at <= NOW()
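
The loop-based batching described above can be sketched as follows. This is a minimal, language-agnostic illustration in Python against SQLite, not the actual worker: the table names, batch size, and expires_at condition come from this issue, while the id primary key, the integer Unix-timestamp representation of expires_at, and the function names delete_expired and cleanup are assumptions made for the example.

```python
import sqlite3

BATCH_SIZE = 5_000  # batch size proposed in this issue

# Table names are taken from the issue. The `id` primary key and an
# integer (Unix timestamp) `expires_at` column are assumptions.
TABLES = (
    "import_source_user_placeholder_references",
    "import_placeholder_memberships",
)


def delete_expired(conn, table, now, batch_size=BATCH_SIZE):
    """Delete rows with expires_at <= now in batches; return rows deleted."""
    if table not in TABLES:
        # Table names are interpolated into SQL, so only allow known ones.
        raise ValueError(f"unexpected table: {table}")
    total = 0
    while True:
        cur = conn.execute(
            f"DELETE FROM {table} WHERE id IN ("
            f"SELECT id FROM {table} WHERE expires_at <= ? LIMIT ?)",
            (now, batch_size),
        )
        conn.commit()  # commit per batch to keep each transaction short
        total += cur.rowcount
        if cur.rowcount < batch_size:
            # A short (or empty) batch means no expired rows remain.
            return total


def cleanup(conn, now, batch_size=BATCH_SIZE):
    """Run the batched delete on both tables; return deleted counts per table."""
    return {table: delete_expired(conn, table, now, batch_size) for table in TABLES}
```

A scheduled job would call cleanup(conn, int(time.time())) on each run. Deleting through a bounded subquery keeps every statement limited to one batch, so a large backlog of expired rows is worked off in many short transactions instead of one long one.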

Acceptance Criteria

  • Worker is implemented and properly scheduled
  • Expired records are successfully removed from both tables
  • Batching is implemented to handle large datasets efficiently