Add caching for Elastic::DataMigrationService.migration_has_finished?(name)
The following discussion from !46672 (merged) should be addressed:
-
@DylanGriffith started a discussion: (+3 comments) I think we may wish to introduce a concept where a migration has a name and there is some static method that says:
migration_has_finished?(:create_ssue_access_level_field)
This will be useful in our next migration because we'll have to decide whether or not to include the access level when sending the issue to Elasticsearch.
Something like:
data[attr.to_s] = safely_read_attribute_for_elasticsearch(attr) end + data['access_level'] = safely_read_attribute_for_elasticsearch(project.issues_access_level) if Elastic::MigrationRecord.migration_has_run?(:create_ssue_access_level_field) + data['assignee_id'] = safely_read_attribute_for_elasticsearch(:assignee_ids) data.merge(generic_attributes)
My thinking is that we'll need to check this every time we send some data to Elasticsearch and as such we'll need caching and as such we'll need something that busts the cache.
Probably we would load all
.persisted_versions
(like you have below) and cache those in Redis. Whenever the migration worker finishes a migration it can bust this cache.
Dylan Griffith Dylan Griffith @DylanGriffith · 1 day ago Maintainer Possibly (like we found last time we worked on Redis caching) it may be more efficient to store each migration as a single boolean that can be checked in the redis cache rather than marshalling the Array back and forth to Redis on every request as there will be thousands per second possibly.
Then again I'm struggling to think exactly how to do this in Redis. I think we'll run into the same problem we had last time in which we need to distinguish the different states of when the cache is not primed and when the item is missing from the cache so we probably can't use Redis sets for that and may need to use Redis hashes and cache the explicit true/false values so we can distinguish the case where the cache doesn't exist yet.
Dylan Griffith Dylan Griffith @DylanGriffith · 1 day ago Maintainer We can potentially save this concept for a later MR I guess. We just need to make sure this current approach will allow us to do this.
Dmitry Gruzd
Dmitry Gruzd @dgruzd · 14 hours ago
Developer
@DylanGriffith I created a new method Elastic::DataMigrationService.migration_has_finished?(name)
I believe that it makes sense to add the caching later when we actually need it. WDYT?