Read clustered recs from Vitess
Goal
We can't exclude enough seen posts with the current ElasticSearch solution. #2363 (closed) will enable us to do a JOIN
query so that we can exclude seen results much more easily, and with much less bandwidth usage.
What needs to be done
Read clustered results from MySQL.
CREATE TABLE clustered_entities_recs (
cluster_id int,
entity_guid bigint,
score float,
...
PRIMARY KEY (cluster_id, entity_guid)
)
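As a rough sketch of the JOIN-based exclusion described in the Goal, the example below uses an in-memory SQLite database as a stand-in for MySQL/Vitess so it can run anywhere. Only the three columns spelled out in the schema above are used. The `seen_entities` table, its columns, and the `unseen_recs` helper are assumptions made for illustration; the real seen-posts table comes from #2363 and may look different.

```python
import sqlite3

# SQLite in memory stands in for MySQL/Vitess so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clustered_entities_recs (
    cluster_id  INTEGER,
    entity_guid INTEGER,
    score       REAL,
    PRIMARY KEY (cluster_id, entity_guid)
);
-- Hypothetical table of posts a user has already seen (not defined in this issue).
CREATE TABLE seen_entities (
    user_guid   INTEGER,
    entity_guid INTEGER,
    PRIMARY KEY (user_guid, entity_guid)
);
""")
conn.executemany(
    "INSERT INTO clustered_entities_recs VALUES (?, ?, ?)",
    [(1, 100, 0.9), (1, 101, 0.8), (1, 102, 0.7), (2, 200, 0.5)],
)
conn.executemany("INSERT INTO seen_entities VALUES (?, ?)", [(42, 101)])


def unseen_recs(conn, cluster_id, user_guid, limit=12):
    """Return (entity_guid, score) rows for a cluster, skipping seen posts."""
    sql = """
        SELECT r.entity_guid, r.score
        FROM clustered_entities_recs AS r
        LEFT JOIN seen_entities AS s
               ON s.entity_guid = r.entity_guid
              AND s.user_guid = ?
        WHERE r.cluster_id = ?
          AND s.entity_guid IS NULL   -- keep only rows with no "seen" match
        ORDER BY r.score DESC
        LIMIT ?
    """
    return conn.execute(sql, (user_guid, cluster_id, limit)).fetchall()


recs = unseen_recs(conn, cluster_id=1, user_guid=42)
```

The anti-join (LEFT JOIN plus `IS NULL`) filters seen posts inside the database, so the application never has to ship a list of seen GUIDs with the request.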
ES is currently:
{
'entity_guid': record[0],
'entity_owner_guid': rprops[0],
'cluster_id': record[1],
'score': record[2],
'total_views': rstats[2],
'total_engagement': rstats[3],
'@first_engaged': rstats[0].strftime("%Y-%m-%dT%H:%M:%SZ"),
'@last_engaged': rstats[1].strftime("%Y-%m-%dT%H:%M:%SZ"),
'@last_updated': update_str,
'@time_created': rprops[1].strftime("%Y-%m-%dT%H:%M:%SZ")
}
QA
Ensure results are being returned.
UX/Design
N/A
Personas
N/A
Experiments
No, but do make a feature flag
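One possible shape for the feature flag is sketched below, assuming flags are available as a simple name-to-boolean mapping. The flag name `clustered-recs-vitess` and the two reader callables are hypothetical placeholders, not existing code.

```python
# Hypothetical flag name; the real one is not specified in this issue.
VITESS_FLAG = "clustered-recs-vitess"


def fetch_clustered_recs(cluster_id, flags, read_from_vitess, read_from_es):
    """Route the read to Vitess when the flag is on, otherwise to ElasticSearch."""
    if flags.get(VITESS_FLAG, False):
        return read_from_vitess(cluster_id)
    return read_from_es(cluster_id)
```

Defaulting to the ElasticSearch path when the flag is missing keeps the current behavior until the Vitess read is explicitly enabled.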
Acceptance Criteria
- Schema for MySQL
- Querying results from Vitess and removing ElasticSearch
- Spec tests
Definition of Ready Checklist
- Definition Of Done (DoD)
- Acceptance criteria
- Weighted
- QA
- UX/Design
- Personas
- Experiments