Issue created Jul 15, 2022 by Mark Harding@markhardingOwner

Read clustered recs from Vitess

Goal

We can't exclude enough seen posts with the current ElasticSearch solution. #2363 (closed) will enable us to do a JOIN query to exclude seen results much more easily, and with much less bandwidth usage.

What needs to be done

Read clustered results from MySQL.

CREATE TABLE clustered_entities_recs (
  cluster_id int,
  entity_guid bigint,
  score float,
  ...
  PRIMARY KEY (cluster_id, entity_guid)
)
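
As a rough illustration of the JOIN mentioned in the Goal, here is a minimal sketch of an anti-join that drops already-seen entities. It makes several assumptions that are not part of this issue: `sqlite3` stands in for Vitess/MySQL so the snippet runs anywhere, the `seen_entities` table and its columns are hypothetical, the columns elided by `...` in the schema are simply left out, and the sample rows are made up.

```python
import sqlite3

# Assumption: sqlite3 is only a stand-in for Vitess/MySQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clustered_entities_recs (
        cluster_id int,
        entity_guid bigint,
        score float,
        PRIMARY KEY (cluster_id, entity_guid)
    );
    -- Hypothetical table (not defined in this issue): entities a user has seen.
    CREATE TABLE seen_entities (
        user_guid bigint,
        entity_guid bigint,
        PRIMARY KEY (user_guid, entity_guid)
    );
""")

# Made-up sample data, for illustration only.
conn.executemany(
    "INSERT INTO clustered_entities_recs VALUES (?, ?, ?)",
    [(1, 100, 0.9), (1, 101, 0.8), (1, 102, 0.7), (2, 200, 0.95)],
)
conn.executemany(
    "INSERT INTO seen_entities VALUES (?, ?)",
    [(42, 101), (42, 200)],
)


def unseen_recs(conn, cluster_id, user_guid, limit=12):
    """Return (entity_guid, score) rows for a cluster, excluding seen entities."""
    return conn.execute(
        """
        SELECT r.entity_guid, r.score
        FROM clustered_entities_recs AS r
        LEFT JOIN seen_entities AS s
               ON s.entity_guid = r.entity_guid
              AND s.user_guid = ?
        WHERE r.cluster_id = ?
          AND s.entity_guid IS NULL  -- keep only rows with no "seen" match
        ORDER BY r.score DESC
        LIMIT ?
        """,
        (user_guid, cluster_id, limit),
    ).fetchall()


print(unseen_recs(conn, 1, 42))  # [(100, 0.9), (102, 0.7)]
```

Because the exclusion happens inside the database, the list of seen GUIDs never has to be sent with the query, which is where the bandwidth saving would come from.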

The ElasticSearch document is currently:

{
  'entity_guid': record[0],
  'entity_owner_guid': rprops[0],
  'cluster_id': record[1],
  'score': record[2],
  'total_views': rstats[2],
  'total_engagement': rstats[3],
  '@first_engaged': rstats[0].strftime("%Y-%m-%dT%H:%M:%SZ"),
  '@last_engaged': rstats[1].strftime("%Y-%m-%dT%H:%M:%SZ"),
  '@last_updated': update_str,
  '@time_created': rprops[1].strftime("%Y-%m-%dT%H:%M:%SZ")
}
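
If the same fields were carried over to MySQL, the `@`-prefixed timestamp strings would probably need converting into real datetime values. The sketch below shows one way that could look. The function name `es_doc_to_row`, the choice to strip the `@` prefix for column names, and the sample values are all assumptions for illustration. It also assumes `update_str` uses the same timestamp format as the other fields.

```python
from datetime import datetime, timezone

# Timestamp format used by the strftime calls in the ES document above.
ES_TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def es_doc_to_row(doc):
    """Convert an ES-style document into a dict of column values.

    Hypothetical helper: '@'-prefixed keys are treated as UTC timestamps
    and renamed without the prefix; all other keys pass through unchanged.
    """
    row = {}
    for key, value in doc.items():
        if key.startswith("@"):
            parsed = datetime.strptime(value, ES_TIME_FORMAT)
            row[key[1:]] = parsed.replace(tzinfo=timezone.utc)
        else:
            row[key] = value
    return row


# Made-up sample document, shaped like the one in this issue.
sample = {
    "entity_guid": 100,
    "entity_owner_guid": 7,
    "cluster_id": 1,
    "score": 0.9,
    "total_views": 250,
    "total_engagement": 12,
    "@first_engaged": "2022-07-15T12:00:00Z",
    "@last_engaged": "2022-07-16T08:30:00Z",
    "@last_updated": "2022-07-16T09:00:00Z",
    "@time_created": "2022-07-14T23:59:59Z",
}

row = es_doc_to_row(sample)
print(sorted(row))
```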

QA

Ensure results are being returned.

UX/Design

N/A

Personas

N/A

Experiments

No, but do put this behind a feature flag.

Acceptance Criteria

  • Schema for MySQL
  • Querying results from Vitess and removing ElasticSearch
  • Spec tests

Definition of Ready Checklist

  • Definition Of Done (DoD)
  • Acceptance criteria
  • Weighted
  • QA
  • UX/Design
  • Personas
  • Experiments


Edited Jul 27, 2022 by Fausto Arcidiacono