feat(item-identifiers): removed the concept of structured identifiers, replaced with URI

also made performance improvements around buffer processing

BREAKING CHANGE: CR-1068

Replace structured identifiers with URIs.

Identifier class

  • change the structure from a tokenized 'prefix' and 'suffix' to a single URI.
  • identifiers are parsed from a string
  • added a 'persistent' flag for identifier types. This only applies to DOIs so far; follow-up work is needed for others, e.g. ROR, ORCID.
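
A minimal sketch of what a URI-backed identifier with a 'persistent' flag could look like. The class name, the field names, and the rule that only the doi.org host counts as persistent are illustrative assumptions, not the code in this merge request.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

# Hypothetical set of hosts treated as persistent. Per the notes above,
# only DOIs qualify so far; ROR and ORCID would need follow-up work.
PERSISTENT_HOSTS = {"doi.org"}


@dataclass(frozen=True)
class Identifier:
    uri: str          # the whole identifier as a single URI
    persistent: bool  # True for persistent identifier types

    @classmethod
    def parse(cls, value: str) -> "Identifier":
        """Build an Identifier from a plain string, rejecting non-absolute URIs."""
        text = value.strip()
        parts = urlsplit(text)
        if not parts.scheme or not parts.netloc:
            raise ValueError(f"not an absolute URI: {value!r}")
        return cls(uri=text, persistent=parts.hostname in PERSISTENT_HOSTS)


doi = Identifier.parse("https://doi.org/10.5555/12345678")
orcid = Identifier.parse("https://orcid.org/0000-0002-1825-0097")
```

Here `doi.persistent` is true and `orcid.persistent` is false, which mirrors the "only DOI so far" note.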

Item class

  • added 'blank' field to an identifier directly
  • changed assertion buffer processors to use this directly, removed a JOIN

Identifier table

  • refactored identifiers buffer and identifiers table
  • store all parts of the URI as fields
  • faked 'null' with a special character
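
One way to picture storing URI parts as fields with a faked 'null' is sketched below. The table name, column names and the placeholder character are all hypothetical. The sketch also assumes the motivation is uniqueness: in SQL a real NULL compares as distinct from another NULL inside a unique constraint, so a placeholder keeps rows with missing parts deduplicated.

```python
import sqlite3
from urllib.parse import urlsplit

# Hypothetical placeholder standing in for NULL; the character actually
# used by the change is not stated in these notes.
NULL_PLACEHOLDER = "\u2400"


def uri_fields(uri):
    """Split a URI into five parts, replacing empty parts with the placeholder."""
    parts = urlsplit(uri)
    raw = (parts.scheme, parts.netloc, parts.path, parts.query, parts.fragment)
    return tuple(part if part else NULL_PLACEHOLDER for part in raw)


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE identifier (
        pk INTEGER PRIMARY KEY,
        scheme TEXT NOT NULL,
        host TEXT NOT NULL,
        path TEXT NOT NULL,
        query TEXT NOT NULL,
        fragment TEXT NOT NULL,
        UNIQUE (scheme, host, path, query, fragment)
    )
    """
)


def upsert_identifier(uri):
    """Insert the URI if it is new and return its primary key either way."""
    fields = uri_fields(uri)
    conn.execute(
        "INSERT OR IGNORE INTO identifier (scheme, host, path, query, fragment) "
        "VALUES (?, ?, ?, ?, ?)",
        fields,
    )
    row = conn.execute(
        "SELECT pk FROM identifier WHERE scheme = ? AND host = ? AND path = ? "
        "AND query = ? AND fragment = ?",
        fields,
    ).fetchone()
    return row[0]
```

Because no column is ever NULL, the lookup can use plain equality on every part and the unique constraint applies to URIs that have no query or fragment.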

Tokenized / Suffix to URI

  • change the structure from tokenized 'prefix' and 'suffix' to a single URI.
  • all code that looked for a 'prefix' now just uses the domain
  • remove the identifier type registry - code, tables, JSON Schema file

URI parsing / normalization

  • special treatment for DOIs
  • DOIs are stored as lower case
  • altered all tests that expect DOIs to be returned upper case
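
The DOI special case can be sketched as follows. DOI names are case-insensitive, so folding them to lower case gives one canonical form. The function name, the set of recognised resolver hosts, and the choice to rewrite to an `https://doi.org/` form are assumptions made for illustration only.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical list of hosts recognised as DOI resolvers.
DOI_HOSTS = {"doi.org", "dx.doi.org"}


def normalize_uri(uri: str) -> str:
    """Lower-case DOIs and rewrite them to one resolver form; leave other URIs alone."""
    text = uri.strip()
    parts = urlsplit(text)
    if parts.hostname in DOI_HOSTS:
        # The DOI itself lives in the path, so lower-casing the path
        # lower-cases the DOI.
        return urlunsplit(("https", "doi.org", parts.path.lower(), "", ""))
    return text


print(normalize_uri("https://doi.org/10.5555/ABC.Def"))
print(normalize_uri("https://example.org/Some/Path"))
```

Tests that previously compared against upper-case DOIs would compare against the lower-case form returned by a function like this.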

Buffer Batch

  • the previous buffer batch handling was causing race conditions and occasional transaction deadlocks in tests
  • changed buffer batch to use a UUID not a primary key
  • removed the buffer batch count. Instead, run a query on the UUID to see if the batch is complete.
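
The idea of tagging buffer rows with a UUID and querying for completion, instead of maintaining a shared batch row and counter, might look like the sketch below. The table and column names, and the use of a `processed` flag, are hypothetical; SQLite is used only to keep the example self-contained.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE assertion_buffer (
        pk INTEGER PRIMARY KEY,
        batch_uuid TEXT NOT NULL,
        payload TEXT NOT NULL,
        processed INTEGER NOT NULL DEFAULT 0
    )
    """
)


def enqueue_batch(payloads):
    """Tag every row with a fresh UUID; no batch row or counter is written."""
    batch_uuid = str(uuid.uuid4())
    conn.executemany(
        "INSERT INTO assertion_buffer (batch_uuid, payload) VALUES (?, ?)",
        [(batch_uuid, payload) for payload in payloads],
    )
    return batch_uuid


def process_one(batch_uuid):
    """Mark the oldest unprocessed row of the batch as done; False if none is left."""
    row = conn.execute(
        "SELECT pk FROM assertion_buffer WHERE batch_uuid = ? AND processed = 0 "
        "ORDER BY pk LIMIT 1",
        (batch_uuid,),
    ).fetchone()
    if row is None:
        return False
    conn.execute("UPDATE assertion_buffer SET processed = 1 WHERE pk = ?", (row[0],))
    return True


def batch_complete(batch_uuid):
    """A batch is complete when no unprocessed rows carry its UUID."""
    (remaining,) = conn.execute(
        "SELECT COUNT(*) FROM assertion_buffer WHERE batch_uuid = ? AND processed = 0",
        (batch_uuid,),
    ).fetchone()
    return remaining == 0
```

Since the UUID is generated by the caller, no insert has to wait on a database-assigned batch key, and there is no shared counter row for concurrent workers to contend on.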

Renderer

  • stop caching the item pks for authorities and authority roots in the RenderQueueProcessor
  • remove the concept of 'stale'. There is now just a queue of things to re-render.
  • SQS was making tests unreliable.
  • It was also causing synchronisation problems between the 'stale' table, the queue, and the items table.
  • It was also causing table locks.
  • moved rendering from SQS to a SQL-based queue table
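
A SQL-based render queue can be as small as one table plus a claim step. The sketch below is an assumption about the general shape, with a hypothetical table name and functions, and uses SQLite so it runs standalone; it is not the RenderQueueProcessor itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE render_queue (
        pk INTEGER PRIMARY KEY AUTOINCREMENT,
        item_pk INTEGER NOT NULL
    )
    """
)


def enqueue(item_pk):
    """Queue an item for re-rendering; there is no separate 'stale' marker."""
    with conn:
        conn.execute("INSERT INTO render_queue (item_pk) VALUES (?)", (item_pk,))


def claim_batch(limit):
    """Take up to `limit` queued items, oldest first, and remove them from the queue."""
    with conn:  # commit the deletes together, or roll back on error
        rows = conn.execute(
            "SELECT pk, item_pk FROM render_queue ORDER BY pk LIMIT ?",
            (limit,),
        ).fetchall()
        conn.executemany(
            "DELETE FROM render_queue WHERE pk = ?",
            [(pk,) for pk, _ in rows],
        )
    return [item_pk for _, item_pk in rows]
```

With several workers on a server database, the select would typically need row locking (for example `SELECT ... FOR UPDATE SKIP LOCKED` where the database supports it) so that two workers do not claim the same rows. Keeping the queue in the same database as the items means enqueueing can share a transaction with the change that triggers the re-render.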
Edited by Joe Wass
