Fix title records in the DB that were duplicated as a result of the logic in content registration
Book- and conference-level title records were being duplicated when there were simultaneous deposits; this was fixed in #984 (closed).
Research task #1326 (closed) uncovered 10,579 ISBNs with multiple citation records (CiteIDs) that need to be fixed. The attached file contains these ISBNs and their CiteIDs.
Based on the original issue, the best approach is to build a CS tool (or a Kotlin or Python tool) that uses the provided list to find the oldest CiteID for each ISBN and removes the newer entries from the title_map_id table.
List (pisbn/eisbn, alias/forced = null): issns_with_duplicate_citation_ids.json.zip
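A minimal Python sketch of the "keep the oldest CiteID, remove the rest" step described above. It is an illustration under stated assumptions, not the actual tool: `plan_removals` and `age_key` are hypothetical names, the input is assumed to be a mapping from ISBN to a list of CiteIDs (the real layout of the attached JSON may differ), and a lower numeric CiteID is assumed to mean an older record unless a different `age_key` is supplied. It only produces a removal plan and does not touch the database.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def plan_removals(
    isbn_to_cite_ids: Dict[str, Iterable[int]],
    age_key: Callable[[int], int] = lambda cite_id: cite_id,
) -> Dict[str, Tuple[int, List[int]]]:
    """For each ISBN, choose the CiteID to keep and list the ones to remove.

    ASSUMPTION: the CiteID with the smallest age_key value is the oldest.
    By default the CiteID itself is the key (lower CiteID = older record).
    """
    plan: Dict[str, Tuple[int, List[int]]] = {}
    for isbn, cite_ids in isbn_to_cite_ids.items():
        # De-duplicate, then order from oldest to newest.
        ordered = sorted(set(cite_ids), key=age_key)
        if len(ordered) < 2:
            continue  # only one CiteID: nothing to remove for this ISBN
        plan[isbn] = (ordered[0], ordered[1:])
    return plan


if __name__ == "__main__":
    # Made-up sample data, not taken from the attached list.
    sample = {
        "9780000000001": [2001, 1500, 2001, 3100],
        "9780000000002": [42],
    }
    for isbn, (keep, remove) in plan_removals(sample).items():
        print(f"{isbn}: keep {keep}, remove {remove}")
```

If "oldest" has to come from a creation timestamp rather than the CiteID value, a lookup such as `age_key=created_at.get` (with `created_at` a hypothetical dict of CiteID to timestamp) could be passed instead.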
How urgent
Moderately urgent. IEEE will have additional examples, as will other members.
Definition of ready
- Product owner: @SaraBowman
- Tech lead: @myalter
- Service:: or C:: label applied
- Definition of done updated
- Acceptance testing plan: edsbak2 test run
- Weight applied
Definition of done
- Unit tests identified, implemented, and passing
- Code reviewed
- Available for acceptance testing via a staging URL, or otherwise
- Consider any impacts to current or future architecture/infrastructure, and update specifications and documentation as needed
- Knowledge base reviewed and updated
- Acceptance criteria met
- Write a tool that goes through a list of ISBNs, finds the oldest CiteID for each, and removes the newer entries from the title_map_id table (leaving only the oldest CiteID)
- All duplicate citations for title ISBNs of IDTYPE 3 and 4 that are not aliased (including forced) have been removed from the title_map_id table
- Acceptance testing passed
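For the acceptance run, a check along these lines could confirm that no duplicates remain. This is a hedged sketch: `remaining_duplicates` is a hypothetical helper, the `(isbn, idtype, cite_id)` row shape is an assumption about how title_map_id rows would be exported, the rows are assumed to be already filtered to non-aliased entries, and grouping by ISBN alone (across IDTYPE 3 and 4) is also an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

TITLE_ID_TYPES = {3, 4}  # IDTYPE values named in the acceptance criteria


def remaining_duplicates(
    rows: Iterable[Tuple[str, int, int]],
) -> Dict[str, Set[int]]:
    """Return the ISBNs that still map to more than one CiteID.

    ASSUMPTION: each row is (isbn, idtype, cite_id) and aliased rows
    have already been excluded by the caller.
    """
    cite_ids_by_isbn: Dict[str, Set[int]] = defaultdict(set)
    for isbn, idtype, cite_id in rows:
        if idtype in TITLE_ID_TYPES:
            cite_ids_by_isbn[isbn].add(cite_id)
    # Keep only ISBNs with two or more distinct CiteIDs.
    return {isbn: ids for isbn, ids in cite_ids_by_isbn.items() if len(ids) > 1}
```

An empty result would indicate that every ISBN in the checked rows is left with a single CiteID.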
Related to #1326 (closed).