[ENH,FIX] JSONL export and categorylib SQL rework

Benoit Grégoire requested to merge benoitg/tiki:ai_indexing into master
  • Add the objects::export console command, which exports objects (currently tracker items and wiki pages) to JSONL, a format commonly used in machine learning systems.
  • Fix search tag stripping in WikiText.php so that it no longer merges or truncates words.
  • Search_Type_Factory_Direct: Revert a8faa104. It was a performance optimisation, but it made the reference implementation of the type factory behave very differently for wikitext. There is some potential for regression if code relies on receiving the original tags.
  • Rework the SQL queries in categlib.php to allow further generalisation and abstraction. This fixes some duplicated-value bugs, but has regression potential if code using the raw rows relied on some of the row aggregate columns.
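
For reviewers unfamiliar with the two ideas in the export and tag-stripping items, here is a rough Python sketch. It is not the PHP code in this merge request: the function names, the regular expression, and the record fields (`type`, `id`, `text`) are all hypothetical, and the real output schema of the export command may differ. It only illustrates why removing tags without a separator can merge adjacent words, and what "one JSON object per line" means for JSONL.

```python
import io
import json
import re

# Hypothetical, simplified tag pattern; the real WikiText.php logic is not shown here.
TAG_RE = re.compile(r"<[^>]+>")


def strip_tags_naive(text):
    """Delete tags outright. Words separated only by tags get merged."""
    return TAG_RE.sub("", text)


def strip_tags_spaced(text):
    """Replace each tag with a space, then collapse runs of whitespace."""
    return re.sub(r"\s+", " ", TAG_RE.sub(" ", text)).strip()


def export_jsonl(objects, stream):
    """Write one JSON object per line (JSONL). Field names are assumptions."""
    for obj in objects:
        stream.write(json.dumps(obj, ensure_ascii=False) + "\n")


# Illustrative records only; not the actual schema produced by the console command.
records = [
    {"type": "wiki page", "id": "HomePage",
     "text": strip_tags_spaced("<p>Hello</p><p>world</p>")},
    {"type": "trackeritem", "id": 42,
     "text": strip_tags_spaced("Café<br>menu")},
]

buffer = io.StringIO()
export_jsonl(records, buffer)
print(buffer.getvalue(), end="")
```

With the naive variant, `<p>Hello</p><p>world</p>` becomes `Helloworld`, which is the kind of word merging the fix is meant to avoid; the spaced variant yields `Hello world`.
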
Edited by Benoit Grégoire
