Support training on KCIDB data

Support consuming KCIDB data in the format outlined during work on #20 (closed), together with downloaded files packaged in ZIP archives (similar to how this is done currently), to train Kwai models. This needs an in-memory representation similar to the one currently used for triaging (the "pool"). If that becomes too cumbersome, we might use an in-memory SQLite database instead, since gathering and massaging data for triaging results is already hairy.
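As a rough illustration of the in-memory SQLite fallback, here is a minimal sketch. It assumes the ZIP archive contains JSON documents with top-level `checkouts`, `builds` and `tests` lists whose objects each carry an `id`; the actual layout is whatever was agreed in #20. The names `load_pool`, `make_example_zip` and the `objects` table are hypothetical and do not exist in the current code.

```python
import io
import json
import sqlite3
import zipfile

# Assumed object lists in each JSON document; the real format is the one from #20.
KINDS = ("checkouts", "builds", "tests")


def load_pool(zip_bytes):
    """Load JSON documents from a ZIP archive into an in-memory SQLite "pool"."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE objects ("
        " kind TEXT NOT NULL,"
        " id TEXT NOT NULL,"
        " data TEXT NOT NULL,"
        " PRIMARY KEY (kind, id))"
    )
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.endswith(".json"):
                continue  # skip logs and other downloaded files
            document = json.loads(archive.read(name))
            for kind in KINDS:
                for obj in document.get(kind, []):
                    # Store the whole object as JSON text, keyed by kind and id.
                    conn.execute(
                        "INSERT OR REPLACE INTO objects VALUES (?, ?, ?)",
                        (kind, obj["id"], json.dumps(obj)),
                    )
    conn.commit()
    return conn


def make_example_zip():
    """Build a tiny archive with made-up data, only for demonstration."""
    document = {
        "checkouts": [{"id": "example:checkout-1"}],
        "builds": [{"id": "example:build-1", "checkout_id": "example:checkout-1"}],
        "tests": [{"id": "example:test-1", "build_id": "example:build-1", "status": "FAIL"}],
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("kcidb.json", json.dumps(document))
        archive.writestr("console.log", "not a JSON document")
    return buffer.getvalue()


if __name__ == "__main__":
    pool = load_pool(make_example_zip())
    for kind, obj_id in pool.execute("SELECT kind, id FROM objects ORDER BY kind"):
        print(kind, obj_id)
```

Keeping each object as a JSON blob keyed by `(kind, id)` avoids committing to a column schema before the training features are settled; proper columns and indexes could be added once the required queries are known.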

Jira: CKI-6406

Jira: CKI-7460
