Make IDs for problems, datasets, and pipelines (and pipeline runs) deterministically generated from content

Maybe not for all of them, but it seems it would be much better if IDs were derived from the document's content instead of being picked manually. That would make basic deduplication easy and would also ensure that IDs are not reused or conflicting.
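
A rough sketch of what a content-based ID could look like, assuming documents are JSON-like dicts. The function name `content_id`, the excluded `id` field, the choice of SHA-256, and the example fields are all assumptions for illustration, not an existing API or schema:

```python
import hashlib
import json


def content_id(document: dict, exclude: tuple = ("id",)) -> str:
    """Return a hex SHA-256 digest of the document's canonical JSON form.

    Hypothetical helper: the excluded field names are an assumption.
    """
    # Drop top-level fields that must not influence the ID (here, the ID itself).
    payload = {k: v for k, v in document.items() if k not in exclude}
    # Sorted keys and fixed separators make the serialization
    # independent of key order and whitespace.
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same content, different key order and a manually picked ID: same digest.
a = {"name": "example pipeline", "steps": ["impute", "classify"]}
b = {"steps": ["impute", "classify"], "name": "example pipeline", "id": "manually-picked"}
# Different content: different digest.
c = {"name": "example pipeline", "steps": ["classify"]}

print(content_id(a) == content_id(b))
print(content_id(a) == content_id(c))
```

With something like this, two documents with identical content get the same ID (so duplicates are easy to spot), and any change in content produces a different ID. Which fields count as "content" (for example, whether timestamps in pipeline runs are included) would still need to be decided per document type.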