Consolidate data inventories and catalogues into single workbook
We should consolidate all of our disparate data catalogues, inventories, and trackers into a single Excel workbook. I've created a template of what should be included:
And I'm working to consolidate the following sheets:
- warehouse_athena_map.xlsx
- data_catalog_wiki.xlsx
- data_catalog_warehouse.xlsx
- update_inventory.xlsx
The final workbook should:
- Live at Data/Data_Dept_Catalog.xlsx in this repo
- Be linked to from the Home and _sidebar wiki pages, and from a readme note in the data architecture repo
- Be tracked using Git LFS
- Have the orange columns in the worksheet updated programmatically via daily API calls to AWS. GitLab CI + boto3 can accomplish this
- Be machine-readable in long format, with no merged cells!
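
To keep the "no merged cells" rule enforceable, a CI job could scan the workbook before it is merged. The sketch below is one possible check, not an agreed implementation: it relies only on the fact that an .xlsx file is a zip archive whose worksheet parts record merged ranges in `mergeCell` elements. The function name `find_merged_ranges` is made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML namespace used by worksheet parts inside an .xlsx archive.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def find_merged_ranges(xlsx_path):
    """Return {worksheet part name: [merged ranges]} for parts with merged cells."""
    merged = {}
    with zipfile.ZipFile(xlsx_path) as archive:
        for name in archive.namelist():
            # Worksheet parts conventionally live under xl/worksheets/.
            if not (name.startswith("xl/worksheets/") and name.endswith(".xml")):
                continue
            root = ET.fromstring(archive.read(name))
            refs = [
                cell.get("ref")
                for cell in root.findall("m:mergeCells/m:mergeCell", NS)
            ]
            if refs:
                merged[name] = refs
    return merged
```

Calling `find_merged_ranges("Data/Data_Dept_Catalog.xlsx")` returns an empty dict when the workbook is clean, so a pipeline step could fail on any non-empty result. Note that the keys are archive part names (for example `xl/worksheets/sheet1.xml`), not the tab names shown in Excel.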
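
For the programmatic update of the orange columns, a rough sketch of the boto3 side might look like the following. This issue does not say which AWS service feeds those columns, so the use of the Glue Data Catalog here is purely an assumption (suggested only by the Athena mapping sheet), and the function name, output keys, and database name are placeholders.

```python
def list_glue_tables(glue_client, database_name):
    """Return one dict per table in a Glue database, in long (one-row-per-table) form.

    Assumption: the orange columns can be filled from Glue table metadata.
    """
    rows = []
    # get_tables is paginated, so walk every page rather than a single call.
    paginator = glue_client.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database_name):
        for table in page["TableList"]:
            rows.append(
                {
                    "database": database_name,
                    "table": table["Name"],
                    "location": table.get("StorageDescriptor", {}).get("Location", ""),
                    "updated": str(table.get("UpdateTime", "")),
                }
            )
    return rows


# Hypothetical usage in a scheduled CI job (needs boto3 and AWS credentials):
#   import boto3
#   rows = list_glue_tables(boto3.client("glue"), "example_database")
```

Passing the client in as an argument keeps the function testable without AWS access. Writing the resulting rows back into the workbook is left out, since that depends on the final template.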