content/handbook/enterprise-data/platform/snowflake/clustering/index.md +30 −0

@@ -78,6 +78,36 @@ This function returns valuable information about the clustering state of your ta

Ideally, `average_overlaps` would be below 1 and `average_depth` would be ~ 1. A high number indicates the table is not well clustered.

### Monitoring Clustering Costs

To monitor the cost and activity of automatic clustering, query the `automatic_clustering_history` view in `snowflake.account_usage`. Here's an example query:

```sql
SELECT
    start_time,
    end_time,
    table_name,
    schema_name,
    database_name,
    credits_used,
    num_bytes_reclustered,
    num_rows_reclustered,
    DATEDIFF('minute', start_time, end_time) AS duration_minutes
FROM snowflake.account_usage.automatic_clustering_history
WHERE table_name = 'FCT_BEHAVIOR_STRUCTURED_EVENT' -- Replace with your table name
  AND schema_name = 'COMMON'                       -- Replace with your schema
  AND database_name = 'PROD'                       -- Replace with your database
  AND start_time >= '2025-12-17'                   -- When clustering was enabled
ORDER BY start_time DESC;
```

Key Metrics:

- `credits_used`: Snowflake credits consumed by automatic clustering operations
- `num_bytes_reclustered`: Amount of data reorganized (in bytes)
- `num_rows_reclustered`: Number of rows reorganized
- `duration_minutes`: How long the clustering operation took

## Best Practices

1. Choose clustering keys wisely based on your query patterns