Graphql Int type is too small for namespace statistics
Problem
In ruby, we don't deal with Int
vs Long
, the language deals with it automatically.
In GraphQL, the GraphQL::INT_TYPE
does not behave the same; it is defined as a signed 32bit integer.
Our namespace statistics can easily be bigger than 2GB and
#32438 (closed) shows that we have values over that threshold on namespaces like gitlab-org
and gitlab-com
.
Both Project
and Namespace
statistics data type should be able to express values higher than 2GB, and this poses a problem because the provided GraphQL::Types::BigInteger
is encoded as a string, breaking the API.
As described in #32292 (closed), the golden standard with GraphQL is not to version (or need to version) your API. Here I would like to discuss this topic with a specific example.
Solutions
Option 1 - breaking the API
The easiest thing to do is breaking the API and switching to GraphQL::Types::BigInteger
, this requires us to update the frontend as well, but we can do it in the same MR.
This breaks any external integration based on this API as the numbers are now presented as strings.
Option 2 - 53bit integers
Implement a new scalar type based on Javascript Number.MAX_SAFE_INTEGER
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER
This type gives us up to 2^{53}-1 = 9007199254740991 \approx 9 Petabytes
Technically it's still an API change, but the encoded answer does not change.
Option 3 - 2 steps deprecation
Use GraphQL::Types::BigInteger
and deprecate
This change requires much work distributed across several releases,
I'll detail the steps here for Namespace.root_storage_statistics
but it's required for Project.statistics
as well.
It can be summarized as:
- Rename
Types::RootStorageStatisticsType
toTypes::Deprecated::RootStorageStatisticsType
- Create
Types::RootStorageStatisticsType
with fields made of BigInteger - Add a new field in
Namespace
with the new type (i.e.Namespace.root_storage_statistics_big_int
) - Fix the frontend code
- Wait for the deprecation window
- Delete
Types::Deprecated::RootStorageStatisticsType
- Change
Namespace.root_storage_statistics
toTypes::RootStorageStatisticsType
- Deprecate
Namespace.root_storage_statistics_big_int
- Fix the frontend code again
- Wait for the deprecation window
- Delete
Namespace.root_storage_statistics_big_int
Initial state
field :root_storage_statistics, Types::RootStorageStatisticsType,
null: true,
description: 'The aggregated storage statistics. Only available for root namespaces',
resolve: -> (obj, _args, _ctx) { Gitlab::Graphql::Loaders::BatchRootStorageStatisticsLoader.new(obj.id).find }
Deprecation
field :root_storage_statistics, Types::Deprecated::RootStorageStatisticsType,
null: true,
deprecated: "please use 'root_storage_statistics_big_int'",
resolve: -> (obj, _args, _ctx) { Gitlab::Graphql::Loaders::BatchRootStorageStatisticsLoader.new(obj.id).find }
field :root_storage_statistics_big_int, Types::RootStorageStatisticsType,
null: true,
description: 'The aggregated storage statistics. Only available for root namespaces',
resolve: -> (obj, _args, _ctx) { Gitlab::Graphql::Loaders::BatchRootStorageStatisticsLoader.new(obj.id).find }
Second deprecation
field :root_storage_statistics, Types::RootStorageStatisticsType,
null: true,
description: 'The aggregated storage statistics. Only available for root namespaces',
resolve: -> (obj, _args, _ctx) { Gitlab::Graphql::Loaders::BatchRootStorageStatisticsLoader.new(obj.id).find }
field :root_storage_statistics_big_int, Types::RootStorageStatisticsType,
null: true,
deprecated: "please use 'root_storage_statistics'",
resolve: -> (obj, _args, _ctx) { Gitlab::Graphql::Loaders::BatchRootStorageStatisticsLoader.new(obj.id).find }
Final state
field :root_storage_statistics, Types::RootStorageStatisticsType,
null: true,
description: 'The aggregated storage statistics. Only available for root namespaces',
resolve: -> (obj, _args, _ctx) { Gitlab::Graphql::Loaders::BatchRootStorageStatisticsLoader.new(obj.id).find }
Preventing this from happening again
Regardless of the option we chose, should we consider writing a GraphQL::INT_TYPE
limits?