Use object size estimates to control when the GC starts
Currently the GC kicks in once a certain number of blocks has been allocated. While straightforward to implement, this can lead to memory bloat. For example, when allocating four 1GB strings only a single block is allocated, as 1020 objects fit in a block. At this point the used memory is 4GB + 32 KB. If these strings go out of scope, the GC won't kick in; instead, we have to wait until we cross the block allocation threshold. As a result, the program retains the 4GB for much longer than necessary.
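To make the arithmetic concrete, here is a minimal sketch of a trigger that only counts blocks. The names (BlockCountTrigger, OBJECTS_PER_BLOCK, BLOCK_SIZE) and the threshold passed to new() are hypothetical and not taken from the VM; only the 1020 objects per block and the 32 KB block size come from the description above.

```rust
/// Objects that fit in one block (figure from the description above).
const OBJECTS_PER_BLOCK: usize = 1020;
/// Block size in bytes (32 KB, as in the description above).
const BLOCK_SIZE: u64 = 32 * 1024;

/// Hypothetical trigger that only looks at how many blocks are in use.
struct BlockCountTrigger {
    objects: usize,
    block_threshold: usize,
}

impl BlockCountTrigger {
    fn new(block_threshold: usize) -> Self {
        BlockCountTrigger { objects: 0, block_threshold }
    }

    /// Record one object allocation. The size of any heap data owned by
    /// the object (such as string bytes) is not taken into account.
    fn allocate_object(&mut self) {
        self.objects += 1;
    }

    /// Number of blocks needed for the objects allocated so far
    /// (ceiling division).
    fn blocks(&self) -> usize {
        (self.objects + OBJECTS_PER_BLOCK - 1) / OBJECTS_PER_BLOCK
    }

    /// The GC only starts once enough blocks have been allocated.
    fn should_collect(&self) -> bool {
        self.blocks() >= self.block_threshold
    }

    /// Memory in use: externally allocated payload bytes plus the blocks.
    fn used_bytes(&self, payload_bytes: u64) -> u64 {
        payload_bytes + (self.blocks() as u64) * BLOCK_SIZE
    }
}
```

With any threshold above one block, allocating four objects that each own 1GB of string data leaves should_collect() false even though used_bytes() reports 4GB + 32 KB.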
One way of solving this is to add Object::estimated_size(). This method would return a usize containing the estimated size of the object. For a String the size would be calculated as follows:
mem::size_of::<Object>()
+ mem::size_of::<String>()
+ string_value.len() // len() is already a byte count, like mem::size_of
For an Array we just do:
mem::size_of::<Object>()
+ mem::size_of::<Vec<ObjectPointer>>()
+ ( array_value.len() * mem::size_of::<ObjectPointer>() )
Here we don't need to recurse into the ObjectPointer values stored in the Array, as the underlying object sizes have already been added to the allocation estimate.
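The two formulas could be combined into a single method along these lines. This is only a sketch: the ObjectValue enum, its boxed variants, and the layout of ObjectPointer are assumptions made so the example compiles on its own, and the real VM types may look different. The string case adds String::len() unscaled, since that method reports the length in bytes.

```rust
use std::mem;

/// Stand-in for the VM's pointer type (assumed layout: one raw pointer).
#[derive(Clone, Copy)]
pub struct ObjectPointer(pub *const Object);

/// Hypothetical value payloads. Boxing is assumed, so the String and Vec
/// structs live outside the Object and are counted separately.
pub enum ObjectValue {
    None,
    String(Box<String>),
    Array(Box<Vec<ObjectPointer>>),
}

pub struct Object {
    pub value: ObjectValue,
}

impl Object {
    /// Estimated size of this object in bytes. Objects referenced from an
    /// array are not visited: only the pointers stored in it are counted.
    pub fn estimated_size(&self) -> usize {
        let header = mem::size_of::<Object>();

        match &self.value {
            ObjectValue::None => header,
            ObjectValue::String(string_value) => {
                header + mem::size_of::<String>() + string_value.len()
            }
            ObjectValue::Array(array_value) => {
                header
                    + mem::size_of::<Vec<ObjectPointer>>()
                    + array_value.len() * mem::size_of::<ObjectPointer>()
            }
        }
    }
}
```

Using len() rather than capacity() keeps the sketch close to the formulas above; capacity() would reflect the reserved buffer size more closely, at the cost of counting unused space.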
With this data available, instead of using a block count threshold, we would use an "estimated byte allocation" threshold. This could lead to more aggressive garbage collection, so we may also need to increase the number of survivor spaces to prevent premature promotion to the mature generation.
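A byte-based threshold might be tracked roughly as follows. AllocationEstimate and its methods are hypothetical names, and resetting the counter to zero after a collection is an assumption; a real implementation might instead keep the estimate for surviving objects.

```rust
/// Hypothetical tracker that triggers the GC on estimated bytes allocated
/// instead of on the number of blocks.
pub struct AllocationEstimate {
    bytes: usize,
    threshold: usize,
}

impl AllocationEstimate {
    pub fn new(threshold: usize) -> Self {
        AllocationEstimate { bytes: 0, threshold }
    }

    /// Add the estimated size of a newly allocated object and report
    /// whether a collection should be scheduled.
    pub fn record(&mut self, estimated_size: usize) -> bool {
        // Saturate instead of overflowing on very large estimates.
        self.bytes = self.bytes.saturating_add(estimated_size);
        self.should_collect()
    }

    pub fn should_collect(&self) -> bool {
        self.bytes >= self.threshold
    }

    /// Called after a collection (assumption: start counting from zero).
    pub fn reset(&mut self) {
        self.bytes = 0;
    }
}
```

With this approach a single very large string can cross the threshold on its own, which is the behaviour the block count cannot provide.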