Reuse thread pools for GC tracer threads

Yorick Peterse requested to merge reuse-tracer-pool into master

The pools used for tracing objects are now created when spawning GC coordinators, instead of spawning new threads for every garbage collection cycle. Spawning a thread can take between 10 and 20 microseconds, while creating the pool data structures themselves can take around 100 microseconds. On systems with many threads (e.g. 32), this can easily lead to a GC coordinator spending at least 500 microseconds per collection just setting everything up.

The implementation is a bit unusual. Each tracer pool has a "Broadcast" type that can be used to wake up the tracer threads. When woken up, the threads receive the process (and some other data) to trace. The last thread to receive the value clears it. This setup means that waking up the threads is a constant-time operation, taking only about 4 microseconds in total. Initially I used a channel per thread, but that approach needs 3-4 microseconds per thread to wake them up, so the cost grows with the number of threads.
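To make the idea concrete, here is a minimal sketch of how such a broadcast could be built on a `Mutex` and a `Condvar` from Rust's standard library. It is not the code from this merge request: the field names, the `generation` counter, and the `broadcast_once` helper are assumptions made for illustration, and a plain `u32` stands in for the process being traced. The sketch also assumes the sender does not broadcast a new value until every receiver has picked up the previous one.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// State shared between the sender and all receivers (hypothetical layout).
struct State<T> {
    /// The value being broadcast, if any.
    value: Option<T>,
    /// How many receivers still have to pick up the current value.
    pending: usize,
    /// Bumped on every send, so a receiver never takes the same value twice.
    generation: u64,
}

struct Broadcast<T> {
    state: Mutex<State<T>>,
    wakeup: Condvar,
    receivers: usize,
}

impl<T: Clone> Broadcast<T> {
    fn new(receivers: usize) -> Self {
        Broadcast {
            state: Mutex::new(State { value: None, pending: 0, generation: 0 }),
            wakeup: Condvar::new(),
            receivers,
        }
    }

    /// Publishes a value and wakes every receiver with one `notify_all` call,
    /// instead of sending one message per thread.
    fn send(&self, value: T) {
        let mut state = self.state.lock().unwrap();
        state.value = Some(value);
        state.pending = self.receivers;
        state.generation += 1;
        self.wakeup.notify_all();
    }

    /// Blocks until a value newer than generation `seen` is available, then
    /// returns a copy of it together with its generation.
    fn recv(&self, seen: u64) -> (T, u64) {
        let mut state = self.state.lock().unwrap();
        while state.generation == seen {
            state = self.wakeup.wait(state).unwrap();
        }
        let value = state.value.clone().expect("a value is set for every send");
        state.pending -= 1;
        if state.pending == 0 {
            // The last receiver clears the shared value.
            state.value = None;
        }
        (value, state.generation)
    }
}

/// Spawns `threads` receivers, broadcasts `value` once, and collects what
/// each thread received.
fn broadcast_once(threads: usize, value: u32) -> Vec<u32> {
    let broadcast = Arc::new(Broadcast::<u32>::new(threads));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let broadcast = Arc::clone(&broadcast);
            thread::spawn(move || broadcast.recv(0).0)
        })
        .collect();

    broadcast.send(value);
    handles.into_iter().map(|handle| handle.join().unwrap()).collect()
}
```

Calling `broadcast_once(4, 7)` hands the value `7` to each of the four threads. Because the generation check happens under the mutex, a thread that starts waiting after `send` has already run still sees the new value instead of missing the wakeup.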

In addition to these changes, the number of tracer threads now (once again) equals the number of CPU cores, instead of being limited to half the number of cores. With the new pool setup in place I did some testing, and found that on an 8-core machine the GC performs better when using 8 threads for tracing than when using only 4.
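The change in thread count can be sketched as follows. The function names are hypothetical, the lower bound of one thread is an assumption, and `std::thread::available_parallelism` is used here only as one way to estimate the core count; it may report fewer than the number of physical cores (for example under CPU limits), and the actual VM may obtain the number differently.

```rust
use std::thread;

/// Old behaviour as described above: at most half the cores (assumed minimum of 1).
fn tracer_threads_before(cores: usize) -> usize {
    (cores / 2).max(1)
}

/// New behaviour: one tracer thread per core (assumed minimum of 1).
fn tracer_threads_after(cores: usize) -> usize {
    cores.max(1)
}

/// Estimates the number of cores, falling back to 1 if it can't be determined.
fn detected_cores() -> usize {
    thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
}
```

For the 8-core machine mentioned above, `tracer_threads_before(8)` gives 4 and `tracer_threads_after(8)` gives 8.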

This fixes #191 (closed)