Use a thread-local pool for parallel GC
Summary
Inko spawns new threads every time a collection happens, but it could use a thread-local thread pool for this instead.
Motivation
In "Not beating C with 96 lines of Inko", Inko's GC shows slow collection cycles, and I'm 100% sure that's because the runtime spawns threads for each collection instead of using a thread pool. Spawning native threads is basically always expensive; for example, on my machine it costs ~200,000 ns to spawn one thread!
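To check the spawn cost on another machine, a rough measurement could look like the sketch below. The function name `average_spawn_cost` is made up for this example, and the number it reports includes the join of an empty thread, so treat it as an upper bound on the spawn cost rather than a precise benchmark.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Average wall-clock time to spawn and join one empty thread.
/// `samples` must be greater than zero, otherwise the division panics.
fn average_spawn_cost(samples: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..samples {
        // Spawn a thread that does nothing and wait for it to finish.
        thread::spawn(|| {}).join().unwrap();
    }
    start.elapsed() / samples
}
```

Calling `average_spawn_cost(1_000)` and printing the result with `{:?}` gives a per-thread figure to compare against the number above. Results vary a lot between operating systems and under load.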
Implementation
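The implementation could look like this (this is pseudocode):
thread_local! {
    // One pool per thread, created on first use and reused across collections.
    static POOL: ThreadPool = ThreadPool::new(num_cpus::get());
}

fn trace(proc: Process) {
    let global_queue = Queue::new(proc.roots());
    POOL.with(|pool| {
        for _ in 0..num_cpus::get() {
            // Assumes Queue is a cheaply cloneable handle to shared state.
            pool.schedule(MarkingTask::new(global_queue.clone()));
        }
        // Block until all marking tasks have finished.
        pool.wait();
    });
}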
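For something that actually compiles, here is a minimal sketch of the same idea using only the standard library. None of this is Inko's runtime code: `ThreadPool`, `schedule`, `wait` and the toy heap representation (`heap[i]` lists the children of object `i`) are all assumptions made for this example, and `std::thread::available_parallelism` stands in for `num_cpus::get()`.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc, Condvar, Mutex};
use std::thread;

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical fixed-size pool; workers pull jobs from a shared channel.
struct ThreadPool {
    sender: mpsc::Sender<Job>,
    size: usize,
    // Number of unfinished jobs, plus a condvar signalled when it hits zero.
    pending: Arc<(Mutex<usize>, Condvar)>,
}

impl ThreadPool {
    fn new(size: usize) -> Self {
        let (sender, receiver) = mpsc::channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));
        let pending = Arc::new((Mutex::new(0usize), Condvar::new()));
        for _ in 0..size {
            let receiver = Arc::clone(&receiver);
            let pending = Arc::clone(&pending);
            // Threads are spawned once here and reused by every collection.
            thread::spawn(move || loop {
                // The lock guard is a temporary, so it is released before the job runs.
                let job = receiver.lock().unwrap().recv();
                match job {
                    Ok(job) => job(),
                    Err(_) => break, // channel closed: the pool was dropped
                }
                let (count, done) = &*pending;
                let mut count = count.lock().unwrap();
                *count -= 1;
                if *count == 0 {
                    done.notify_all();
                }
            });
        }
        ThreadPool { sender, size, pending }
    }

    fn schedule<F: FnOnce() + Send + 'static>(&self, job: F) {
        *self.pending.0.lock().unwrap() += 1;
        self.sender.send(Box::new(job)).unwrap();
    }

    /// Blocks until every scheduled job has finished.
    fn wait(&self) {
        let (count, done) = &*self.pending;
        let mut count = count.lock().unwrap();
        while *count > 0 {
            count = done.wait(count).unwrap();
        }
    }
}

thread_local! {
    // Created lazily the first time a thread collects, then reused.
    static POOL: ThreadPool = ThreadPool::new(
        thread::available_parallelism().map(|n| n.get()).unwrap_or(1),
    );
}

/// Marks everything reachable from `roots` and returns the number of marked objects.
fn trace(heap: Arc<Vec<Vec<usize>>>, roots: Vec<usize>) -> usize {
    let marked: Arc<Vec<AtomicBool>> =
        Arc::new((0..heap.len()).map(|_| AtomicBool::new(false)).collect());
    let queue = Arc::new(Mutex::new(VecDeque::from(roots)));
    POOL.with(|pool| {
        for _ in 0..pool.size {
            let (heap, marked, queue) =
                (Arc::clone(&heap), Arc::clone(&marked), Arc::clone(&queue));
            pool.schedule(move || loop {
                let next = queue.lock().unwrap().pop_front();
                let obj = match next {
                    Some(obj) => obj,
                    None => break, // queue empty: this marking task is done
                };
                // Only the first task to mark an object pushes its children.
                if !marked[obj].swap(true, Ordering::AcqRel) {
                    queue.lock().unwrap().extend(heap[obj].iter().copied());
                }
            });
        }
        pool.wait();
    });
    marked.iter().filter(|m| m.load(Ordering::Acquire)).count()
}
```

Calling `trace` repeatedly on the same thread reuses the same workers, which is the whole point. The sketch cuts corners: a marking task exits as soon as it sees an empty queue, so parallelism can drop early, and a panicking job would leave `wait` blocked forever. A real implementation would need proper termination detection and panic handling.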
Drawbacks
A thread pool keeps the tracing threads alive for the lifetime of the program, which means process_workers + gc_pool_threads + timeout_worker_thread + parallel_gc_threads * gc_pool_threads threads will be alive throughout the entire execution.
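To get a feel for how quickly that adds up, the formula can be written as a small function. The function name and the values in the comment are made up for illustration and are not Inko's defaults.

```rust
/// Total number of long-lived threads under the formula above.
fn long_lived_threads(
    process_workers: usize,
    gc_pool_threads: usize,
    timeout_worker_thread: usize,
    parallel_gc_threads: usize,
) -> usize {
    process_workers
        + gc_pool_threads
        + timeout_worker_thread
        + parallel_gc_threads * gc_pool_threads
}

// With hypothetical values of 8 process workers, 8 GC pool threads,
// 1 timeout worker and 8 parallel GC threads:
// 8 + 8 + 1 + 8 * 8 = 81 threads.
```

The multiplicative term dominates, so on machines with many cores the tracing pools would account for most of the threads.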
Edited by Yorick Peterse