Skip to content
  • Yorick Peterse's avatar
    Rewrite the process scheduler from the ground up · 3e5882be
    Yorick Peterse authored
    The old scheduler was a little over two years old, and due for a
    rewrite. While it worked, it was not very efficient and many features
    were bolted on top; process pinning being an example.
    
    The new scheduler relies less heavy on locking, only using mutexes
    paired with condition variables to wake up sleeping threads. This will
    allow it to scale much better as the number of threads goes up.
    
    Another big benefit is clearer code. The old scheduler's code was a
    mess, largely because we focused more on getting a proof of concept out
    instead of building a scheduler for the next few years.
    
    == Suspending and rescheduling processes
    
    As part of this rewrite, the way timeouts and rescheduling of processes
    is handled is also rewritten. When a process is suspended and receives a
    message, the sender will try to reschedule it immediately. This makes
    sending messages a little bit more expensive, but allows for much faster
    rescheduling of processes. This also removes the need for a separate
    thread to perform a linear scan over a list of processes to determine
    which ones need to be rescheduled.
    
    Processes that suspend themselves with a timeout are stored in a binary
    heap, managed by a separate thread. Communication with this thread is
    done using a channel, offloading most of the work to the separate
    timeout thread. When a process with a timeout is rescheduled, its entry
    in the heap is marked as invalid instead of being removed. This makes
    the operation a constant time operation, at the cost of the binary heap
    getting fragmented. To combat fragmentation, the timeout thread will
    periodically remove invalid entries from the heap.
    
    Rescheduling processes is done entirely using atomic operations, instead
    of using mutexes. This requires some careful coding to take into account
    multiple threads trying to reschedule the same process, but should allow
    all of this to scale much better.
    
    The new approach of suspending and rescheduling processes requires one
    additional word of memory per process. This memory is used to mark the
    process as suspended, and to optionally store a pointer to its timeout
    (if one was used).
    
    == Message counts
    
    The number of messages in a mailbox is now stored explicitly using an
    atomic integer, instead of obtaining this from the synchronised
    data structures internal to a mailbox. This requires one word of extra
    memory per process, but makes it much cheaper to check if a process has
    messages. This is important, because when rescheduling a process such
    checks are performed several times.
    
    == Asynchronous IO and further improvements
    
    While this commit does not add support for asynchronous IO operations,
    the rewrite will make it easier to do so in future commits. The process
    lookup table also remains unchanged, but we're currently investigating
    if we can get rid of PIDs and the lookup table entirely; potentially
    speeding up process spawning by quite a bit.
    3e5882be