1. 18 Mar, 2019 1 commit
    • Rewrite the process scheduler from the ground up · 3e5882be
      Yorick Peterse authored
      The old scheduler was a little over two years old, and due for a
      rewrite. While it worked, it was not very efficient and many features
      were bolted on top; process pinning being an example.
      
      The new scheduler relies less heavily on locking, only using mutexes
      paired with condition variables to wake up sleeping threads. This will
      allow it to scale much better as the number of threads goes up.
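
      As a rough illustration of the mutex plus condition variable pattern
      mentioned above (this is not the scheduler's actual code; the `Queue`
      type and its methods are hypothetical), a worker can sleep until work
      arrives like so:

      ```rust
      use std::collections::VecDeque;
      use std::sync::{Condvar, Mutex};

      /// Hypothetical work queue: the mutex guards the jobs, the condition
      /// variable wakes up workers that went to sleep on an empty queue.
      pub struct Queue<T> {
          jobs: Mutex<VecDeque<T>>,
          wakeup: Condvar,
      }

      impl<T> Queue<T> {
          pub fn new() -> Self {
              Queue { jobs: Mutex::new(VecDeque::new()), wakeup: Condvar::new() }
          }

          /// Pushes a job and wakes up one sleeping worker, if there is one.
          pub fn push(&self, job: T) {
              self.jobs.lock().unwrap().push_back(job);
              self.wakeup.notify_one();
          }

          /// Blocks the calling thread until a job is available.
          pub fn pop(&self) -> T {
              let mut jobs = self.jobs.lock().unwrap();

              loop {
                  if let Some(job) = jobs.pop_front() {
                      return job;
                  }

                  // wait() releases the lock while sleeping and reacquires it
                  // before returning. The loop guards against spurious wakeups.
                  jobs = self.wakeup.wait(jobs).unwrap();
              }
          }
      }
      ```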
      
      Another big benefit is clearer code. The old scheduler's code was a
      mess, largely because we focused more on getting a proof of concept out
      instead of building a scheduler for the next few years.
      
      == Suspending and rescheduling processes
      
      As part of this rewrite, the way timeouts and rescheduling of processes
      are handled has also been rewritten. When a process is suspended and
      receives a message, the sender will try to reschedule it immediately.
      This makes sending messages a little bit more expensive, but allows for
      much faster rescheduling of processes. This also removes the need for a
      separate thread to perform a linear scan over a list of processes to
      determine which ones need to be rescheduled.
      
      Processes that suspend themselves with a timeout are stored in a binary
      heap, managed by a separate thread. Communication with this thread is
      done using a channel, offloading most of the work to the separate
      timeout thread. When a process with a timeout is rescheduled, its entry
      in the heap is marked as invalid instead of being removed. This makes
      rescheduling a constant-time operation, at the cost of the binary heap
      getting fragmented. To combat fragmentation, the timeout thread will
      periodically remove invalid entries from the heap.
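
      The idea of invalidating heap entries instead of removing them can be
      sketched as follows. This is a simplified, single-threaded sketch: the
      `Timeouts` and `Timeout` types, the integer deadlines, and the process
      IDs are all assumptions made for the example, not the VM's own types.

      ```rust
      use std::cmp::Ordering;
      use std::collections::BinaryHeap;
      use std::sync::atomic::{AtomicBool, Ordering as AtomicOrdering};
      use std::sync::Arc;

      /// Hypothetical heap entry: a deadline, a process ID, and a shared flag
      /// that is flipped to `false` when the process is rescheduled early.
      struct Timeout {
          deadline: u64,
          process: usize,
          valid: Arc<AtomicBool>,
      }

      impl PartialEq for Timeout {
          fn eq(&self, other: &Self) -> bool { self.deadline == other.deadline }
      }
      impl Eq for Timeout {}
      impl PartialOrd for Timeout {
          fn partial_cmp(&self, other: &Self) -> Option<Ordering> { Some(self.cmp(other)) }
      }
      impl Ord for Timeout {
          // Reversed so the max-heap yields the earliest deadline first.
          fn cmp(&self, other: &Self) -> Ordering { other.deadline.cmp(&self.deadline) }
      }

      pub struct Timeouts { heap: BinaryHeap<Timeout> }

      impl Timeouts {
          pub fn new() -> Self { Timeouts { heap: BinaryHeap::new() } }

          pub fn len(&self) -> usize { self.heap.len() }

          /// Adds a timeout and returns the flag used to invalidate it later.
          pub fn insert(&mut self, process: usize, deadline: u64) -> Arc<AtomicBool> {
              let valid = Arc::new(AtomicBool::new(true));
              self.heap.push(Timeout { deadline, process, valid: valid.clone() });
              valid
          }

          /// Pops every entry whose deadline has passed, skipping invalid ones.
          pub fn expired(&mut self, now: u64) -> Vec<usize> {
              let mut processes = Vec::new();

              while self.heap.peek().map_or(false, |entry| entry.deadline <= now) {
                  let entry = self.heap.pop().unwrap();

                  if entry.valid.load(AtomicOrdering::Acquire) {
                      processes.push(entry.process);
                  }
              }

              processes
          }

          /// Periodic cleanup: rebuilds the heap without the invalid entries.
          pub fn compact(&mut self) {
              let entries = std::mem::take(&mut self.heap).into_vec();

              self.heap = entries
                  .into_iter()
                  .filter(|entry| entry.valid.load(AtomicOrdering::Acquire))
                  .collect();
          }
      }
      ```

      Invalidating an entry is a single atomic store on the returned flag,
      while the cost of actually removing it is deferred to `compact`.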
      
      Rescheduling processes is done entirely using atomic operations, instead
      of using mutexes. This requires some careful coding to take into account
      multiple threads trying to reschedule the same process, but should allow
      all of this to scale much better.
      
      The new approach of suspending and rescheduling processes requires one
      additional word of memory per process. This memory is used to mark the
      process as suspended, and to optionally store a pointer to its timeout
      (if one was used).
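
      To illustrate how several threads can race to reschedule the same
      process without a mutex, here is a minimal sketch using a
      compare-and-swap. It only models the "suspended" mark as a boolean; the
      real word also stores the optional timeout pointer, which is left out
      here. The `Process` type and its methods are hypothetical.

      ```rust
      use std::sync::atomic::{AtomicBool, Ordering};

      /// Hypothetical process that only tracks whether it is suspended.
      pub struct Process {
          suspended: AtomicBool,
      }

      impl Process {
          pub fn new() -> Self {
              Process { suspended: AtomicBool::new(false) }
          }

          /// Marks the process as suspended.
          pub fn suspend(&self) {
              self.suspended.store(true, Ordering::Release);
          }

          /// Tries to claim the right to reschedule the process. When several
          /// threads call this for the same suspension, exactly one of them
          /// gets `true` and should put the process back in a run queue.
          pub fn try_reschedule(&self) -> bool {
              self.suspended
                  .compare_exchange(true, false, Ordering::AcqRel, Ordering::Acquire)
                  .is_ok()
          }
      }
      ```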
      
      == Message counts
      
      The number of messages in a mailbox is now stored explicitly using an
      atomic integer, instead of obtaining this from the synchronised
      data structures internal to a mailbox. This requires one word of extra
      memory per process, but makes it much cheaper to check if a process has
      messages. This is important, because when rescheduling a process such
      checks are performed several times.
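
      A sketch of the idea, assuming a hypothetical `Mailbox` type that keeps
      its messages behind a mutex: the counter lives next to the synchronised
      queue, so checking for messages never has to take the lock.

      ```rust
      use std::collections::VecDeque;
      use std::sync::atomic::{AtomicUsize, Ordering};
      use std::sync::Mutex;

      /// Hypothetical mailbox with an explicit, atomically updated count.
      pub struct Mailbox<T> {
          messages: Mutex<VecDeque<T>>,
          count: AtomicUsize,
      }

      impl<T> Mailbox<T> {
          pub fn new() -> Self {
              Mailbox { messages: Mutex::new(VecDeque::new()), count: AtomicUsize::new(0) }
          }

          pub fn send(&self, message: T) {
              self.messages.lock().unwrap().push_back(message);
              self.count.fetch_add(1, Ordering::AcqRel);
          }

          pub fn receive(&self) -> Option<T> {
              let message = self.messages.lock().unwrap().pop_front();

              if message.is_some() {
                  self.count.fetch_sub(1, Ordering::AcqRel);
              }

              message
          }

          /// Cheap check that only reads the counter, not the queue.
          pub fn has_messages(&self) -> bool {
              self.count.load(Ordering::Acquire) > 0
          }
      }
      ```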
      
      == Asynchronous IO and further improvements
      
      While this commit does not add support for asynchronous IO operations,
      the rewrite will make it easier to do so in future commits. The process
      lookup table also remains unchanged, but we're currently investigating
      if we can get rid of PIDs and the lookup table entirely; potentially
      speeding up process spawning by quite a bit.
  2. 22 Dec, 2018 1 commit
    • Added method Float.to_bits · a4c2d681
      Yorick Peterse authored
      This method can be used to obtain the bitwise representation of a Float.
      This in turn can be used to perform an approximate equality comparison
      by checking bits of a Float.
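
      To show what such a comparison can look like, here is a sketch in Rust
      using `f64::to_bits`. The `nearly_equal` function and its `max_ulps`
      parameter are made up for this example; it compares how many
      representable values lie between two floats of the same sign.

      ```rust
      /// Returns true if `a` and `b` are at most `max_ulps` representable
      /// values apart. NaN never compares equal, and values with different
      /// signs are only equal if they compare equal directly (0.0 and -0.0).
      pub fn nearly_equal(a: f64, b: f64, max_ulps: u64) -> bool {
          if a.is_nan() || b.is_nan() {
              return false;
          }

          if a == b {
              return true;
          }

          if a.is_sign_negative() != b.is_sign_negative() {
              return false;
          }

          // For floats of the same sign, the bit patterns are ordered the
          // same way as their magnitudes, so the difference counts the steps.
          let (x, y) = (a.to_bits(), b.to_bits());
          let distance = if x > y { x - y } else { y - x };

          distance <= max_ulps
      }
      ```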
  3. 16 Dec, 2018 1 commit
    • Parsing of Strings into Floats and Integers · 6558d6b5
      Yorick Peterse authored
      This adds support for parsing a String into a Float and an Integer.
      There are two ways of doing so:
      
      1. By sending `to_integer` or `to_float` to a `String`.
      2. Using `Integer.from_string` or `Float.from_string`.
      
      Using `to_integer` and `to_float` will perform a lossy conversion:
      returning 0 or 0.0 for invalid input. Using the `from_string` methods
      will result in a strict conversion, with an error being thrown for
      invalid input.
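
      The difference between the two styles can be sketched in Rust as
      follows. The function names mirror the Inko methods, but the parsing
      rules are those of Rust's `str::parse` and may differ from Inko's.

      ```rust
      use std::num::{ParseFloatError, ParseIntError};

      /// Lossy conversion: invalid input produces 0.
      pub fn to_integer(input: &str) -> i64 {
          input.parse().unwrap_or(0)
      }

      /// Lossy conversion: invalid input produces 0.0.
      pub fn to_float(input: &str) -> f64 {
          input.parse().unwrap_or(0.0)
      }

      /// Strict conversion: invalid input produces an error.
      pub fn integer_from_string(input: &str) -> Result<i64, ParseIntError> {
          input.parse()
      }

      /// Strict conversion: invalid input produces an error.
      pub fn float_from_string(input: &str) -> Result<f64, ParseFloatError> {
          input.parse()
      }
      ```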
      
      Fixes #134
      Fixes #156
  4. 11 Dec, 2018 1 commit
    • Move std::reflection into std::mirror · b3c7e36f
      Yorick Peterse authored
      This moves all code from `std::reflection` into `std::mirror`, finally
      removing the need for the two separate modules. We also renamed
      `kind_of?` to `implements_trait?`, and implemented both it and
      `instance_of?` in pure Inko. This in turn allows us to remove some
      specialised VM instructions.
      
      Fixes #153
  5. 08 Dec, 2018 1 commit
    • Removed storing of entire process statuses · 86cadf10
      Yorick Peterse authored
      Obtaining the process status has always been a bit questionable. For
      one, it's not particularly useful to see that a process is running or
      being garbage collected. Second, it requires a full 8 bytes of memory
      per process to store.
      
      In this commit, we drop the storing of full process statuses, and add a
      boolean flag "waiting for message" that we use instead where necessary.
      Currently this won't reduce the size of a process due to alignment
      requirements, but in the future we may be able to work around this by
      reducing the size of other fields.
  6. 30 Oct, 2018 1 commit
    • Add a Foreign Function Interface for C code · 26f535fe
      Yorick Peterse authored
      This commit adds support for a basic Foreign Function Interface to C.
      This interface allows Inko code to dynamically load C libraries, obtain
      pointers to variables, and call functions. Data types are automatically
      converted whenever possible. Passing arbitrary Inko objects to C is not
      possible, as otherwise the garbage collector could release memory of
      objects still in use by C code.
  7. 11 Oct, 2018 1 commit
    • Rework handling of prototypes of built-in types · f39ce802
      Yorick Peterse authored
      This reworks how the prototypes of built-in types, such as ByteArray,
      are handled. Prior to this commit, various built-in types would use
      Object as their prototype, followed by the runtime correcting this. This
      required the use of `std::reflection` in various places, which would add
      unnecessary runtime overhead.
      
      Supporting FFI was also made more complicated in this setup. For
      example, in the FFI API a Pointer should be an instance of
      `std::pointer::Pointer`, but the VM has no built-in knowledge of this
      type, meaning it had to use Object as the prototype. This then required
      the FFI runtime code to fix the prototype every time a Pointer was
      created. This complicates the code, and in certain places requires
      different approaches to fix the prototype.
      
      In this commit, we make sure that all built-in types have a dedicated
      prototype in the VM. We also merge the various GetFooPrototype
      instructions into a single GetBuiltinPrototype instruction, reducing the
      number of instructions necessary. The compiler still exposes separate
      virtual instructions, though this is mostly to keep the prototype IDs
      out of the runtime.
      
      == Setting object names
      
      This new setup requires that for a few more types we get the prototype
      and set the object name manually. To make this easier, the compiler now
      supports the virtual instruction `set_object_name`. This allows modules
      to set the correct object name, without having to use the
      `@_object_name` instance attribute directly. This in turn means this
      attribute is now only used in two places:
      
      1. In the compiler, where it belongs.
      2. In `std::mirror`, in order to obtain the name of the object.
      
      == DefaultHasher is now in a separate module
      
      The DefaultHasher type used to reside in `std::hash_map`, but this
      didn't make much sense since it's not tied into the `HashMap` type. With
      the various prototype changes being made I decided that now was a good
      time to move `DefaultHasher` to its own module: `std::hasher`. In the
      future this module might provide hashers using other algorithms, but for
      now it only defines the `DefaultHasher` type.
  8. 07 Oct, 2018 1 commit
    • Pinning of processes to threads · faf02ea7
      Yorick Peterse authored
      This commit adds the ability to pin a process to a particular OS thread
      in a thread pool. This is useful for the FFI, as certain C functions or
      data structures must be used from a specific thread. For example,
      libc's "errno" variable uses thread-local storage. This means that if we
      want to run a function that uses it and read "errno", we _have_ to
      ensure both operations are performed on the same thread.
      
      To support this, each Pool structure now has a Worker structure, which
      stores the thread ID of that worker. This ID can then be stored in a
      process, allowing the scheduler to determine which worker should run the
      process.
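
      How a scheduler might pick a worker for a pinned process can be
      sketched like this. The types below are simplified stand-ins made up
      for this example (the real Pool and Worker structures hold much more),
      and the round-robin fallback for unpinned processes is an assumption.

      ```rust
      /// Hypothetical process that optionally stores the ID of the worker it
      /// is pinned to.
      pub struct Process {
          pub pinned_worker: Option<usize>,
      }

      /// Hypothetical pool that hands out worker IDs.
      pub struct Pool {
          workers: usize,
          next: usize,
      }

      impl Pool {
          pub fn new(workers: usize) -> Self {
              assert!(workers > 0, "a pool needs at least one worker");
              Pool { workers, next: 0 }
          }

          /// Returns the ID of the worker that should run the process.
          pub fn worker_for(&mut self, process: &Process) -> usize {
              // A pinned process always goes back to the same worker.
              if let Some(id) = process.pinned_worker {
                  return id;
              }

              // Unpinned processes are spread over the workers in turn.
              let id = self.next;
              self.next = (self.next + 1) % self.workers;
              id
          }
      }
      ```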
      
      == Process pinning
      
      Pinned processes cannot be moved across pools, so the MoveToPool
      instruction becomes a no-op for a pinned process. We cannot panic in
      this case, as this would prevent pinned processes from using methods
      that try to move a process to the secondary pool.
  9. 06 Sep, 2018 1 commit
    • Add support for deferred execution of blocks · 703ff73c
      Yorick Peterse authored
      The method `std::process.defer` can be used to schedule a block for
      execution when the calling scope returns, even when a panic is
      triggered. Such blocks are useful when cleaning up resources, such as
      files and sockets.
      
      Most languages use finalisers for this, but finalisers are difficult to
      implement in Inko. The garbage collector has to be able to deal with
      objects that are resurrected, and somehow be able to schedule finalisers
      for execution. Usage wise, finalisers are also more limited as they are
      defined when defining an object, not when using it. This means you
      (typically) can't create ad-hoc finalisers.
      
      The use of deferred blocks allows us to work around these restrictions
      and implementation difficulties, and the idea is inspired by Go. One
      downside is that directly using `std::process.defer` can lead to rather
      verbose code, but more high-level abstractions can be added on top
      easily.
      
      The order in which blocks are executed is currently not officially
      specified, and thus should not be relied upon. Currently the order will
      be First In Last Out (FILO), but this may change at any given point in
      time.
      
      Using `std::process.defer` is very straightforward. For example, to
      close a file once we are done with it, you'd write something along the
      lines of the following:
      
          import std::process
          import std::fs::file
      
          let readme = try! file.read_only('README.md')
      
          process.defer {
            readme.close
          }
      
          try! readme.read_string
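
      For comparison, a similar effect can be had in Rust with a guard value
      that runs a closure when it is dropped. This is only an analogy for how
      deferred blocks behave, not how the VM implements them; `Defer`,
      `defer`, and `example` are names made up for this sketch.

      ```rust
      use std::cell::RefCell;

      /// Guard that runs its closure once when it goes out of scope. Drop
      /// also runs while a panic unwinds the stack.
      pub struct Defer<F: FnOnce()>(Option<F>);

      pub fn defer<F: FnOnce()>(block: F) -> Defer<F> {
          Defer(Some(block))
      }

      impl<F: FnOnce()> Drop for Defer<F> {
          fn drop(&mut self) {
              if let Some(block) = self.0.take() {
                  block();
              }
          }
      }

      /// Records the order in which the body and the deferred closures run.
      pub fn example() -> Vec<&'static str> {
          let log = RefCell::new(Vec::new());

          {
              let _first = defer(|| log.borrow_mut().push("first"));
              let _second = defer(|| log.borrow_mut().push("second"));

              log.borrow_mut().push("body");
          }

          log.into_inner()
      }
      ```

      Rust drops local variables in reverse order of declaration, so in this
      sketch the closure registered last runs first.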
  10. 26 Aug, 2018 2 commits
    • Add support for registering panic handlers · 5e6920e3
      Yorick Peterse authored
      A panic handler is a block to execute when a process panics. Once the
      block finishes running, the process terminates. A panic handler can be
      registered using `std::process.panicking`:
      
          import std::process
          import std::stdio::stderr
          import std::vm
      
          process.panicking do (error) {
            stderr.print(error)
          }
      
          vm.panic('oops!')
      
      Each process can only register a single panic handler, and newly
      registered handlers will overwrite any previous ones.
      
      The block passed to `std::process.panicking` is given the panic message.
      A stacktrace needs to be obtained manually using
      `std::debug.stacktrace`.
    • Explicitly bind receivers to blocks and bindings · 79103c8e
      Yorick Peterse authored
      Prior to this commit, "self" was just syntax sugar for obtaining local
      variable 0. This variable in turn was populated by the (implicit)
      argument 0. In other words, this:
      
          def foo(bar) {
            self
          }
      
          foo
      
      Was more or less translated into the following:
      
          def foo(self_arg, bar) {
            self_arg
          }
      
          foo(self, bar)
      
      While fairly simple to implement, this poses two problems:
      
      1. "self" is an implicit argument, which can be confusing for users. For
         example, when using mirrors to obtain the list of method arguments,
         "self" would be included in the list.
      
      2. The VM could not schedule a Block for execution on its own, because
         it doesn't know what object to pass as the first argument (= self).
      
      Problem 2 made it impossible to implement panic hooks in a nice way, as
      well as any future features that require the VM to schedule blocks (e.g.
      when trapping signals).
      
      To work around this, "self" is now explicitly bound to blocks, when they
      are defined. To execute methods, we use a new instruction:
      RunBlockWithReceiver. This instruction takes a receiver (= the object to
      use for "self") to use when executing the method. The receiver in this
      case will be the object the method was invoked on.
      
      When a block is created, we no longer create a new binding for it.
      Instead, we store the binding that the block captures, which we later
      set as the parent for the new binding when executing the block. This
      removes the need for allocating a Binding for every block that is
      defined, even when never executed. The block also stores the receiver to
      use when executing said block.
      
      This setup also means we can remove quite a bit of nasty bits from the
      compiler, as we no longer need to generate implicit arguments and local
      variables. It also allows us to (in the future) obtain the receiver of a
      block (for meta programming), without having to rely on the exact local
      variable index used to store this object.
  11. 21 Aug, 2018 1 commit
    • Added std::env for managing environment data · 69a7592e
      Yorick Peterse authored
      The module std::env can be used for obtaining and setting environment
      variables, the home directory, the temporary directory, and more. For
      example, one can obtain environment variable values as follows:
      
          import std::env
      
          env['HOME'] # => '/home/yorickpeterse'
      
      You can also set the value of a variable:
      
          import std::env
      
          env['HOME'] = '/home/foo'
      
      Removing variables is also possible:
      
          import std::env
      
          env.remove('HOME') # => Nil
      
      Or obtain the home directory:
      
          import std::env
      
          env.home_directory # => '/home/yorickpeterse'
      
      You can also obtain and set the working directory:
      
          import std::env
      
          try! env.working_directory          # => '/home/yorickpeterse'
          try! env.working_directory = '/tmp' # => '/tmp'
      
      Arguments can be retrieved using `std::env.arguments`:
      
          import std::env
      
          env.arguments # => ['foo', 'bar']
      
      == Executable changes
      
      The "inko" executable has been modified to pass additional command-line
      arguments to IVM. IVM in turn has been modified to expose these to the
      VM instructions. This requires us to explicitly store the passed
      arguments in a vm::state::State, as Rust's std::env::args() is immutable
      _and_ includes _all_ arguments (the bytecode file to execute, IVM
      options, etc).
      
      The arguments passed via the CLI are all interned, removing the need for
      allocating (potentially many) strings every time `std::env.arguments` is
      executed.
      
      Fixes #136
  12. 04 Jul, 2018 1 commit
    • Turn Boolean back into a regular object · 6e073c2a
      Yorick Peterse authored
      Defining Boolean as a Trait was a nice idea, but it was too limiting.
      For example, the Inspect trait is defined after Boolean, meaning you
      couldn't pass True or False to an argument expecting an Inspect, since
      Boolean did not define Inspect as a required trait.
      
      To work around this we turn Boolean back into a regular object, and make
      True and False instances of this object. This allows us to refine these
      objects whenever necessary, at the cost of having to implement said code
      for Boolean, True, and False.
  13. 20 Jun, 2018 1 commit
    • Use byte arrays for IO related operations · c5897da0
      Yorick Peterse authored
      This completely reworks the IO system of both IVM and the Inko runtime.
      Instead of operating on strings or arrays of integers, the VM now
      operates on proper byte arrays. These byte arrays are stored as a
      Vec<u8> in the VM, instead of a Vec<ObjectPointer>. This ensures no
      space is wasted by storing bytes as 8 byte values.
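
      The difference in raw storage can be worked out with a small
      calculation. `buffer_bytes` is a helper made up for this example, and
      `u64` stands in for an 8 byte object pointer; the integer objects those
      pointers may refer to are not counted.

      ```rust
      use std::mem::size_of;

      /// Number of bytes needed to store `len` elements of type `T`
      /// contiguously, ignoring any allocator overhead.
      pub fn buffer_bytes<T>(len: usize) -> usize {
          len * size_of::<T>()
      }
      ```

      For a 64 MiB file, `buffer_bytes::<u8>` gives 64 MiB, while
      `buffer_bytes::<u64>` gives 512 MiB for the pointers alone.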
      
      The standard library now operates on a ByteArray in various places,
      instead of using Array!(Integer). This ByteArray type is defined in the
      std::byte_array module, and behaves much like an Array. Unlike an Array,
      writing to an out of bounds index will panic, as there is no reasonable
      default value to use for padding the byte array.
      
      Some related code has also been changed. For example,
      std::string.from_bytes has been moved to ByteArray.to_string and
      ByteArray.drain_to_string. String.to_bytes in turn has been moved to
      ByteArray.to_string.
      
      All of these changes combined means that reading a 64 MB file only
      requires about 70 MB in total, instead of requiring around 1 GB.
      
      Fixes #108
  14. 15 Jun, 2018 1 commit
    • Added a StringBuffer object · 852e0084
      Yorick Peterse authored
      This object can be used to efficiently concatenate multiple strings
      together, without allocating intermediate String objects on the heap.
  15. 14 Jun, 2018 1 commit
    • Added StringFormatDebug · 2c17fe95
      Yorick Peterse authored
      This instruction can be used to format a string for debugging purposes.
      For example, the string `hello` would be turned into `"hello"`. This is
      implemented using a VM instruction to keep things simple and fast.
  16. 09 Jun, 2018 1 commit
  17. 15 Mar, 2018 1 commit
  18. 13 Mar, 2018 1 commit
  19. 11 Mar, 2018 1 commit
    • Added support for obtaining stack traces · a013b2eb
      Yorick Peterse authored
      The method `vm.stacktrace` can be used to obtain a stack trace leading
      up to the `vm.stacktrace` method call. This can be useful when, for
      example, building a unit testing module that wants to display the call
      stack in the event of a test failure.
  20. 26 Feb, 2018 1 commit
    • Removed Array capacity/reserve instructions · 9209a4f6
      Yorick Peterse authored
      These are redundant because reserving space can be done by just
      inserting a value at an index beyond the length. For example, to reserve
      four values you'd insert a Nil at index 3. This does result in Nil
      values being used to fill up empty slots, but this doesn't require any
      more space so it's not a big deal.
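
      The padding behaviour described above can be sketched as follows, with
      `Option::None` playing the role of Nil. `set_padded` is a hypothetical
      helper, not a VM instruction.

      ```rust
      /// Stores `value` at `index`, first filling any missing slots with
      /// `None` when the index lies beyond the current length.
      pub fn set_padded<T: Clone>(values: &mut Vec<Option<T>>, index: usize, value: Option<T>) {
          if index >= values.len() {
              values.resize(index + 1, None);
          }

          values[index] = value;
      }
      ```

      Calling `set_padded(&mut values, 3, None)` on an empty vector leaves it
      with four `None` slots, which matches the "reserve four values" example.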
  21. 25 Feb, 2018 1 commit
    • Expose array capacities to the runtime · 4a2352aa
      Yorick Peterse authored
      This allows code to reserve space and obtain the number of values that
      can be stored before a resize. This in turn can be used by data
      structures such as HashMap to figure out when rehashing is necessary.
  22. 24 Feb, 2018 1 commit
  23. 23 Feb, 2018 1 commit
    • Added String.+ · b1e56fe9
      Yorick Peterse authored
      This method concatenates two strings together, producing a new one.
  24. 22 Feb, 2018 1 commit
    • Turn Boolean back into a trait and rework AND/OR · 99a57b22
      Yorick Peterse authored
      A few months ago I turned Boolean from a trait into an object. Back then
      the trait setup for Boolean was rather horrible and I thought using a
      regular object would be better. Unfortunately the use of an object
      brought various issues with it. For example, every method defined on
      True or False would also have to be defined (as a dummy method) on
      Boolean itself. It was also impossible to support code such as this:
      
          def example(other: do -> Boolean) -> Boolean {
            if true: {
              True
            }, false: {
              other.call
            }
          }
      
      This wouldn't work because "Boolean" is not guaranteed to implement the
      same methods as "True".
      
      One solution to this problem would be to introduce union types and
      define "Boolean" as a union like so:
      
          type Boolean = True | False
      
      While tempting I feel that union types are the wrong approach. Every
      union type can be replaced by a trait and traits in general are much
      more pleasant to work with. For example, consider the following union
      type:
      
          type Number = Integer | Float
      
      Now let's say this type is defined somewhere in the standard library or
      some other piece of code we can't easily modify, and we use it in a
      whole bunch of places. If we at some point want to also support a
      Complex or Rational we'd have to either define our own type, or somehow
      ask the author of the "Number" type to extend it. Both cases are a pain.
      
      With traits on the other hand this would not be an issue as we can
      simply implement it where necessary and we're good to go.
      
      Because of the above issues we now define "Boolean" as a trait. To
      support this one can now define a trait and later redefine it, but
      _only_ if the trait is empty. This allows us to define "Boolean" in the
      "std::bootstrap" module and later refine it in "std::boolean". Without
      this we would not be able to bootstrap the runtime as various modules
      depend on "Boolean" being present from the very beginning.
      
      With these changes we also no longer need the GetBooleanPrototype
      instruction and thus it has been removed. We also removed && and || in
      favour of "and" and "or". Both these methods take a block that is only
      evaluated when necessary. This means that instead of this:
      
          foo && bar
      
      You would write:
      
          foo.and { bar }
      
      This has the added benefit of automatically grouping expressions, making
      it easier to chain message sends. For example, instead of this:
      
          (foo && bar).if_true {
            ...
          }
      
      You can now write this:
      
          foo.and { bar }.if_true {
            ...
          }
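
      The "only evaluated when necessary" behaviour of these methods can be
      sketched in Rust with closures. The free functions `and` and `or` below
      are stand-ins for the Inko methods, written only for this example.

      ```rust
      /// Evaluates `other` only when `value` is true.
      pub fn and(value: bool, other: impl FnOnce() -> bool) -> bool {
          if value { other() } else { false }
      }

      /// Evaluates `other` only when `value` is false.
      pub fn or(value: bool, other: impl FnOnce() -> bool) -> bool {
          if value { true } else { other() }
      }
      ```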
      
      To make all of this work I also had to make some changes to the type
      system. In particular we need to support downcasting of blocks in
      certain cases. Take the following piece of code for example:
      
          def or(other: do -> Boolean) -> Boolean {
            if true: {
              True
            }, false: {
              other.call
            }
          }
      
      Here the block passed to `true:` would be inferred as `do -> True` while
      the block passed to `false:` would be inferred as `do -> Boolean`. This
      would then produce a type error because `do -> Boolean` is not compatible
      with `do -> True`.
      
      To support this we now check if in such cases we can downcast the
      expected type (`do -> True` in this example) to the given type.
      Currently we only downcast the return type of a block, but we may
      add support for other cases in the future.
  25. 21 Feb, 2018 1 commit
    • Implement std::time::Time using mostly Inko · 63bc941d
      Yorick Peterse authored
      This changes the implementation of std::time::Time so it no longer
      relies on high-level date/time instructions provided by the VM. Instead
      of these instructions we now use an algorithm ported over from musl to
      decompose a Unix timestamp. The individual parts of a Time can in turn
      be used to reconstruct a Unix timestamp. The VM in turn only provides
      some low level instructions to get a timestamp, the offset, and a flag
      indicating if DST is active.
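
      To give an idea of what decomposing a Unix timestamp involves, here is
      a sketch in Rust. Note that this is not the musl port used by the
      runtime: it uses the widely known "days to civil date" algorithm for
      the proleptic Gregorian calendar, works in UTC only, and ignores
      offsets and DST. The `decompose` function is made up for this example.

      ```rust
      /// Splits a Unix timestamp (seconds since 1970-01-01 00:00:00 UTC) into
      /// (year, month, day, hour, minute, second).
      pub fn decompose(timestamp: i64) -> (i64, u32, u32, u32, u32, u32) {
          // Euclidean division keeps the seconds positive before the epoch.
          let days = timestamp.div_euclid(86_400);
          let secs = timestamp.rem_euclid(86_400);

          // Shift the epoch to 0000-03-01 so leap days fall at the end of a
          // year, then work in 400 year eras of 146 097 days each.
          let z = days + 719_468;
          let era = z.div_euclid(146_097);
          let day_of_era = z.rem_euclid(146_097);
          let year_of_era =
              (day_of_era - day_of_era / 1_460 + day_of_era / 36_524 - day_of_era / 146_096) / 365;
          let day_of_year =
              day_of_era - (365 * year_of_era + year_of_era / 4 - year_of_era / 100);

          // Months are counted from March (0) to February (11) at this point.
          let shifted_month = (5 * day_of_year + 2) / 153;
          let day = day_of_year - (153 * shifted_month + 2) / 5 + 1;
          let month = if shifted_month < 10 { shifted_month + 3 } else { shifted_month - 9 };
          let year = year_of_era + era * 400 + if month <= 2 { 1 } else { 0 };

          (
              year,
              month as u32,
              day as u32,
              (secs / 3_600) as u32,
              (secs % 3_600 / 60) as u32,
              (secs % 60) as u32,
          )
      }
      ```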
      
      A side effect of this approach is that we have to implement time parsing
      and formatting ourselves. While this will take some time (e.g. we need
      to implement hash maps first) all of this work is necessary anyway. On
      top of that both time parsing and formatting were somewhat broken when
      using the "time" crate anyway. Until time parsing/formatting is
      reimplemented the methods have been changed so they produce dummy data
      or throw an error.
  26. 17 Feb, 2018 2 commits
    • Added time formatting and parsing methods · 1d72e091
      Yorick Peterse authored
      This is where the "time" crate is starting to show its ugly side. For
      example, parsing the string "2018" using the format "%Y" will result in
      the day of the year being set to 0, instead of 1. This is because the
      "time" crate (and the underlying C functions) start the day of the year
      with 0, whereas ISO 8601 starts it at 1.
      
      A workaround would be to fix the Tm structure returned before wrapping
      it in a DateTime, but another alternative might be to just use the
      "chrono" crate after all. I'll need to give this some thought to see
      what solution we end up going with.
    • f467c23b
  27. 15 Feb, 2018 2 commits
  28. 14 Feb, 2018 2 commits
  29. 13 Feb, 2018 1 commit
    • Added a dedicated date time object to the VM · 0d355b4e
      Yorick Peterse authored
      While fiddling around with Unix timestamps manually would allow us more
      control over how "std::time" works, it's also a big undertaking. Worse,
      it's hard to find good resources explaining how to properly deal with
      leap years and what not.
      
      To work around this the VM now offers a dedicated date time object. This
      object does not have a prototype but the runtime can easily add one
      using the "SetPrototype" instruction. This removes the need for having
      to add _another_ type to the bootstrapping process.
      
      Using this date time object is currently possible using two
      instructions:
      
      * TimeSystem
      * TimeGetValue
      
      The TimeSystem instruction simply allocates an object containing
      information about the current system time. The TimeGetValue instruction
      can be used to read a specific value from this object.
      
      Internally we use the "time" crate for retrieving date/time information.
      While this crate is deprecated and "chrono" is usually recommended I
      found the "chrono" crate to be quite painful to use. The APIs are
      confusing and just weren't pleasant to use. Apart from that I felt
      uneasy including a crate with so many features when Inko only needs a
      very small portion. The "time" crate offers the features we need and
      nothing more, making it perfect until something better comes along.
  30. 11 Feb, 2018 3 commits
  31. 09 Feb, 2018 2 commits
  32. 08 Feb, 2018 2 commits