1. 30 Sep, 2020 1 commit
    • Implement Inko's compiler in Inko · d089e8ff
      Yorick Peterse authored
      Inko's compiler is now written in Inko itself, and the source code is
      located in the std::compiler module tree.
      == "where" replaced with "when"
      The "where" keyword has been replaced with "when". The pattern matching
      syntax (discussed below) uses "when", and instead of having both "where"
      and "when" we opted to just go with "when".
      == Pattern matching
      We also introduce pattern matching, as it makes various parts of the
      compiler easier to write. For example, instead of using the visitor
      pattern the compiler can rely on pattern matching, resulting in less
      boilerplate code.
      The pattern matching implementation is deliberately kept simple, and
      inspired mostly by Kotlin's "when" expression. This means no support for
      destructuring input into separate variables, and no support for checking
      if an input type is a generic type (due to type erasure).
      Pattern matching is performed using the "match" keyword; parentheses
      around the expression to match are required:
          match(process.receive) {
            ...
          }
      We can check if the input is of a certain type using the "as" pattern:
          match(process.receive) {
            as String -> { ... }
          }
      When using this pattern, we can also supply an additional guard:
          match(process.receive) {
            as String when something -> { ... }
          }
      This is useful when the input value is bound to a variable, which can be
      done by using the "let" keyword _inside_ the parentheses. This allows
      for more specific checks:
          match(let message = process.receive) {
            as String when message == 'foo' -> { ... }
          }
      A fallback case is specified using the "else" keyword:
          match(let message = process.receive) {
            as String when message == 'foo' -> { ... }
            else -> { ... }
          }
      We can also match arbitrary expressions as patterns. This requires that
      the pattern we are looking for implements the std::operators::Match
      trait:
          let number = 4
          match(number) {
            1 -> { 'number one' }
            2..4 -> { 'between 2 and 4' }
            else -> { 'something else' }
          }
      When matching expressions, we can also specify a "when" guard:
          let number = 4
          match(number) {
            1 when some_condition -> { 'first' }
            1 -> { 'second' }
            else -> { 'third' }
          }
      We can also leave out the expression to match against, in which case
      "match" acts like an if-chain:
          match {
            foo? -> { 'foo' }
            bar? -> { 'bar' }
            else -> { 'else' }
          }
      When using this syntax, "as" patterns are not supported, and the
      expressions must produce a Boolean (instead of implementing the Match
      trait).
      The return type of a "match" expression is either the type of the first
      case (or of the "else" case if no patterns are present), or Dynamic if
      the cases return different types. This allows you to write patterns that
      return different types when you don't care about those types (e.g. you
      never use the returned value). If all cases return a value of type "T",
      but the fallback case returns Nil, the type is inferred as "?T".
      == Local and non-local throw, return, and try expressions
      The keywords `return`, `throw`, and `try` now all operate on the method
      level. This means that `throw` for example will throw from the
      surrounding method, not just the surrounding closure. These are called
      non-local expressions, since they are not scoped to the surrounding
      closure.
      Local returns, throws, and try expressions are supported using the
      following keywords:
      * `local return`
      * `local throw`
      * `local try`
      These all unwind from/operate on the surrounding closure. These changes
      ensure that all these keywords operate consistently. Type compatibility
      checks have also been changed so that you can no longer assign a
      throwing closure to an argument or field that doesn't expect one. For
      example, this is no longer valid:
          def foo(block: do -> Integer) -> Integer {
            ...
          }
          foo {
            local throw 10
          }
      Here `local throw` results in the closure being inferred as
      `do !! Integer -> Integer`, which is not compatible with
      `do -> Integer`.
      == core and std merger
      The modules core::bootstrap and core::prelude have been moved to
      std::bootstrap and std::prelude respectively. There isn't really a
      reason to introduce a separate namespace for these two modules. In
      fact, doing so complicates the compiler in a few places, for no good
      reason.
  2. 29 Sep, 2020 1 commit
    • Overhaul the build process and the CLI · b64323fe
      Yorick Peterse authored
      All Makefiles have been merged into a single top-level Makefile, which
      in turn has been cleaned up. The CLI has also been revamped.
      == Easier installation process
      The installation process has also been simplified into two steps:
          make build
          make install
      If files need to be installed in a different location, such as a
      packaging jail/chroot, you can run the following:
          make build
          make install DESTDIR=example/directory
      We also support the PREFIX variable better now. By default DESTDIR is
      set to PREFIX, but you can also set them separately. This way you can
      customise where the compiler will look for source files using PREFIX,
      while still being able to install the files into a (temporary) directory
      using DESTDIR. For example:
          make build
          make install PREFIX=/usr/local DESTDIR=/tmp/package-chroot
      == Removal of pre-compiled packages
      As part of these changes, we stop providing pre-compiled binaries in the
      S3 bucket releases.inko-lang.org. This is discussed in
      #218, but in short:
      1. Few users are likely to use and benefit from these pre-compiled
         packages.
      2. Because of the compile-from-source fallback you still have to install
         all compile-time dependencies.
      3. Focusing on package managers (e.g. including Inko in the AUR) is more
      4. The compiler packages may stop working in the future, if any of the
         operating systems change things such that previously compiled
         binaries have to be recompiled.
      5. All of this requires a fair amount of complexity on our end, with
         little to no benefit.
      == New CLI
      This adds a new "inko" executable written in Rust, replacing the
      "inko-test", "inkoc", and "inko" Ruby executables, and the "ivm"
      executable.
      This new executable uses sub-commands for building Inko source files,
      running tests, etc. This executable has various paths, such as the path
      to the compiler, compiled into itself. This ensures that the "inko"
      executable always uses the correct version of the compiler, runtime,
      Usage of the new CLI is as follows:
          inko test.inko          # Compiles and runs test.inko
          inko run test.inko      # Same thing
          inko run test.ibi       # Run a bytecode image
          inko build test.inko    # Just compiles test.inko
          inko test               # Runs all unit tests in ./tests/test
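      The dispatch implied by the usage listing above can be sketched in
      Rust as follows. This is only an illustration: the `Command` enum and
      the `parse_command` function are hypothetical names, not taken from
      the actual CLI source, and only the forms shown in the listing are
      handled.

```rust
/// A sub-command of the CLI, mirroring the usage listing.
/// (Hypothetical type; the real CLI may be structured differently.)
#[derive(Debug, PartialEq)]
pub enum Command {
    /// Compile (if needed) and run a source file or bytecode image.
    Run(String),
    /// Compile a source file without running it.
    Build(String),
    /// Run the unit tests.
    Test,
}

/// Parses the arguments that follow the executable name.
pub fn parse_command(args: &[&str]) -> Result<Command, String> {
    match args {
        ["run", path] => Ok(Command::Run(path.to_string())),
        ["build", path] => Ok(Command::Build(path.to_string())),
        ["test"] => Ok(Command::Test),
        // "run" and "build" without a path are rejected in this sketch.
        ["run"] | ["build"] => Err("a file path is required".to_string()),
        // A bare file path is shorthand for "run".
        [path] => Ok(Command::Run(path.to_string())),
        _ => Err(format!("unsupported arguments: {:?}", args)),
    }
}
```

      With this sketch, `["test.inko"]` and `["run", "test.inko"]` both
      produce the same `Command::Run` value.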
      Since the compiler is still written in Ruby, the "inko" executable will
      spawn a sub-process to run the compiler. The VM is run in the same
      process as the CLI.
      To make developing the CLI a bit easier, we now use Cargo workspaces to
      separate the VM (= libinko) and the CLI.
  3. 21 Sep, 2020 2 commits
    • Allow use of the system libffi installation · b7011125
      Yorick Peterse authored
      This adds the feature flag "libffi-system" to the VM. When this flag is
      enabled, libffi is dynamically linked instead of statically linked. This
      allows the VM to use the system libffi installation.
      This feature is useful when building packages where statically linking
      libraries that the system already provides is undesirable.
    • Set LLVM_CONFIG_PATH for macOS builds · a54bbed7
      Yorick Peterse authored
      Without this, libffi builds may fail. How exactly they passed before is
      unclear; perhaps the builds were cached.
  4. 09 Sep, 2020 1 commit
  5. 24 Aug, 2020 1 commit
  6. 16 Aug, 2020 3 commits
  7. 15 Aug, 2020 6 commits
  8. 14 Aug, 2020 5 commits
    • Give tracer threads a name · 9e36965f
      Yorick Peterse authored
      This makes the output of `info threads` in GDB a little less confusing.
    • Relocate the allocate_stacktrace() function · b23d4cbb
      Yorick Peterse authored
      Since this function is only used by the process instruction functions,
      we can just move it into the same module.
    • Fix a NoMethodError when there are diagnostics · bf1c5d39
      Yorick Peterse authored
      The compiler producing errors or warnings could result in a
      NoMethodError, as we didn't take into account that there is no module
      to store in this case.
    • Don't manually install rust for license scanning · 7a8ba31a
      Yorick Peterse authored
      This is now included properly, so this is no longer necessary. Sadly PHP
      is installed from source every time, but this is a bug we can't reliably
      solve on our end.
    • Replace bytecode files with a bytecode image · 5268c5e0
      Yorick Peterse authored
      In this commit we introduce the concept of a bytecode image: a
      collection of all the bytecode modules that belong to a program. These
      images are parsed at startup, using multiple threads to speed up the
      parsing process. We also drop support for loading modules at runtime,
      specifying bytecode directories to search, and all functionality related
      to runtime module loading.
      With these changes, compiled code is reduced to a single bytecode file.
      The format of this file is such that it can be easily parsed in
      parallel, reducing the startup time. For example, the bytecode image for
      Inko's standard library tests (and all dependencies) only takes about
      8-9 milliseconds to parse.
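      The commit message does not describe the image format, so the
      following Rust sketch only illustrates why a format with cheaply
      discoverable module boundaries parses well in parallel. Everything in
      it is an assumption: the length-prefixed layout (a little-endian u32
      before each module), the function names, and the stand-in
      `parse_module`, which merely sums bytes instead of decoding bytecode.

```rust
use std::thread;

/// Splits an image into module chunks without parsing their contents.
/// Assumes a hypothetical layout: each chunk is prefixed with its length
/// as a little-endian u32. Returns None if the image is truncated.
pub fn split_modules(image: &[u8]) -> Option<Vec<&[u8]>> {
    let mut chunks = Vec::new();
    let mut rest = image;

    while !rest.is_empty() {
        if rest.len() < 4 {
            return None;
        }

        let (header, body) = rest.split_at(4);
        let len =
            u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;

        if body.len() < len {
            return None;
        }

        let (chunk, tail) = body.split_at(len);

        chunks.push(chunk);
        rest = tail;
    }

    Some(chunks)
}

/// Stand-in for real module parsing: this only sums the bytes of a chunk.
fn parse_module(chunk: &[u8]) -> u32 {
    chunk.iter().map(|&byte| byte as u32).sum()
}

/// Parses every chunk on its own scoped thread, keeping the input order.
/// A real implementation would likely cap the number of threads.
pub fn parse_image(image: &[u8]) -> Option<Vec<u32>> {
    let chunks = split_modules(image)?;
    let results: Vec<u32> = thread::scope(|scope| {
        let handles: Vec<_> = chunks
            .iter()
            .map(|chunk| scope.spawn(move || parse_module(chunk)))
            .collect();

        handles.into_iter().map(|handle| handle.join().unwrap()).collect()
    });

    Some(results)
}
```

      Because the chunk boundaries are found by reading lengths only, the
      sequential part of the work stays small and the per-module parsing
      can be spread across threads.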
      These changes remove the need for and performance impact of runtime
      module loading. This also reduces the startup time of the VM. In case of
      Inko's test suite this reduction is about 20 milliseconds. These changes
      also make distribution of programs easier, as you only need to
      distribute a single bytecode file. This should also make it easier to
      bootstrap the self-hosting compiler in the future, as we could for
      example embed the image in a compiled executable; removing the need for
      finding it somewhere on the file system.
      This fixes #43
  9. 10 Aug, 2020 2 commits
    • Add support for literal indexes beyond 65 535 · 6930068a
      Yorick Peterse authored
      The SetLiteral instruction can only be used to access a maximum of
      65 535 literals. That many literals in a single block is unlikely.
      However, when we start storing literals per module (instead of per
      block), reaching this limit is more likely.
      In this commit we introduce the SetLiteralWide instruction. This
      instruction can address up to 4 294 967 295 literals per module, which
      should be more than enough. This instruction works by storing the higher
      16 bits as the second argument, and the lower 16 bits as the third
      argument.
      SetLiteralWide is a separate instruction so the additional overhead of
      bitwise operations doesn't affect code that doesn't need to address this
      many literals.
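      The split into a higher and a lower 16-bit half can be sketched in
      Rust as follows. The function names are hypothetical; only the bit
      layout (higher 16 bits first, lower 16 bits second) follows the
      description above.

```rust
/// Splits a 32-bit literal index into its (higher, lower) 16-bit halves,
/// the form in which the two instruction arguments would carry it.
pub fn split_index(index: u32) -> (u16, u16) {
    ((index >> 16) as u16, (index & 0xFFFF) as u16)
}

/// Rebuilds the 32-bit literal index from the two 16-bit arguments.
pub fn join_index(high: u16, low: u16) -> u32 {
    (u32::from(high) << 16) | u32::from(low)
}
```

      The shift and the bitwise OR in `join_index` are the extra work that
      the narrower SetLiteral instruction avoids.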
    • Store the current module in an ExecutionContext · 133600aa
      Yorick Peterse authored
      This allows us to move literals from CompiledCode objects into Module
      objects. This in turn cuts down the amount of literal duplication in
  10. 09 Aug, 2020 1 commit
    • Reuse thread pools for GC tracer threads · 34ca7972
      Yorick Peterse authored
      The pools used for tracing objects are now created when spawning GC
      coordinators, instead of spawning threads for every garbage collection
      cycle. Spawning a thread can take between 10 and 20 microseconds, while
      creating the pool data structures themselves can take around 100
      microseconds. On systems with many threads (e.g. 32), this can easily
      lead to a GC coordinator spending at least 500 microseconds just setting
      everything up.
      The implementation is a bit unique. Each tracer pool has a "Broadcast"
      type that can be used to wake up the tracer threads. When woken up, they
      receive the process (and some other data) to trace. The last thread
      receiving the value clears it. This setup means that waking up threads
      is a constant-time operation, taking only about 4 microseconds.
      Initially I used a channel per thread, but this requires 3-4
      microseconds _per thread_ to wake them up.
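      A minimal Rust sketch of such a broadcast, built on a Mutex and a
      Condvar, is shown below. It is not the VM's actual "Broadcast" type:
      the field names, the generation counter, and the assumption that
      every receiver picks up a value before the next one is sent are all
      choices made for this illustration.

```rust
use std::sync::{Condvar, Mutex};

/// The shared state protected by the mutex.
struct State<T> {
    value: Option<T>,
    /// Number of receivers that still have to pick up the value.
    pending: usize,
    /// Incremented on every send, so receivers can detect a new value.
    generation: u64,
}

/// Hypothetical broadcast: one sender wakes up a fixed set of receivers.
pub struct Broadcast<T> {
    state: Mutex<State<T>>,
    wakeup: Condvar,
    receivers: usize,
}

impl<T: Clone> Broadcast<T> {
    pub fn new(receivers: usize) -> Self {
        Broadcast {
            state: Mutex::new(State { value: None, pending: 0, generation: 0 }),
            wakeup: Condvar::new(),
            receivers,
        }
    }

    /// Publishes a value and wakes all receivers with a single notify call,
    /// regardless of how many receivers there are.
    pub fn send(&self, value: T) {
        let mut state = self.state.lock().unwrap();

        state.value = Some(value);
        state.pending = self.receivers;
        state.generation += 1;
        self.wakeup.notify_all();
    }

    /// Blocks until a value newer than `last_seen` is published, then
    /// returns a copy of it together with its generation. The last
    /// receiver to pick up the value clears it.
    pub fn recv(&self, last_seen: u64) -> (T, u64) {
        let mut state = self.state.lock().unwrap();

        while state.generation == last_seen {
            state = self.wakeup.wait(state).unwrap();
        }

        let value = state.value.clone().expect("a value should be published");

        state.pending -= 1;

        if state.pending == 0 {
            state.value = None;
        }

        (value, state.generation)
    }

    /// Returns true once every receiver has picked up the value.
    pub fn is_cleared(&self) -> bool {
        self.state.lock().unwrap().value.is_none()
    }
}
```

      Compared to one channel per thread, the sender here does the same
      amount of work for 4 or 32 receivers. The timings quoted in the
      commit message are not reproduced by this sketch.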
      In addition to these changes, the number of tracer threads now (once
      again) equals the number of CPU cores; instead of being limited to half
      the number of cores. With the new pool setup in place I did some
      testing, and I found that on an 8-core machine the GC performs _better_
      when using 8 cores for tracing, instead of only using 4 cores.
      This fixes #191
  11. 05 Aug, 2020 2 commits
    • Use aHash instead of SipHasher and FNV · 0b55d1e2
      Yorick Peterse authored
      aHash is supposed to be quite a bit faster than both, while still being
      DoS resistant. It also requires less memory compared to SipHasher: only
      32 bytes instead of 72 bytes. This is important, because it ensures that
      all object values (ignoring any indirection they currently use) are
      either 8B, 24B, or 32B; previously the Hasher value would be the only
      one with a size of 72B. This simplifies future allocator changes, as
      there are fewer object sizes we'd have to deal with.
      Since aHash does not support obtaining the keys used, resetting hashers
      is no longer possible; at least not without storing the keys ourselves.
      As this is not terribly useful in the first place (you can just create a
      new hasher), support for resetting hashers is removed.
      This fixes #179
    • Don't store keys in a Hasher · 76276999
      Yorick Peterse authored
      The underlying SipHasher type already stores this, and offers the keys()
      method to obtain the keys. This reduces the Hasher type-size from 88
      bytes to 72 bytes.
  12. 04 Aug, 2020 1 commit
  13. 27 Jul, 2020 1 commit
  14. 15 Jul, 2020 1 commit
    • Refactor various aspects of VM instructions · 0619de82
      Yorick Peterse authored
      == Instruction memory layout
      VM instructions now have a fixed size of 16 bytes, and no longer make
      use of a separate heap-allocated Vec for their arguments. This reduces
      memory usage, and should make for more cache-friendly instruction
      Each instruction is limited to six arguments, which is enough for all
      existing instructions. Instructions that need a variable number of
      arguments, such as SetArray, make use of register ranges. Instead of
      specifying all registers, they specify the first one and a length. The
      compiler in turn makes sure all argument registers are in a contiguous
      order. This approach is also taken by Lua, though unlike Lua we don't
      require the arguments to come after the register containing the block
      to run.
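      One way to arrive at a fixed 16-byte instruction with six arguments
      is sketched below in Rust. The exact field layout is an assumption
      (six u16 arguments, a u16 line number, and a u8 opcode); the commit
      message only states the total size and the argument limit. The
      `register_range` helper is likewise hypothetical and shows how a
      first register plus a length stands in for a list of registers.

```rust
use std::mem::size_of;

/// Hypothetical instruction layout: 6 * 2 bytes of arguments, 2 bytes for
/// the line number, 1 byte for the opcode, and 1 byte of padding.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
pub struct Instruction {
    pub arguments: [u16; 6],
    pub line: u16,
    pub opcode: u8,
}

// Compile-time check that the layout above is exactly 16 bytes.
const _: () = assert!(size_of::<Instruction>() == 16);

/// Expands a register range (first register plus length) into the
/// individual registers it covers.
pub fn register_range(first: u16, length: u16) -> impl Iterator<Item = u16> {
    first..first + length
}
```

      An instruction with a variable number of inputs then needs only two
      argument slots for them, provided the compiler placed the inputs in
      consecutive registers.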
      Some instructions supported optional arguments, such as SetObject. These
      instructions have been modified to simply always require an argument.
      This simplifies the VM code, and in almost all cases the arguments were
      always specified anyway.
      For Inko's test suite, these changes reduce peak RSS usage from 27 MB
      down to 21 MB.
      == MoveResult instruction
      Values returned and thrown are handled differently. Instead of each
      ExecutionContext storing the register (of the parent frame) to write
      their result to, operations that return or throw a value now store the
      value in a per-process "result" variable. The MoveResult instruction
      moves this value into a register, setting the "result" variable to NULL.
      This approach is inspired by the Dalvik VM, and simplifies instructions
      such as Return and Throw.
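      The per-process "result" slot can be sketched in Rust as follows.
      The `Process` struct, the `Value` alias, and the method names are
      simplified stand-ins for illustration and do not mirror the VM's
      real types; `None` plays the role of NULL.

```rust
/// Stand-in for an object pointer.
pub type Value = i64;

/// Hypothetical, heavily simplified process state.
#[derive(Default)]
pub struct Process {
    /// The value most recently returned or thrown, if any.
    result: Option<Value>,
}

impl Process {
    /// What a Return or Throw would do: store the value in the slot.
    pub fn set_result(&mut self, value: Value) {
        self.result = Some(value);
    }

    /// What MoveResult would do: move the value into a register and
    /// reset the slot.
    pub fn move_result(&mut self, registers: &mut [Option<Value>], register: usize) {
        registers[register] = self.result.take();
    }

    /// Returns true if a returned or thrown value is waiting to be moved.
    pub fn has_result(&self) -> bool {
        self.result.is_some()
    }
}
```

      With this setup the callee never needs to know which register of the
      parent frame receives its result; the caller decides that when it
      runs MoveResult.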
      == Single instruction for pinning processes
      The instructions ProcessPinThread and ProcessUnpinThread have been
      merged into a single ProcessSetPinned instruction. This instruction
      behaves similarly to ProcessSetBlocking.
      == Removed instructions
      The following VM instructions have been removed as they were not used:
      * SetPrototype
      * RemoveAttribute
      * BlockSetReceiver
      == Refactoring of instruction handlers
      The functions used for handling instructions have been refactored,
      renamed (after their instructions), and are now always inlined. This
      looks a bit funny at the moment, but it should make it easier for a
      future JIT to reuse these functions. The renaming also allows one to
      import specific functions, without having to worry about generic names
      such as "get" conflicting with other functions.
      == Bytecode parser cleanup
      The bytecode parser has been cleaned up a bit, and now limits various
      data sequences to the maximum u16 value; instead of some arbitrarily
      determined limit. The u16::MAX limit ensures that registers can address
      the values directly.
      == Argument changes
      The VM no longer supports keyword arguments and rest arguments. Instead,
      the compiler takes care of translating these to positional arguments.
      Keyword arguments are translated to positional arguments, with
      unspecified arguments being passed NULL pointers. The way this works is
      as follows:
      1. Create an array with a NULL value/pointer for every _expected_
         argument. This is achieved by reserving register 0 and using that.
      2. For every positional argument passed, fill its corresponding cell. So
         argument 1 fills cell 0, argument 2 fills cell 1, etc.
      3. For every keyword argument, look up its argument position and set
         the corresponding cell.
      4. Pass this array as arguments to the VM instruction.
      This is best illustrated with a simple example:
          def foo(a = 1, b = 2, c = 3) {}
          foo(b: 10)
      Here the arguments passed would be:
          [NULL, 10, NULL]
      The VM then checks if `a` and `c` are set, sees they are NULL, and
      assigns them their default values.
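      The translation steps above can be sketched in Rust as follows. The
      function names are hypothetical, integers stand in for object
      pointers, and `None` stands in for NULL. The compiler's real
      implementation works on registers and is not shown in the commit
      message.

```rust
/// Places the passed arguments into one cell per expected parameter.
/// Cells that are not filled stay None (NULL in the VM).
pub fn positional_arguments(
    parameters: &[&str],
    positional: &[i64],
    keywords: &[(&str, i64)],
) -> Result<Vec<Option<i64>>, String> {
    if positional.len() > parameters.len() {
        return Err("too many positional arguments".to_string());
    }

    // Step 1: one empty cell for every expected argument.
    let mut cells = vec![None; parameters.len()];

    // Step 2: positional arguments fill the cells in order.
    for (index, value) in positional.iter().enumerate() {
        cells[index] = Some(*value);
    }

    // Step 3: keyword arguments fill the cell of the parameter they name.
    for (name, value) in keywords {
        let index = parameters
            .iter()
            .position(|parameter| parameter == name)
            .ok_or_else(|| format!("unknown keyword argument: {}", name))?;

        cells[index] = Some(*value);
    }

    Ok(cells)
}

/// What the callee does afterwards: replace empty cells with defaults.
pub fn apply_defaults(cells: &[Option<i64>], defaults: &[i64]) -> Vec<i64> {
    cells
        .iter()
        .zip(defaults)
        .map(|(cell, default)| cell.unwrap_or(*default))
        .collect()
}
```

      For the `foo(b: 10)` example, the first function yields the cells
      `[None, Some(10), None]`, and applying the defaults 1, 2, and 3
      gives 1, 10, and 3.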
      Validation of argument counts is also removed from the VM, now that
      dynamic method calls are no longer supported.
  15. 06 Jul, 2020 3 commits
  16. 05 Jul, 2020 2 commits
    • Remove various methods from the VM Process type · 0fdaf451
      Yorick Peterse authored
      These methods can/should be used via an ExecutionContext.
    • Don't store line numbers in ExecutionContext · d78ebee2
      Yorick Peterse authored
      Line numbers can be obtained from the instructions, based on the current
      instruction index. This doesn't reduce memory usage due to padding, but
      it simplifies some of the instruction handling logic.
      Initially I made an attempt to change the setup for line numbers
      entirely: instead of storing absolute lines per instruction, I
      implemented a line offset table similar to the one used in Python.
      This ended up reducing the size of bytecode files by about 5%, at the
      cost of increasing memory usage by about 5%. Due to the added complexity
      of said setup, I decided to not make use of it at this time.
  17. 01 Jul, 2020 2 commits
  18. 22 Jun, 2020 1 commit
    • Update various VM dependencies · e62a4f8c
      Yorick Peterse authored
      This requires a few small changes here and there for new APIs introduced
      by some crates. We also replace "dirs" with "dirs-next", as the "dirs"
      project has been archived.
  19. 20 Jun, 2020 1 commit
    • Don't clone when finding binding parents · 5276839b
      Yorick Peterse authored
      Cloning bindings when traversing their ancestors is expensive when done
      often enough. Using references is an easy way to work around this, at
      the cost of requiring two methods: Binding::find_parent() and
  20. 18 Jun, 2020 2 commits
    • Fix generation of release tarballs · baf6d686
      Yorick Peterse authored
      Generating source archives for releases was broken, and apparently had
      been for quite a while. This should fix that.
    • Fix stdout.print/stderr.print panicking · 4868f9a9
      Yorick Peterse authored
      stdout.print and stderr.print perform two writes: one for the message,
      and one for a newline. The newline write was not wrapped in a "try"
      expression, causing any exceptions thrown to go unnoticed.
      Sending messages to optional types inside a "try" was also broken. When
      doing so, the VM would jump to the incorrect instruction and effectively
      ignore the "else" clause; bubbling up the error instead. This is fixed
      by having the compiler generate different code: instead of jumping to
      the next block (whatever that may be), we jump to a specific block.
      As part of this the signature of stdout.print/stderr.print is also
      changed. Instead of the argument being typed as `?ToString`, it's now a
      `ToString` with an empty String as the default value.
      This fixes #199
  21. 11 Jun, 2020 1 commit