Bidirectional Type Checking V2 Part I 🏎 (!2103) · Merge requests · ligolang / LIGO

Motivation and Context

The aim of this series of MR's is to improve the bidirectional type checkers performance incrementally by moving towards a union-find representation of types.

There are two initial obstacles with this goal.

Firstly, the the bidirectional type checker is written as a "direct-style" state monadic computation, with the following function signature
```
  val type_checking_function 
   :  <function specific args> 
   -> raise:('err, 'wrn) raise 
   -> options:Compiler_options.middle_end
   -> ctx:Context.t
   -> Context.t * <function specific returns>
```
This directly reflected the existing formalisation of LIGO algorithmic type system, but makes the type checker extremely verbose. Moreover it prevents us from abstracting over the implementation of unification/subtyping which will (eventually) no-longer depend on Context.t but mutable side-effects.
The type checker currently uses the tree-based Ast_typed.type_expressions internally to encode types.

Description

This MR aims to refactor the entire type checker using "computations". Computations are defined by the following domain-specific language:

(** [('a, 'err, 'wrn) t] computation returns a value of type ['a], potentially 
    raising errors and warnings of type ['err] and ['wrn]. *)
type ('a, 'err, 'wrn) t

include Monad.S3 with type ('a, 'err, 'wrn) t := ('a, 'err, 'wrn) t

(** {1 Error Handling} *)
val raise : 'err Errors.with_loc -> ('a, 'err, 'wrn) t
val try_ : ('a, 'err, 'wrn) t -> with_:('err -> ('a, 'err, 'wrn) t) -> ('a, 'err, 'wrn) t
...

(** {2 Location Handling} *)

(** [loc ()] returns the current location *)
val loc : unit -> (Location.t, 'err, 'wrn) t
(** [set_loc loc t] sets the current location to [loc] in computation [t] *)
val set_loc : Location.t -> ('a, 'err, 'wrn) t -> ('a, 'err, 'wrn) t

(** {3 Compiler Options} *)
val options : unit -> (Compiler_options.middle_end, 'err, 'wrn) t

module Options : sig
  val test : unit -> (bool, 'err, 'wrn) t
  val syntax : unit -> (Syntax_types.t option, 'err, 'wrn) t
end

(** {4 Context} *)

(** ['a exit] defines the behavior when exiting a context scope, returning
    a computation of type ['a] *) 
type _ exit =
  | Drop : 'a exit (** [Drop] simply drops all context items up until the scope marker *)
  | Lift_type : (Type.t * 'a) exit (** [Lift_type] drops everything up until the scope marker *except* existential variables and applies any dropped equations to the returned type. *)
  | Lift_sig : (Context.Signature.t * 'a) exit (** Similar to [Lift_type] but for signatures. *)

module Context : sig
  (** Context lookup functions *)

  val get_value
    :  Value_var.t
    -> ( ( Context.mutable_flag * Type.t
         , [ `Mut_var_captured | `Not_found ] )
         result
       , 'err
       , 'wrn )
       t

  val get_value_exn
    :  Value_var.t
    -> error:([ `Mut_var_captured | `Not_found ] -> 'err Errors.with_loc)
    -> (Context.mutable_flag * Type.t, 'err, 'wrn) t
  
  ...

  (** [lock ~on_exit ~in_] adds a local fitch-style lock for mutable variables to 
      the context in [in_]. *)
  val lock : on_exit:'a exit -> in_:('a, 'err, 'wrn) t -> ('a, 'err, 'wrn) t


  (** [tapply type_] applies the context substitution to [type_] *)
  val tapply : Type.t -> (Type.t, 'err, 'wrn) t
end

(** [exists kind] creates a new existential variable of kind [kind] *)
val exists : Kind.t -> (Type.t, 'err, 'wrn) t

(** [for_all kind] creates a new universal variable of kind [kind] *)
val for_all : Kind.t -> (Type.t, 'err, 'wrn) t

(** [lexists ()] create a new existential layout variable *)
val lexists : unit -> (Type.layout, 'err, 'wrn) t

(** [create_type constr] returns a created type using the [constr] function
    with the location automatically provided *)
val create_type : Type.constr -> (Type.t, 'err, 'wrn) t

(** [def bindings ~on_exit ~in_] binds the context bindings [bindings] in 
    computation [in_] *)
val def
  :  (Value_var.t * Param.mutable_flag * Type.t) list
  -> on_exit:'a exit
  -> in_:('a, 'err, 'wrn) t
  -> ('a, 'err, 'wrn) t

...

(** [generalize type_and_res] generalizes the type in [type_and_res], 
    returning the generalized type, the quantified variables (and their kinds)
    and the result. *)
val generalize
  :  (Type.t * 'a, 'err, 'wrn) t
  -> (Type.t * (Type_var.t * Kind.t) list * 'a, 'err, 'wrn) t

(** {5 Unification and Subtyping} *)

type unify_error = [ ... ]

(** [unify type1 type2] unifies types [type1] and [type2] *)
val unify : Type.t -> Type.t -> (unit, [> unify_error ], 'wrn) t

type subtype_error = [ ... ]

(** [subtype ~received ~expected] ensures [received] is the subtype of [expected] 
    and returns a coercion [f] that accepts an expression of type [received] and 
    elaborates to an expression of type [expected]. *)
val subtype
  :  received:Type.t
  -> expected:Type.t
  -> ( Ast_typed.expression -> Ast_typed.expression Elaboration.t
     , [> subtype_error ]
     , 'wrn )
     t

This DSL aims to abstract over the "direct-style" approach previously used in the type checker. The underlying implementation of this DSL will be swapped out with a union-find implementation (providing the desired performance improvements).

The second change in this MR is the introduction of a local type representation in the type.ml module and an elaboration module which provides the means of "decoding" these types into Ast_typed.type_expressions.

(* Elaboration.ml *)
type 'a t

include Monad.S with type 'a t := 'a t

val all_lmap : 'a t Record.LMap.t -> 'a Record.LMap.t t
val all_lmap_unit : unit t Record.LMap.t -> unit t

include module type of Let_syntax

val decode : Type.t -> Ast_typed.type_expression t

An example usage of this API is as follows:

  let open Computation in
  let open Let_syntax in
  ...
  let%bind loc = loc () in
  return
    ( type_
    , Elaboration.(
        (* [content] has type [O.expression_content Elaboration.t] *)
        let%bind content = content in
        (* decode the [type_ : Type.t] into an [O.type_expression] *)
        let%bind type_ = decode type_ in
        (* create the output expression containing the decoded type *)
        return @@ O.make_e ~location:loc content type_) )

The crucial benefit of this DSL is that we provide a description of elaboration and inference/checking in the same place and they provide sufficient abstractions that allow us to use better data structures for type checking without changing the AST types.

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Performance improvement (non-breaking change that improves performance)
None (change with no changelog)

Checklist:

Changes follow the existing coding style (use dune @fmt to check).
Tests for the changes have been added (for bug fixes / feature).
Documentation has been updated.
Changelog description has been added (if appropriate).
Examples in changed behaviour have been added to the changelog (for breaking change / feature).

Edited Nov 22, 2022 by Alistair O'Brien

Bidirectional Type Checking V2 Part I 🏎

Motivation and Context

Description

Types of changes

Checklist:

Merge request reports