Skip to content

[#600] Add fixed-size lists to Morley

Nikolay Yakimov requested to merge lierdakil/#600-sized-lists into master

Description

Problem: There are some use-cases (and more are popping up) that could benefit from the use of homogeneous fixed-size lists.

Solution: add a simple fixed-size lists module Morley.Utils.SizedList. Use those where appropriate.

I've used SizedList in a couple places mentioned in #600 (closed), but pretty sure there are more. A cursory look at the code didn't reveal any other opportunities, so at this point I figured it's easier to ask for pointers. So if anyone has an idea where it could be useful, please share.

Some design notes:

I've played around with the idea of making SizedList a wrapper around regular lists, but this approach has some drawbacks: namely, patterns would have to rely on comparisons between naturals, which the compiler can't always deduce without additional help.

So instead, I've opted for a very cookie-cutter GADT Peano-based approach to constructing the list type. The drawback here is that conversion to/from regular lists is O(n) instead of O(1), but arguably, since we're not intending to use this for particularly large lists, this isn't a big issue.

With most utility functions, I've opted to implement those using a verifiable approach: that is, I try to avoid using "unsafe" functions. The only real exception here is reverse, which also can be constructed verifiably, but at the cost of complexity, and to a human it's rather obvious that the current implementation (through conversion to a regular list) is safe.

To achieve this, I've had to add some inductive "proofs" to "Morely.Util.Peano". AFAIU, Refl has the same runtime representation regardless of what it signifies, so hopefully the optimizer will be smart enough to drop those in the compiled code. Otherwise, since proofs are inductive, they would incur an additional O(n) runtime cost. If that is the case, using unsafeCoerce to simply assert the theorem would be less costly, but that's not verified by the compiler, so can eventually lead to inconsistent behaviour. After some simple tests, I'm under the impression GHC isn't smart enough to throw these proofs away with -O1 at least, so I've added rewrite rules to proofs. This should be safe, since all these proofs only exist on the type level, and their runtime representation is mostly ignored (and it should be the same anyway). We could also just assert the consequent at use sites, but as I said, the compiler won't verify if those assertions are true, and we might get some inconsistencies at runtime if we're not careful (well, the only place where it matters currently is in generate')

I've tried to implement a minimal subset of the typical menagerie of list utility functions, with some exceptions. For instance, takeWhile etc don't really work well with sized lists, since we don't know the resulting length at compile-time.

Functions that take explicit length and/or index, for sized lists instead take a type-level natural as a type argument. In most cases there are alternative versions that accept a SingNat, which is a Peano singleton (mostly useful when there is already a singleton present)

The Applicative and Monad instances are somewhat unintuitive. Applicative is a "zipper", i.e. pure is replicate and <*> is zipWith ($) -- this satisfies the Applicative laws (arguably for sized lists it's the only meaningful instance that satisfies those). Monad is even stranger: basically, it's isomorphic to the Reader monad, where "environment" is element index. This instance also satisfies the Monad laws, obviously, and is consistent with "zipper" Applicative, but it's debatable whether it's particularly useful. At least, zipWithN can be implemented using the do notation (although it's somewhat inefficient)

A bit unclear about the tests situation here: I can obviously add tests (arguably those would be somewhat redundant though), but where? Cleveland seems a rather strange place to add tests for a utility module, and morley doesn't have its own test-suite. Perhaps doctests are good enough.

Some concerns:

I'm using rewrite rules with unsafeCoerce to sidestep some computational complexity wrt formal verification. I have to admit I'm far from being an expert on GHC's internals, so while I'm reasonably certain this shouldn't blow up, I can't guarantee that. Someone with decent knowledge of this stuff should probably take a look. This still should be safer than throwing away type information and asserting that everything's fine.

index and index' work well enough when all length and indices are known a priori, however those run into the type-level natural comparison quirks the moment something goes off the "golden path". TL;DR, compiler sometimes might need some help deducing that if n > m and m > k then n > k. For that, I've added the transitivity proof to Morley.Util.Peano, but ergonomics aren't particularly great. Perhaps indexing with a finite Peano GADT would be easier.

generate' uses a pair of (SingNat n, Natural) as generator argument. Not sure if using PeanoNatural (which is isomorphic) here would be more or less intuitive.

Related issue(s)

Resolves #600 (closed)

Checklist for your Merge Request

Related changes (conditional)

  • Tests (see short guidelines)

    • If I added new functionality, I added tests covering it.
    • If I fixed a bug, I added a regression test to prevent the bug from silently reappearing again.
  • Documentation

    • I checked whether I should update the docs and did so if necessary:
    • I updated changelog files of all affected packages released to Hackage if my changes are externally visible.

Holding off on changelog update until review.

Stylistic guide (mandatory)

Edited by Nikolay Yakimov

Merge request reports