1. 13 Jun, 2022 1 commit
  2. 12 Jun, 2022 1 commit
  3. 11 Jun, 2022 3 commits
    • Scott Hamper's avatar
      Replaced `var` with explicit data types except when calling a constructor. · 2cedb5d0
      Scott Hamper authored
      In most situations, it's more clear to specify the data type rather than use `var`. The primary value of using `var` is when the variable is initialized to a new instance of a class, as the data type is already abundantly clear from the constructor, and so using `var` saves space. Beyond this situation, IntelliJ will show an inlay hint of the data type, meaning that the keyword doesn't really save any space - it only obfuscates the type in less sophisticated editors.
      2cedb5d0
    • Scott Hamper's avatar
      Simplified .gitignore. · 9e0d2225
      Scott Hamper authored
      The virtual machine crash logs have a .log extension, so it's unnecessary to have a separate rule for them.
      9e0d2225
    • Scott Hamper's avatar
      Simplified .gitignore. · 605d2712
      Scott Hamper authored
      Most significantly, started ignoring the entire .idea directory. Given that this is a pure Gradle project, none of the .idea files need to be in version control.
      605d2712
  4. 09 Jun, 2022 2 commits
  5. 08 Jun, 2022 14 commits
    • Scott Hamper's avatar
      Updated Gradle configuration for required Java version. · 8ceac1ed
      Scott Hamper authored
      My current understanding is that while `targetCompatibility` defaults to `sourceCompatibility`, it's possible for the compiled classes to still depend on newer JDK libraries if they were compiled by a newer JDK version. Setting the toolchain, in contrast, ensures that the compiled output will actually run on the specified JDK.
      8ceac1ed
    • Scott Hamper's avatar
      Added license. · add6c2a5
      Scott Hamper authored
      add6c2a5
    • Scott Hamper's avatar
      3ce09e81
    • Scott Hamper's avatar
      7e186e11
    • Scott Hamper's avatar
      Added a factorial operator for PrattParser.INT. · 92fb9eaf
      Scott Hamper authored
      I'm leaning towards getting rid of the static `INT` instances for both parsers, and requiring clients (like the REPL) to construct their own parsers. Factorial is a method that's complicated enough to warrant unit testing (e.g., making sure `0!` is `1`, throwing for negative operands), and so I don't want to define it twice for both parsers' `INT` instance. This means pulling the method out intro a central location and passing it as a method reference to a parser builder. It seems more appropriate for that central location to be in the REPL than to be in the core mxparse package. The parsers themselves don't care about the correctness of a particular function - they just care about parsing a particular grammar.
      92fb9eaf
    • Scott Hamper's avatar
      dc05235f
    • Scott Hamper's avatar
      Deleted expression package. · 0aa47933
      Scott Hamper authored
      The expression package was maybe useful when there were way more files in the project (e.g., a separate class for each operator), but doesn't seem like it provides any value now. I don't think I've ever really gotten fully used to package organization in Java - I miss how C# does things, where sub-folders can exist as an organizational tool without creating a new namespace. It's nice to be able to see the `Expression` interface and its implementations grouped together, but I don't actually want them in a different package.
      0aa47933
    • Scott Hamper's avatar
      Small readability tweaks. · 0a4629bc
      Scott Hamper authored
      0a4629bc
    • Scott Hamper's avatar
      Created a PostfixParser builder. · 07ee61e3
      Scott Hamper authored
      I really liked how the builder turned out for PrattParser. At first, I was salty that a builder only seemed necessary to compensate for Java not having named/default method parameters, but it ended up making constructing instances so much easier. The benefits are less pronounced with the PostfixParser since we're not doing nutty mutually recursive sub-parsing that the builder methods can help set up, but it's still a decent improvement over constructing maps/a set for the fields.
      
      Also renamed some fields to be more consistent with the choices I made in PrattParser.
      07ee61e3
    • Scott Hamper's avatar
      Added a non-null assertion to TokenIterator's `current` method. · 9f263a04
      Scott Hamper authored
      Seems appropriate, since `next` is also expected to throw in a similar situation.
      9f263a04
    • Scott Hamper's avatar
      Added a caveat for my "type aliases". · 7e46da83
      Scott Hamper authored
      7e46da83
    • Scott Hamper's avatar
      Refactored `nextOrNull` to behave as the name suggests it should. · 060e0cc0
      Scott Hamper authored
      The name suggests that it works like `next`, except returns `null` when there is nothing left to iterate over. However, it was only advancing the lookahead token, and did not advance the current token.
      060e0cc0
    • Scott Hamper's avatar
      Removed "Operator" from PrattParser Builder method names. · 579553b0
      Scott Hamper authored
      The main motivation was to reduce visual noise when chaining builder methods, but the old names also weren't really accurate. Not everything that acts on values is an operator - for example, functions also act on values, and can be registered as a prefix token.
      579553b0
    • Scott Hamper's avatar
      Removed FIXME comment. · 46ae3ecb
      Scott Hamper authored
      I've grown to like that the `parse` method takes in an iterable, as it allows for lazy evaluation and for a wide range of collection types to be passed in. I'm also less concerned with specifying an initial capacity for the ArrayDeque - the default value in the standard library is currently 16, and I suspect the majority of real use cases will not end up having 16+ consecutive operand tokens. Instead, naturally written postfix expressions will include operators after most operands, and the deque will have values popped off each time an operator is encountered. As such, it is rare that the deque's backing array will need to be "expanded".
      46ae3ecb
  6. 07 Jun, 2022 2 commits
    • Scott Hamper's avatar
      Created MIN_BINDING_POWER constant. · 7f919a66
      Scott Hamper authored
      Also added some assertions to make sure that valid binding powers are passed into various PrattParser/Builder methods.
      7f919a66
    • Scott Hamper's avatar
      Created TokenIterator nested class. · e757ed29
      Scott Hamper authored
      While using a `Deque<Token>` in `PrattParser` was somewhat reasonable (it has convenient peek methods), it had a couple of issues:
      - The parser could not evaluate tokens lazily. It had to iterate over all tokens up front in order to construct the deque.
      - While deques have many convenient methods, they still lack some that would help simplify the parser implementation. For example, being able to reference the most recently retrieved/removed token, and being able to consume the next token while asserting it has a specific type.
      
      Additionally, the previous PrattParser implementation only skipped over ignored token types in the `parse` overload that takes an iterable. This was problematic because the overload that accepts a deque was public, and it had to be public in order for custom sub-parsers created outside of the library to recursively call it. Both `parse` overloads should correctly skip over ignored token types if they're both going to be public.
      
      The `TokenIterator` nested class solves all of these issues by decorating an existing iterator with additional methods. By being a decorator, no changes need to be made to the lexer - the lexer can still return an ordinary `Iterable<Token>`. This is great, since the postfix parser also wants to be able to use the lexer's output, and has no need for the extra behavior implemented in `TokenIterator`.
      e757ed29
  7. 06 Jun, 2022 1 commit
    • Scott Hamper's avatar
      Fixed regex issue in the lexer. · 914ed523
      Scott Hamper authored
      Previously, the lexer would create a single regex pattern that contained individual capture groups for each token type. In order to figure out what type had been matched against, the pattern named each capture group after the token type's name. However, regex capture group names can only contain alphanumeric characters, while token type names have no such constraints. I've changed the implementation so that there is a separate regex pattern for each token type. This removes the need to use capture groups and resolves the issue.
      
      At some point, it would be interesting to do some benchmarking to find the best approach. This new implementation ends up creating as many Matcher instances as there are token types, while the previous implementation only created a single Matcher instance for a given input. For now, I am satisfied that the new implementation is simpler than the original.
      914ed523
  8. 05 Jun, 2022 8 commits
    • Scott Hamper's avatar
      Made parse method overload public. · 6a231a56
      Scott Hamper authored
      Custom sub-parsers registered via `withHead` and `withTail` may need to call the `parse` overload that accepts a right binding power.
      6a231a56
    • Scott Hamper's avatar
      Changed initial right binding power to zero. · bf848e06
      Scott Hamper authored
      In the future, it should be asserted that binding powers are greater than or equal to zero. This will ensure that the `withConsumed` method will work as intended, as it needs to associate a token type with the lowest possible binding power.
      bf848e06
    • Scott Hamper's avatar
      Adjusted PrattParser fields. · 7d894aa4
      Scott Hamper authored
      Previously, it was awkward to register the closing paren token type with a PrattParser instance. Rather than  having its own sub-parser, a closing paren gets consumed via the opening paren's sub-parser. However, it's still necessary to associate a low left binding power with the closing paren so that the paren marks the end of the expression to its left. To better support this and other tokens like it (e.g., a comma token for delimiting function parameters), I've separated the sub-parser functions from binding powers.
      
      I also added a few more builder methods to allow for registering arbitrary head/tail/consumed token types that don't fall into one of the pre-existing literal/operator/grouping categories. The pre-existing categories now use the new builder methods in their implementation.
      
      Finally, I realized that the `Function3` interface I had been using was from Kotlin, and not one from the Java standard library. I don't want the main project to have a dependency on Kotlin libraries, so I've added my own `Function3` interface.
      7d894aa4
    • Scott Hamper's avatar
      Deleted Associativity enum. · 59b32560
      Scott Hamper authored
      59b32560
    • Scott Hamper's avatar
      Renamed built-in token types. · 7e297352
      Scott Hamper authored
      Mostly shortened names to be less verbose.
      7e297352
    • Scott Hamper's avatar
      Added fluent builder API for constructing PrattParsers. · bafb4edf
      Scott Hamper authored
      This makes it much easier to create parser instances, and makes it possible to easily see the entire grammar of the built parser in a single place.
      bafb4edf
    • Scott Hamper's avatar
      Renamed InfixParser to PrattParser. · dbcbf70b
      Scott Hamper authored
      dbcbf70b
    • Scott Hamper's avatar
      Refactored InfixParser. · 61264697
      Scott Hamper authored
      This continues the work of un-static-fying things and allows other code to create new/distinct parsers. I've also added a generic type parameter to the InfixParser class, which allows for creating parsers that parse different types of expressions.
      
      Still much refactoring left to do to clean up the API and unit tests.
      61264697
  9. 02 Jun, 2022 4 commits
    • Scott Hamper's avatar
      Refactored operator classes/interfaces. · da381c1c
      Scott Hamper authored
      The Expression classes and PostfixParser don't care about associativity and precedence, so it was weird to have them use operator instances that required specifying that information. It also made creating a mapping between token types and operators for a PostfixParser instance more cumbersome. Now, these classes simply work with Function and BiFunction instances.
      
      The UnaryOperator and BinaryOperator interfaces have been turned into record classes, which makes it easier to create new types of operators. It also allows for grouping a built-in set of operators into a single file instead of spreading them out across many concrete classes. It's now much easier to see the precedence hierarchy of the built-in operators.
      da381c1c
    • Scott Hamper's avatar
      Initial refactoring of PostfixParserTests. · 5e987423
      Scott Hamper authored
      Tests now create and use their own PostfixParser instance and focus on general parsing logic, rather than being tied to the built-in token types. There's still some awkwardness with mapping token types to operators that should be addressed in a later commit.
      5e987423
    • Scott Hamper's avatar
      Refactored PostfixParser. · 9e03962c
      Scott Hamper authored
      This continues the work of un-static-fying things and allows other code to create new/distinct parsers. I've also added a generic type parameter to the PostfixParser class, which allows for creating parsers that parse different types of expressions.
      
      Unit tests for PostfixParser can and should be reworked to no longer exhaustively test all the built-in token types. They should be concerned with the general parsing logic.
      9e03962c
    • Scott Hamper's avatar
      Refactored Token.Type into a record class. · 1c6ca86c
      Scott Hamper authored
      This continues the work of un-static-fying things and allows other code to create new token types instead of being restricted to the built-in types.
      1c6ca86c
  10. 01 Jun, 2022 2 commits
    • Scott Hamper's avatar
      Refactored tokenize into an instance method. · 69ad5711
      Scott Hamper authored
      This change allows individual Lexer instances to tokenize different grammars by configuring the mapping between token types and regular expressions at construction time. By pulling regular expressions out of Token.Type, we also end up with a more appropriate model - the Lexer is the only component that is concerned with regular expressions. This is the first change of many to move back towards not having everything be static.
      
      Back when this project was a hastily-created demo for my comp sci students, I had started with the Lexer and Parsers being non-static. In that implementation, I had token types and operators get registered *after* construction. However, due to my quick/messy implementation, there was some awkward code duplication between the lexer and parsers around what strings/characters mapped to what. I decided to move towards things being static to remove that duplication and because I thought it would be easier for my students to follow what the code was doing. Now, I have some ideas for how to keep things non-static while still being convenient to use and without unnecessary duplication.
      69ad5711
    • Scott Hamper's avatar
      Simplified generic type parameters. · e8fadbe9
      Scott Hamper authored
      Having separate generic types for a composite expression's components (operand(s)/return type) is a nice idea, but not something I'm going to actually use. I originally thought maybe I'd support integer math (like division) mixed in with floating point math, having the parser do implicit type conversions when appropriate. However, I'm going to move towards having all calculations being done as floating point math. The goal of this project is not to model how Java evaluates mathematical expressions, but to model a "normal" calculator. Additionally, I'm not really sure how having separate generic type parameters for each component of an expression would have helped support mixed typed expression parsing - I would expect to need to use copious amounts of reflection, but Java has type erasure, sooo...
      
      I'll still leave a single generic type parameter for expressions/operators, rather than hard-coding a type. I like that it leaves those classes/interfaces flexible to be used in other contexts. If I decide I want to switch to a different number type, there's fewer places to change. If I want to build a boolean expression parser, the generic interfaces could be reused. These scenarios are probably not likely or frequent, but leaving a generic type parameter in place adds negligible verbosity to the code.
      e8fadbe9
  11. 30 May, 2022 2 commits