Tags give the ability to mark specific points in history as being important
  • misc.summary
    Set up basic project structure and metadata
    
    Welcome to the semver-parser tutorial!
    
    *This tutorial is an example of a "literate git" project.  For an
    explanation of the motivations for this project, view the report
    [here][3].*
    
    Our goal is to build a Rust library that can parse [semver][0] version
    strings.  The maintainer of the [semver crate][1] has decided to split
    off the parsing operations to a separate crate, and he assigned us to
    implement it.
    
    We'll document both the steps we take to build the crate and the reasons
    (or lack thereof) behind our implementation choices.  When someone comes
    along later with an idea for improving our program, they can read our
    tutorial for more information about the project's structure and history.
    
    There are two main public functions we need to expose to be able to plug
    in to the semver crate.  The `version::parse` function transforms a
    version string into a struct with individual fields for each part of the
    version number.  The `range::parse` function takes a string that
    specifies a range of matching versions and returns a struct containing
    one or more predicates which can be compared to individual versions.
    
    Lucky for us, there is already a large set of test cases that we can use
    to verify our results.  Since the function names and return types are
    already mostly taken care of, we can just focus on passing those tests.
    Ah, the life of a junior programmer!
    
    Before we start putting together our functions, let's do some basic
    setup. We're going to use the [Cargo][2] package manager and build
    system to help streamline development, since the semver crate is already
    using it.
    
    Check out the initial project structure below, then click the arrow
    above to move on to the next step.
    
    .gitignore
    ----------
    
    Cargo automatically creates a basic .gitignore file for us.  The target
    directory is where build artifacts land, so we don't want to be tracking
    that.  Because this is a library, we also ignore the Cargo.lock file.
    If we were building an executable application, we would not ignore it.
    
    Cargo.toml
    ----------
    
    The Cargo.toml file contains metadata about our crate.  This includes
    links to related resources, dependency lists, license information, and
    many other possible fields.  Since we have great foresight, we'll just
    fill this all in right now and forget about it.  If we didn't have a
    crystal ball, we'd be updating this along the way.
    
    [0]: http://semver.org/
    [1]: https://crates.io/crates/semver
    [2]: http://doc.crates.io/index.html
    [3]: https://www.sabbey.net/litgit
  • version.summary
    Implement `version::parse`
    
    The `version::parse` function takes a semver version string as input and
    returns a `Result` containing either a `Version` struct or an error
    string.
    
    The `Version` struct contains a field for each possible component of a
    semver version string.  The `major`, `minor`, and `patch` fields are all
    required integers.  The `pre` and `build` fields consist of zero or more
    alphanumeric or numeric identifiers.
    
    This is a bit too much code to explain at once, so click the `+` symbol
    on the left for a step-by-step explanation.  Once you're done, click the
    arrow at the top right to continue on to the range parsing function.
    
    *What about that test suite you mentioned?*
    
    There are a lot of tests! To keep things clean, we'll show them all in
    the final step of the tutorial.
  • range.1.summary
    Start out with the range-matching regex
    
    This is going to be a big file.  Before we get to a working version,
    let's introduce the pieces in smaller chunks so we can explain each
    part.
    
    src/lib.rs
    ----------
    
    We need to add the range module to our public interface.
    
    src/range.rs
    ------------
    
    Our use of `lazy_static!` is similar to the way we used it in version
    parsing.  The main differences are we now look for an operation before
    the major version, and each of the major, minor, and patch versions can
    be either a number or a wildcard.
    
    There is no matching for build tags, either.  Apparently build tags
    are not to be used on range specifications...or are they?
  • range.summary
    Implement `range::parse`
    
    The `range::parse` function takes a string containing one or more range
    specifications and returns a `VersionReq` with the parsed
    representation.
    
    The `VersionReq` contains a vector of `Predicate`.  For a semver version
    to match the range, it must match each predicate.  We don't have to do
    the matching; that's handled by the semver crate.  Here, we just need to
    parse the string into the individual predicates and return the result.
    
    `Predicate` is similar to the `Version` struct we worked with in the
    version parsing module.  It contains the major, minor, patch, and pre
    fields, but their types are a little different.  Minor and patch
    versions are now optional, since a range can omit them.
    
    There is also an `Op` field.  Op stands for operator.  Ranges look like
    a version string with an optional operator in front.  The operators are
    defined in the [Cargo documentation][0].  One of the major, minor, or
    patch versions can also be a wildcard.  If a wildcard is used, the
    `Wildcard` operator is chosen.
    
    Let's go through the creation of `range::parse` step-by-step.  Similar
    to our tutorial for `version::parse`, we will start with a partial
    implementation and fill it out as we move along.  By the end, we will be
    passing our entire test suite.
    
    Click the plus sign below to view the step-by-step guide.
    
    *But where are those tests!?*
    
    They're all listed in the final step of this guide.  There are so many
    that they clutter up the diffs and make the guide hard to follow.
    
    [0]: http://doc.crates.io/specifying-dependencies.html#specifying-dependencies-from-cratesio
  • version.7.summary
    Finish up, move parse_meta to common.rs
    
    We're almost done! Just some little cleanup things remain.
    
    src/version.rs
    --------------
    
    Turns out our tests and the semver package we're writing this for expect
    the function to be called `parse` rather than `parse_version`.  We'll
    fix that now.
    
    Also, it seems likely that the `parse_meta` function will be useful in the
    range parsing module we will be making in the next step. Let's move that
    to a new file, common.rs.
    
    Now that `parse_meta` lives in another file, we will `use common` and
    call it with `common::parse_meta`.
    
    src/lib.rs
    ----------
    
    Add in the (private) common module.  We don't need to add this to our
    public interface.
    
    src/common.rs
    -------------
    
    Let's make a couple changes to `parse_meta` while we're at it.  First,
    we will change the argument name from `pre` to `s`, since it is used for
    both prerelease and build tags.  We'll also split out the alphanumeric
    check to a separate function.  Finally, we need to accept a plain number 0
    as a Numeric, so a slight adjustment is needed on the regex here.
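
    Here is a rough, std-only sketch of what the split-out check could look
    like.  The tutorial does this with a regex; the hand-rolled digit test
    below is a stand-in, and treating a leading-zero string such as `01` as
    alphanumeric is our assumption about the intended behavior.

    ```rust
    /// Returns true unless `s` is a plain zero or a run of digits with no
    /// leading zero.  (Std-only stand-in for the regex check.)
    pub fn is_alpha_numeric(s: &str) -> bool {
        let all_digits = !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit());
        let leading_zero = s.len() > 1 && s.starts_with('0');
        // Numeric means: only digits, and no leading zero.
        !(all_digits && !leading_zero)
    }
    ```

    With this in place, `parse_meta` only has to ask `is_alpha_numeric` which
    variant to build for each dot-separated piece.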
  • version.6.summary
    Add traits to structs and enums in version.rs
    
    We can make `Identifier` and `Version` more useful by adding some
    traits.  Traits act like interfaces; the compiler knows that a type with
    a trait implemented on it can perform certain operations.
    
    src/version.rs
    --------------
    
    We can add the [Clone][0], [Debug][1], [PartialEq][2], and [Eq][3]
    traits by simply attaching a `derive` attribute.  Rust automatically
    generates the code for these traits for us.
    
    To add the [Display][4] trait, we need to implement it ourselves.  We
    just need to fill out the `fmt` function for each.  The implementation
    for `Version` makes use of the [write! macro][5].
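
    As a minimal sketch of both approaches, the block below derives the four
    traits and writes `Display` by hand.  The field types and the exact
    output format (hyphen before `pre`, plus sign before `build`) are
    assumptions based on semver syntax, and `join` is a hypothetical helper.

    ```rust
    use std::fmt;

    #[derive(Clone, Debug, PartialEq, Eq)]
    pub enum Identifier {
        Numeric(u64),
        AlphaNumeric(String),
    }

    #[derive(Clone, Debug, PartialEq, Eq)]
    pub struct Version {
        pub major: u64,
        pub minor: u64,
        pub patch: u64,
        pub pre: Vec<Identifier>,
        pub build: Vec<Identifier>,
    }

    impl fmt::Display for Identifier {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            match self {
                Identifier::Numeric(n) => write!(f, "{}", n),
                Identifier::AlphaNumeric(s) => write!(f, "{}", s),
            }
        }
    }

    // Hypothetical helper: render identifiers separated by dots.
    fn join(ids: &[Identifier]) -> String {
        let parts: Vec<String> = ids.iter().map(|id| id.to_string()).collect();
        parts.join(".")
    }

    impl fmt::Display for Version {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "{}.{}.{}", self.major, self.minor, self.patch)?;
            if !self.pre.is_empty() {
                write!(f, "-{}", join(&self.pre))?;
            }
            if !self.build.is_empty() {
                write!(f, "+{}", join(&self.build))?;
            }
            Ok(())
        }
    }
    ```

    Implementing `Display` also gives us `to_string()` for free.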
    
    [0]: https://doc.rust-lang.org/core/clone/trait.Clone.html
    [1]: https://doc.rust-lang.org/core/fmt/trait.Debug.html
    [2]: https://doc.rust-lang.org/core/cmp/trait.PartialEq.html
    [3]: https://doc.rust-lang.org/core/cmp/trait.Eq.html
    [4]: https://doc.rust-lang.org/std/fmt/trait.Display.html
    [5]: https://doc.rust-lang.org/std/fmt/#write
  • test.summary
    Add tests
    
    Here they are! All the tests.
    
    We're done implementing everything.  All the tests pass.  We made a Rust
    crate!
  • range.7.summary
    Add build tag to regex
    
    As hinted earlier, apparently we do need to allow for a build string in
    the range specification.
    
    src/range.rs
    ------------
    
    If there was a build string in the predicate, we'll at least accept it.
    So far we aren't doing anything with this capture group, but it's there.
  • range.6.summary
    Add support for multiple predicates
    
    We're supposed to be able to handle multiple predicates, but so far our
    code only works with one.
    
    src/range.rs
    ------------
    
    Let's split the parse function up.  Most of what we already had will be
    called `parse_predicate`.  It works on one predicate at a time.
    
    We'll use `parse` to split apart the string on commas and send each
    portion to `parse_predicate`. A quick bit of error checking verifies
    that we have at least one predicate, then we can return our
    `VersionReq`.
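
    The split can be sketched roughly as follows.  To keep the example
    self-contained, `Predicate` is reduced to a placeholder holding the raw
    text, and `parse_predicate` is a stub; both are assumptions, as are the
    error messages.

    ```rust
    #[derive(Debug, PartialEq)]
    pub struct Predicate {
        pub raw: String, // placeholder for the real parsed fields
    }

    #[derive(Debug, PartialEq)]
    pub struct VersionReq {
        pub predicates: Vec<Predicate>,
    }

    // Stub: the real function runs the regex on one predicate.
    fn parse_predicate(s: &str) -> Result<Predicate, String> {
        let s = s.trim();
        if s.is_empty() {
            return Err(String::from("empty predicate"));
        }
        Ok(Predicate { raw: s.to_string() })
    }

    pub fn parse(ranges: &str) -> Result<VersionReq, String> {
        let mut predicates = Vec::new();
        // Each comma-separated piece is parsed on its own; the first
        // failure aborts the whole parse.
        for part in ranges.split(',') {
            predicates.push(parse_predicate(part)?);
        }
        // Defensive check that at least one predicate was found.
        if predicates.is_empty() {
            return Err(String::from("no predicates found"));
        }
        Ok(VersionReq { predicates })
    }
    ```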
  • range.5.summary
    Add major, minor, patch, and pre support
    
    So much for splitting this into multiple steps! Turns out, there is a
    lot of repetition going on here.  We can describe what's going on in a
    lot less space than the large diff takes up.
    
    src/range.rs
    ------------
    
    First, we run the regex on our string and get the `captures` variable.
    Then we get a variable out of each field we captured.  Finally, we build
    a `Predicate` with them and return it.
    
    In gathering the operation, the `map(str::parse)` call runs the
    `FromStr` code we implemented in a previous step.  If no operation is
    listed, we use the default, `Op::Compatible`.
    
    Parsing the major version number might result in an error such as
    integer overflow, so we need to be a little careful with a match instead
    of just unwrapping.
    
    The minor and patch fields go through very similar steps.  They are both
    optional and could also encounter errors, so multiple layers of match
    are used.  The innermost match checks if the version was a wildcard. If
    it was, we override the `Op` that was set, instead changing it to the
    applicable `WildcardVersion`.
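
    The layered match for the minor version might look something like this
    sketch.  Here `capture` stands in for an optional regex capture group,
    `parse_minor` is a hypothetical name, and the wildcard spellings `*`,
    `x`, and `X` are an assumption.

    ```rust
    #[derive(Debug, PartialEq, Clone, Copy)]
    pub enum WildcardVersion {
        Major,
        Minor,
        Patch,
    }

    #[derive(Debug, PartialEq, Clone, Copy)]
    pub enum Op {
        Ex,
        Compatible,
        Wildcard(WildcardVersion),
        // remaining operators omitted
    }

    /// Returns the minor version, if any, and overrides `op` when the
    /// captured text is a wildcard.
    pub fn parse_minor(capture: Option<&str>, op: &mut Op) -> Result<Option<u64>, String> {
        match capture {
            // The range omitted the minor version entirely.
            None => Ok(None),
            Some(text) => match text {
                "*" | "x" | "X" => {
                    *op = Op::Wildcard(WildcardVersion::Minor);
                    Ok(None)
                }
                // Parsing can still fail, for example on integer overflow.
                digits => match digits.parse::<u64>() {
                    Ok(n) => Ok(Some(n)),
                    Err(e) => Err(format!("invalid minor version '{}': {}", digits, e)),
                },
            },
        }
    }
    ```

    The patch field would follow the same shape with
    `WildcardVersion::Patch`.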
  • range.4.summary
    Add the parsing function, with just major wildcard support
    
    Finally, it's here! The function that does the real work.  We'll put
    this together in a couple steps to keep things simple.
    
    src/range.rs
    ------------
    
    So, we parse the range string into a `Result<VersionReq, String>`.  This
    is similar to the signature of `version::parse`.
    
    If the string was null, we'll just respond with an error.
    
    The only kind of version we'll accept for now is a wildcard in the major
    version number.  This can be represented in four different ways.  We
    just create a single-element vector with a `Predicate` inside and return
    it.
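
    A rough sketch of this first cut is below.  Several details are guesses
    on our part and not confirmed by the text above: that "null" means a
    lone NUL character, that the four wildcard spellings are the empty
    string, `*`, `x`, and `X`, and that `major` is set to 0 for a wildcard.
    The `pre` field and the other operators are left out for brevity.

    ```rust
    #[derive(Debug, PartialEq)]
    pub enum WildcardVersion {
        Major,
        Minor,
        Patch,
    }

    #[derive(Debug, PartialEq)]
    pub enum Op {
        Wildcard(WildcardVersion),
        // other operators omitted
    }

    #[derive(Debug, PartialEq)]
    pub struct Predicate {
        pub op: Op,
        pub major: u64,
        pub minor: Option<u64>,
        pub patch: Option<u64>,
    }

    #[derive(Debug, PartialEq)]
    pub struct VersionReq {
        pub predicates: Vec<Predicate>,
    }

    pub fn parse(ranges: &str) -> Result<VersionReq, String> {
        // Assumption: a null string is a single NUL character.
        if ranges == "\0" {
            return Err(String::from("null is not a valid range"));
        }
        match ranges {
            // Assumption: these are the four major-wildcard spellings.
            "" | "*" | "x" | "X" => Ok(VersionReq {
                predicates: vec![Predicate {
                    op: Op::Wildcard(WildcardVersion::Major),
                    major: 0,
                    minor: None,
                    patch: None,
                }],
            }),
            other => Err(format!("range not supported yet: {}", other)),
        }
    }
    ```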
  • range.3.summary
    Add traits for range parsing
    
    Implementing traits will give our new structs and enums some extra
    behavior.  Most of them come along automatically!
    
    src/range.rs
    ------------
    
    We can derive Debug and PartialEq with just the derive annotation.
    
    The [FromStr][0] trait lets us turn a string into a value of the type.
    So when we find an equal sign, we can automatically turn it into the
    `Op::Ex` operator.
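
    A minimal sketch of such an implementation follows.  Only `Ex`,
    `Compatible`, and `Wildcard` are named in this tutorial; the other
    variant names, the symbol-to-variant mapping, and the error text are
    assumptions modeled on the operators in the Cargo documentation.

    ```rust
    use std::str::FromStr;

    #[derive(Debug, PartialEq)]
    pub enum WildcardVersion {
        Major,
        Minor,
        Patch,
    }

    #[derive(Debug, PartialEq)]
    pub enum Op {
        Ex,         // =
        Gt,         // >   (assumed name)
        GtEq,       // >=  (assumed name)
        Lt,         // <   (assumed name)
        LtEq,       // <=  (assumed name)
        Tilde,      // ~   (assumed name)
        Compatible, // ^
        Wildcard(WildcardVersion),
    }

    impl FromStr for Op {
        type Err = String;

        fn from_str(s: &str) -> Result<Op, String> {
            match s {
                "=" => Ok(Op::Ex),
                ">" => Ok(Op::Gt),
                ">=" => Ok(Op::GtEq),
                "<" => Ok(Op::Lt),
                "<=" => Ok(Op::LtEq),
                "~" => Ok(Op::Tilde),
                "^" => Ok(Op::Compatible),
                _ => Err(format!("could not parse operator: {}", s)),
            }
        }
    }
    ```

    Once `FromStr` is implemented, `"=".parse::<Op>()` works on any string
    slice.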
    
    [0]: https://doc.rust-lang.org/std/str/trait.FromStr.html
  • range.2.summary
    Add enums and structs for range parsing
    
    Let's define all the nifty types we'll need!
    
    src/range.rs
    ------------
    
    We've described some of these back in the introduction to range parsing,
    but a little more won't hurt.
    
    `VersionReq` contains one or more predicates.  This is what we need to
    supply to the semver crate.
    
    `Predicate` is similar to the `Version` struct we used in the version
    parsing module, but the types are slightly different. We have added the
    Op field and the minor and patch versions are now optional.
    
    `Op` stands for operator.  The operator defines how to match a specific
    version to this predicate.
    
    `WildcardVersion` is used for the `Wildcard` operator.  There might be a
    wildcard at the major, minor, or patch version.  Since those fields are
    numeric types, we can't store a wildcard in them directly.  Instead, we
    record it in the operator.
    
    We'll also pull in `version::Identifier`, for use with the prerelease
    strings.  I wonder if that should have been `common::Identifier`. Hmm.
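
    Put together, the types could be sketched like this.  The integer types
    are assumptions, only the operators named in this tutorial are listed,
    and `minor_wildcard` is a hypothetical helper added just to show how a
    range like `1.*` might be represented.

    ```rust
    #[derive(Debug, PartialEq)]
    pub enum Identifier {
        Numeric(u64),
        AlphaNumeric(String),
    }

    #[derive(Debug, PartialEq)]
    pub enum WildcardVersion {
        Major,
        Minor,
        Patch,
    }

    #[derive(Debug, PartialEq)]
    pub enum Op {
        Ex,
        Compatible,
        Wildcard(WildcardVersion),
        // remaining operators omitted
    }

    #[derive(Debug, PartialEq)]
    pub struct Predicate {
        pub op: Op,
        pub major: u64,
        pub minor: Option<u64>,
        pub patch: Option<u64>,
        pub pre: Vec<Identifier>,
    }

    #[derive(Debug, PartialEq)]
    pub struct VersionReq {
        pub predicates: Vec<Predicate>,
    }

    // Hypothetical helper: a wildcard in the minor position leaves the
    // numeric fields empty and records the wildcard in `op`.
    pub fn minor_wildcard(major: u64) -> Predicate {
        Predicate {
            op: Op::Wildcard(WildcardVersion::Minor),
            major,
            minor: None,
            patch: None,
            pre: Vec::new(),
        }
    }
    ```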
  • version.5.summary
    Properly parse prerelease and build tags
    
    Time to fix the types on our prerelease and build tags.  Instead of just
    reading the whole thing into a string, we will break them apart on
    periods into either alphanumeric or numeric identifiers.
    
    src/version.rs
    --------------
    
    Semver's prerelease and build tag [specification][0] describes tags as
    either alphanumeric or numeric.  Instead of using the type
    `Option<Vec<String>>` for the `pre` and `build` fields, let's create the
    `Identifier` enum.  Identifiers may be either `AlphaNumeric` or
    `Numeric`.
    
    Note that we removed the `Option` around the type of `pre` and `build`.
    Our tests expect just `Vec<Identifier>`, so we will return an empty
    vector if nothing is present, rather than a `None`.  The
    `unwrap_or(vec![])` on the lines assigning `pre` and `build` assigns
    them empty vectors when no match was found.
    
    Since we have to parse out the metadata twice, we'll use the
    `parse_meta` function.  The operation is fairly straightforward: just
    split on periods.  If all the characters in a piece are digits, call it
    a `Numeric`; otherwise, call it `AlphaNumeric`.
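
    Here is a small sketch of that idea, assuming `Numeric` holds a `u64`
    and `AlphaNumeric` holds a `String`.  The `tags` helper is hypothetical;
    it stands in for the lines that turn an optional regex capture into a
    vector, and the fallback on overflow is our own choice.

    ```rust
    #[derive(Debug, PartialEq)]
    pub enum Identifier {
        Numeric(u64),
        AlphaNumeric(String),
    }

    pub fn parse_meta(s: &str) -> Vec<Identifier> {
        s.split('.')
            .map(|part| {
                let all_digits = !part.is_empty() && part.chars().all(|c| c.is_ascii_digit());
                match part.parse::<u64>() {
                    Ok(n) if all_digits => Identifier::Numeric(n),
                    // Non-digit pieces, and digit runs too large for u64,
                    // are kept as text.
                    _ => Identifier::AlphaNumeric(part.to_string()),
                }
            })
            .collect()
    }

    // Hypothetical helper: a missing capture becomes an empty vector.
    pub fn tags(capture: Option<&str>) -> Vec<Identifier> {
        capture.map(parse_meta).unwrap_or(vec![])
    }
    ```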
    
    [0]: http://semver.org/#spec-item-9
  • version.4.summary
    Add support for build tags
    
    This should look familiar.
    
    The [build tag][0] is denoted by a plus sign followed by a series of
    dot-separated identifiers.  We'll use the exact same pattern we followed
    to add prerelease tag support.
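
    To make the layout of a full version string concrete, here is a
    std-only sketch that pulls the pieces apart by hand.  The tutorial does
    this with regex capture groups; `split_tags` is a hypothetical function
    used only for illustration.

    ```rust
    /// Splits a version string into (core, prerelease, build).
    pub fn split_tags(version: &str) -> (&str, Option<&str>, Option<&str>) {
        // The build tag is everything after the first plus sign.
        let (rest, build) = match version.find('+') {
            Some(i) => (&version[..i], Some(&version[i + 1..])),
            None => (version, None),
        };
        // The prerelease tag is everything after the first hyphen that
        // comes before the build tag.
        let (core, pre) = match rest.find('-') {
            Some(i) => (&rest[..i], Some(&rest[i + 1..])),
            None => (rest, None),
        };
        (core, pre, build)
    }
    ```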
    
    [0]: http://semver.org/#spec-item-10
  • version.3.summary
    Add support for prerelease tags
    
    Semver allows [prerelease tags][0], denoted by appending a hyphen and a
    series of dot-separated identifiers immediately following the patch
    version.  Let's add support for them!  Or at least partial support.
    This might take a couple steps.
    
    src/version.rs
    --------------
    
    The allowed characters in the prerelease label are alphanumerics,
    hyphens, and periods.  We'll call that `letters_numbers_dash_dot` and
    tack on an optional prerelease version in our regex.  Since that line
    was getting a bit long, we'll add *x mode*, which allows us to describe
    the regex over multiple lines, with comments for each important bit.
    
    We also have to add the `pre` field to the `Version` struct and capture
    it from the regex.  For now, we're just reading the whole string into an
    optional single-element vector.  Later on, we will have to break it
    apart into individual dot-separated identifiers.
    
    [0]: http://semver.org/#spec-item-9
  • version.2.summary
    Disallow leading zeros and return Result
    
    When we started applying our test suite to the result of the previous
    step, we found an error.  Time to fix it!
    
    src/lib.rs
    ----------
    
    Remember the `lazy_static` dependency we listed in `Cargo.toml`?  The
    [lazy static][0] crate gives us a macro that allows us to declare
    statics that are evaluated at runtime.
    
    *What?*
    
    A static in Rust is like a global constant.  Statics live for the
    entire lifetime of your program at a fixed memory location.  Normally
    they are limited to simple values, but with the `lazy_static!` macro, we
    can create statics that require function calls or heap allocations.
    We'll use it to make a static `Regex` that matches semver version strings.
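
    To illustrate the idea without pulling in extra crates, the sketch
    below uses the standard library's `OnceLock` in place of
    `lazy_static!`, and lazily builds a pattern string instead of a
    compiled `Regex`.  The function name and the pattern text are
    assumptions.

    ```rust
    use std::sync::OnceLock;

    /// Builds the pattern on first use and returns the same `String`
    /// on every later call.
    pub fn version_pattern() -> &'static str {
        // A plain `static` cannot call `format!`, which allocates on the
        // heap, so the value is initialized lazily at runtime.
        static PATTERN: OnceLock<String> = OnceLock::new();
        PATTERN.get_or_init(|| {
            let numeric_identifier = r"0|[1-9][0-9]*";
            format!(r"^({0})\.({0})\.({0})$", numeric_identifier)
        })
    }
    ```

    The closure runs at most once, no matter how many times
    `version_pattern` is called.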
    
    src/version.rs
    --------------
    
    There are two separate things happening at once here.  We made some
    upgrades to our regex, then changed our return type.
    
    First, we moved the regex creation up out of `parse_version` into the
    `lazy_static!` macro.  The result of the macro is the static called
    `REGEX` which we can use to match version strings.
    
    The regex declaration went from one line to over a dozen, and we
    switched from matching on `\d+` to matching on the more sophisticated
    `numeric_identifier` pattern.  This pattern allows a plain zero or a
    number starting with any other digit, but not a number with a leading
    zero.
    
    Back in `parse_version`, we have changed from an `unwrap()` on
    `REGEX.captures()` to a match.  The match lets us return an error string
    when the supplied version number is not matched by the regex.  In the
    version of our program from the previous step, any error in matching the
    regex would have caused a panic.
    
    Since we can now return either a `Version` on success or a `String` on
    an error, we have changed the return type from `Version` to
    `Result<Version, String>`.  When our result is an error, we `return
    Err(...)`, and when the result is a Version, we `return Ok(Version...)`.
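
    The shape of the change can be sketched without the regex crate.  In
    this std-only stand-in, `numeric_identifier` is a hand-written check
    rather than a regex pattern, and the error text is an assumption.

    ```rust
    #[derive(Debug, PartialEq)]
    pub struct Version {
        pub major: u64,
        pub minor: u64,
        pub patch: u64,
    }

    // Accepts a plain zero or digits that do not start with zero.
    fn numeric_identifier(s: &str) -> Option<u64> {
        let all_digits = !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit());
        if !all_digits || (s.len() > 1 && s.starts_with('0')) {
            return None;
        }
        s.parse().ok()
    }

    pub fn parse_version(version: &str) -> Result<Version, String> {
        // Collecting into Option yields None if any piece is rejected.
        let numbers: Option<Vec<u64>> = version.split('.').map(numeric_identifier).collect();
        // A match replaces the old unwrap, so bad input becomes an Err
        // instead of a panic.
        match numbers {
            Some(n) if n.len() == 3 => Ok(Version {
                major: n[0],
                minor: n[1],
                patch: n[2],
            }),
            _ => Err(format!("version did not parse properly: {}", version)),
        }
    }
    ```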
    
    *How did you pass any tests if you had the wrong return type at the
    previous step?*
    
    This is hard, we'll get there eventually!
    
    [0]: https://crates.io/crates/lazy_static
  • version.1.summary
    Humble beginnings
    
    Time to write some Rust code!  It's all very basic for now, but we have
    to start somewhere.
    
    src/lib.rs
    ----------
    
    Since this is a library crate, we list our public modules in
    `src/lib.rs`.  For now, that's just our `version` module.  While we're
    at it, let's pull in the `regex` crate we listed as a dependency.  If we
    list it here, we can use it from any future modules, like the range
    parsing one we will make in the next step.
    
    src/version.rs
    --------------
    
    There are three main components here.
    
    On the first line, we bring the `Regex` type into our local scope so
    we can refer to it with a short name.
    
    Next, we introduce a basic `Version` struct.  If you're reading
    carefully, you'll see we don't even have the `pre` and `build` fields
    yet.  Let's just start here and add those in later.  We don't have to
    pass all the tests until we're done!
    
    Note that `Version` and its fields are all marked public.  Unlike some
    other languages, in Rust you must mark each field of a struct public if
    you want to expose it.  Our users will be able to use the struct and all
    its fields.
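
    As a small illustration of the visibility rules, the sketch below puts
    the struct in a module and uses it from outside.  The `u64` field types
    and the `describe` function are assumptions made for the example.

    ```rust
    pub mod version {
        pub struct Version {
            pub major: u64,
            pub minor: u64,
            pub patch: u64,
        }
    }

    // Code outside the module can read these fields only because both the
    // struct and each field are marked `pub`.
    pub fn describe(v: &version::Version) -> String {
        format!("{}.{}.{}", v.major, v.minor, v.patch)
    }
    ```

    Dropping `pub` from any one field would also stop outside code from
    building a `Version` with a struct literal.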
    
    The `parse_version` function uses `Regex` to extract the major, minor, and
    patch levels from a version string.  This isn't robust enough to handle
    all of semver yet, but we have a mostly working module in under 20 lines
    of code.
    
    Are you worried about all those calls to `unwrap()`? You should be! If
    an error occurs, the `unwrap()` will panic and crash our program.  We'll
    have to implement error handling at some point, but for now let's just
    keep things simple.
    
    *But the function is supposed to be `version::parse`, you called it
    `parse_version`!*
    
    Oops, we'll have to fix that in a future step.
  • start
    093f4443 · Base commit
  • test.16
    f6212a69 · Add test