Tags · Scott Abbey / semver-parser

Set up basic project structure and metadata

Welcome to the semver-parser tutorial!

*This tutorial is an example of a "literate git" project.  For an
explanation of the motivations for this project, view the report
[here][3].*

Our goal is to build a Rust library that can parse [semver][0] version
strings.  The maintainer of the [semver crate][1] has decided to split
off the parsing operations to a separate crate, and he assigned us to
implement it.

We'll document both the steps we take to build the crate and the reasons
(or lack thereof) behind our implementation choices.  When someone comes
along later with an idea for improving our program, they can read our
tutorial for more information about the project's structure and history.

There are two main public functions we need to expose to be able to plug
in to the semver crate.  The `version::parse` function transforms a
version string into a struct with individual fields for each part of the
version number.  The `range::parse` function takes a string that
specifies a range of matching versions and returns a struct containing
one or more predicates which can be compared to individual versions.

Lucky for us, there is already a large set of test cases that we can use
to verify our results.  Since the function names and return types are
already mostly taken care of, we can just focus on passing those tests.
Ah, the life of a junior programmer!

Before we start putting together our functions, let's do some basic
setup. We're going to use the [Cargo][2] package manager and build
system to help streamline development, since the semver crate is already
using it.

Check out the initial project structure below, then click the arrow
above to move on to the next step.

.gitignore
----------

Cargo automatically creates a basic .gitignore file for us.  The target
directory is where build artifacts land, so we don't want to be tracking
that.  Because this is a library, we also ignore the Cargo.lock file.
If we were building an executable application, we would not ignore it.

Cargo.toml
----------

The Cargo.toml file contains metadata about our crate.  This includes
links to related resources, dependency lists, license information, and
many other possible fields.  Since we have great foresight, we'll just
fill this all in right now and forget about it.  If we didn't have a
crystal ball, we'd be updating this along the way.

[0]: http://semver.org/
[1]: https://crates.io/crates/semver
[2]: http://doc.crates.io/index.html
[3]: https://www.sabbey.net/litgit

version.summary

11d5c576 · Merge branch 'version.final' into final · Feb 09, 2017

Implement `version::parse`

The `version::parse` function takes a semver version string as input and
returns a `Result` containing either a `Version` struct or an error
string.

The `Version` struct contains a field for each possible component of a
semver version string.  The `major`, `minor`, and `patch` fields are all
required integers.  The `pre` and `build` fields consist of zero or more
alphanumeric or numeric identifiers.
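As a rough sketch of the shape described above, the types might look like this.  The `u64` width and the `String` payload are our assumptions (the text only says "integers" and "identifiers"), and the `derive` line is there just so the values can be compared and printed.

```rust
/// One dot-separated piece of a prerelease or build tag.
#[derive(Debug, PartialEq)]
pub enum Identifier {
    /// Digits only, such as the `1` in `alpha.1`.
    Numeric(u64),
    /// Anything else, such as `alpha`.
    AlphaNumeric(String),
}

/// A parsed semver version string.
#[derive(Debug, PartialEq)]
pub struct Version {
    // The three required integer components.
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
    /// Zero or more prerelease identifiers.
    pub pre: Vec<Identifier>,
    /// Zero or more build identifiers.
    pub build: Vec<Identifier>,
}
```

Under these assumptions, a string like `1.2.3-alpha.1` would end up with two entries in `pre` and an empty `build`.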

This is a bit too much code to explain at once, so click the `+` symbol
on the left for a step-by-step explanation.  Once you're done, click the
arrow at the top right to continue on to the range parsing function.

*What about that test suite you mentioned?*

There are a lot of tests! To keep things clean, we'll show them all in
the final step of the tutorial.

range.1.summary

faad6219 · Merge branch 'range.1.final' into range.final · Feb 10, 2017

Start out with the range-matching regex

This is going to be a big file before we get a working version, so
let's introduce the pieces in smaller chunks to make sure we explain
each part.

src/lib.rs
----------

We need to add the range module to our public interface.

src/range.rs
------------

Our use of `lazy_static!` is similar to the way we used it in version
parsing.  The main differences are that we now look for an operation
before the major version, and each of the major, minor, and patch
versions can be either a number or a wildcard.

There is no matching for build tags, either.  Apparently build tags
are not to be used on range specifications...or are they?

range.summary

73d5cf3f · Merge branch 'range.final' into final · Feb 10, 2017

Implement `range::parse`

The `range::parse` function takes a string containing one or more range
specifications and returns a `VersionReq` with the parsed
representation.

The `VersionReq` contains a vector of `Predicate`.  For a semver version
to match the range, it must match each predicate.  We don't have to do
the matching; that's handled by the semver crate.  Here, we just need to
parse the string into the individual predicates and return the result.

`Predicate` is similar to the `Version` struct we worked with in the
version parsing module.  It contains the major, minor, patch, and pre
fields, but their types are a little different.  Minor and patch
versions are now optional, since a range can omit them.

There is also an `Op` field.  Op stands for operator.  Ranges look like
a version string with an optional operator in front.  The operators are
defined in the [Cargo documentation][0].  One of the major, minor, or
patch versions can also be a wildcard.  If a wildcard is used, the
`Wildcard` operator is chosen.

Let's go through the creation of `range::parse` step-by-step.  Similar
to our tutorial for `version::parse`, we will start with a partial
implementation and fill it out as we move along.  By the end, we will be
passing our entire test suite.

Click the plus sign below to view the step-by-step guide.

*But where are those tests!?*

They're all listed in the final step of this guide.  There are so many
that they clutter up the diffs and make the guide hard to follow.

[0]: http://doc.crates.io/specifying-dependencies.html#specifying-dependencies-from-cratesio

version.7.summary

f4a3d988 · Merge branch 'version.7.final' into version.final · Feb 09, 2017

Finish up, move parse_meta to common.rs

We're almost done! Just some little cleanup things remain.

src/version.rs
--------------

Turns out our tests and the semver package we're writing this for expect
the function to be called `parse` rather than `parse_version`.  We'll
fix that now.

Also, it seems likely that the `parse_meta` function will be useful in the
range parsing module we will be making in the next step. Let's move that
to a new file, common.rs.

Now that `parse_meta` lives in another file, we will `use common` and
call it with `common::parse_meta`.

src/lib.rs
----------

Add in the (private) common module.  We don't need to add this to our
public interface.
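To see the private/public split in one place, here is a minimal sketch that collapses the separate files into inline modules.  The body of `parse_meta` is a simplified stand-in and `pre_parts` is a hypothetical helper; only the module names and the `common::parse_meta` call path come from the tutorial.

```rust
// Private helper module: no `pub`, so it is not part of the crate's
// public interface.
mod common {
    /// Simplified stand-in for the tutorial's `parse_meta`: it only
    /// splits a tag on periods.
    pub fn parse_meta(s: &str) -> Vec<String> {
        s.split('.').map(String::from).collect()
    }
}

// Public module: users of the crate can reach `version::...`.
pub mod version {
    // With inline modules the import path goes through `super`.
    use super::common;

    /// Hypothetical helper that shows the call through `common::`.
    pub fn pre_parts(pre: &str) -> Vec<String> {
        common::parse_meta(pre)
    }
}
```

Code outside the crate could call `version::pre_parts`, but would have no way to name `common::parse_meta` directly.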

src/common.rs
-------------

Let's make a couple changes to `parse_meta` while we're at it.  First,
we will change the argument name from `pre` to `s`, since it is used for
both prerelease and build tags.  We'll also split out the alphanumeric
check to a separate function.  Finally, we need to accept a plain number 0
as a Numeric, so a slight adjustment is needed on the regex here.

version.6.summary

ed57f68c · Merge branch 'version.6.final' into version.final · Feb 09, 2017

Add traits to structs and enums in version.rs

We can make `Identifier` and `Version` more useful by adding some
traits.  Traits act like interfaces; the compiler knows that a type with
a trait implemented on it can perform certain operations.

src/version.rs
--------------

We can add the [Clone][0], [Debug][1], [PartialEq][2], and [Eq][3]
traits by simply attaching a `derive` attribute.  Rust automatically
generates the code for these traits for us.

To add the [Display][4] trait, we need to implement it ourselves.  We
just need to fill out the `fmt` function for each.  The implementation
for `Version` makes use of the [write! macro][5].
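A minimal sketch of both approaches follows.  The field types and the `join` helper are our assumptions; the derived trait list and the use of `write!` follow the description above.

```rust
use std::fmt;

// Clone, Debug, PartialEq, and Eq are generated by the compiler.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Identifier {
    Numeric(u64),
    AlphaNumeric(String),
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
    pub pre: Vec<Identifier>,
    pub build: Vec<Identifier>,
}

// Display has no derive, so we fill in `fmt` by hand.
impl fmt::Display for Identifier {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Identifier::Numeric(n) => write!(f, "{}", n),
            Identifier::AlphaNumeric(s) => write!(f, "{}", s),
        }
    }
}

/// Hypothetical helper: joins identifiers with periods.
fn join(ids: &[Identifier]) -> String {
    ids.iter().map(|id| id.to_string()).collect::<Vec<_>>().join(".")
}

impl fmt::Display for Version {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}.{}.{}", self.major, self.minor, self.patch)?;
        // Only print the tags that are actually present.
        if !self.pre.is_empty() {
            write!(f, "-{}", join(&self.pre))?;
        }
        if !self.build.is_empty() {
            write!(f, "+{}", join(&self.build))?;
        }
        Ok(())
    }
}
```

Implementing `Display` also gives us `to_string()` for free, which is what the `join` helper relies on.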

[0]: https://doc.rust-lang.org/core/clone/trait.Clone.html
[1]: https://doc.rust-lang.org/core/fmt/trait.Debug.html
[2]: https://doc.rust-lang.org/core/cmp/trait.PartialEq.html
[3]: https://doc.rust-lang.org/core/cmp/trait.Eq.html
[4]: https://doc.rust-lang.org/std/fmt/trait.Display.html
[5]: https://doc.rust-lang.org/std/fmt/#write

test.summary

8d7c0644 · Merge branch 'test.final' into final · Feb 10, 2017

Add tests

Here they are! All the tests.

We're done implementing everything.  All the tests pass.  We made a Rust
crate!

range.7.summary

b8493512 · Merge branch 'range.7.final' into range.final · Feb 10, 2017

Add build tag to regex

As hinted earlier, apparently we do need to allow for a build string in
the range specification.

src/range.rs
------------

If there was a build string in the predicate, we'll at least accept it.
So far we aren't doing anything with this capture group, but it's there.

range.6.summary

2dfeb0d1 · Merge branch 'range.6.final' into range.final · Feb 10, 2017

Add support for multiple predicates

We're supposed to be able to handle multiple predicates, but so far our
code only works with one.

src/range.rs
------------

Let's split the parse function up.  Most of what we already had will be
called `parse_predicate`.  It works on one predicate at a time.

We'll use `parse` to split apart the string on commas and send each
portion to `parse_predicate`. A quick bit of error checking verifies
that we have at least one predicate, then we can return our
`VersionReq`.
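The split can be sketched roughly as follows.  The `Predicate` here is a placeholder that only keeps the raw text, the field names and error message are invented, and skipping blank pieces is our assumption rather than something the tutorial states.

```rust
#[derive(Debug, PartialEq)]
pub struct Predicate {
    /// Placeholder: the raw text of one range, such as ">=1.0.0".
    pub raw: String,
}

#[derive(Debug, PartialEq)]
pub struct VersionReq {
    pub predicates: Vec<Predicate>,
}

/// Stand-in for the real single-predicate parser.  The real one can
/// fail; this one never does.
fn parse_predicate(range: &str) -> Result<Predicate, String> {
    Ok(Predicate { raw: range.to_string() })
}

pub fn parse(ranges: &str) -> Result<VersionReq, String> {
    let mut predicates = Vec::new();

    // Split on commas and hand each piece to `parse_predicate`.
    for piece in ranges.split(',') {
        let piece = piece.trim();
        if piece.is_empty() {
            continue; // assumption: ignore blank pieces
        }
        // `?` returns early if any single predicate fails to parse.
        predicates.push(parse_predicate(piece)?);
    }

    // Error checking: we need at least one predicate.
    if predicates.is_empty() {
        return Err(String::from("no predicates found"));
    }

    Ok(VersionReq { predicates })
}
```

With this sketch, `parse(">=1.0.0, <2.0.0")` yields two predicates, while an input with nothing but commas and spaces is rejected.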

range.5.summary

0565d1ce · Merge branch 'range.5.final' into range.final · Feb 10, 2017

Add major, minor, patch, and pre support

So much for splitting this into multiple steps! Turns out, there is a
lot of repetition going on here.  We can describe what's going on in a
lot less space than the large diff takes up.

src/range.rs
------------

First, we run the regex on our string and get the `captures` variable.
Then we get a variable out of each field we captured.  Finally, we build
a `Predicate` with them and return it.

In gathering the operation, the `map(str::parse)` call runs the
`FromStr` code we implemented in a previous step.  If no operation is
listed, we use the default, `Op::Compatible`.
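The "parse it if present, otherwise use a default" pattern can be sketched with a small generic helper.  `parse_or_default` is a hypothetical name, and it takes a plain `Option<&str>` in place of a regex capture so that it runs without the `regex` crate.  In the tutorial's code the type would be `Op` and the default `Op::Compatible`.

```rust
use std::str::FromStr;

/// Parses `capture` when it is present; otherwise returns `default`.
fn parse_or_default<T: FromStr>(capture: Option<&str>, default: T) -> Result<T, T::Err> {
    // `str::parse` calls `T::from_str` for us.
    match capture.map(str::parse::<T>) {
        // Something was captured: pass along Ok(value) or the parse error.
        Some(result) => result,
        // Nothing was captured: fall back to the default.
        None => Ok(default),
    }
}
```

Any type that implements `FromStr` works here, which is why implementing that trait for `Op` pays off.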

Parsing the major version number might result in an error such as
integer overflow, so we need to be a little careful with a match instead
of just unwrapping.

The minor and patch fields go through very similar steps.  They are both
optional and could also encounter errors, so multiple layers of match
are used.  The innermost match checks if the version was a wildcard. If
it was, we override the `Op` that was set, changing it to the `Wildcard`
operator with the applicable `WildcardVersion`.
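Here is a minimal sketch of those layered matches, again using `Option<&str>` in place of regex captures.  The function names, the error messages, the trimmed-down `Op` enum, and the wildcard spellings `*`, `x`, and `X` are all assumptions made for illustration.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum WildcardVersion {
    Major,
    Minor,
    Patch,
}

// Trimmed down: the other operators are left out of this sketch.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Op {
    Compatible,
    Wildcard(WildcardVersion),
}

/// Required component: any parse failure, including integer overflow,
/// becomes an error string instead of a panic.
fn parse_major(text: &str) -> Result<u64, String> {
    match text.parse::<u64>() {
        Ok(number) => Ok(number),
        Err(err) => Err(format!("Error parsing major version number: {}", err)),
    }
}

/// Optional component: absent, a number, or a wildcard.
fn parse_optional(
    capture: Option<&str>,
    position: WildcardVersion,
    op: &mut Op,
) -> Result<Option<u64>, String> {
    match capture {
        // The range omitted this component.
        None => Ok(None),
        Some(text) => match text.parse::<u64>() {
            Ok(number) => Ok(Some(number)),
            // Innermost match: was it a wildcard (assumed spellings)?
            Err(err) => match text {
                "*" | "x" | "X" => {
                    // Override whatever operator was set earlier.
                    *op = Op::Wildcard(position);
                    Ok(None)
                }
                _ => Err(format!("Error parsing version component: {}", err)),
            },
        },
    }
}
```

Passing the operator as `&mut Op` is just one way to express the override; the same effect could be had by returning the new operator alongside the number.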

range.4.summary

0ff09124 · Merge branch 'range.4.final' into range.final · Feb 10, 2017

Add the parsing function, with just major wildcard support

Finally, it's here! The function that does the real work.  We'll put
this together in a couple steps to keep things simple.

src/range.rs
------------

So, we parse the range string into a `Result<VersionReq, String>`.  This
is similar to the signature of `version::parse`.

If the string was null, we'll just respond with an error.

The only kind of version we'll accept for now is a wildcard in the major
version number.  This can be represented in four different ways.  We
just create a single-element vector with a `Predicate` inside and return
it.

range.3.summary

e05191ea · Merge branch 'range.3.final' into range.final · Feb 10, 2017

Add traits for range parsing

Implementing traits will give our new structs and enums some extra
behavior.  Most of them come along automatically!

src/range.rs
------------

We can derive `Debug` and `PartialEq` with just the `derive` attribute.

The [FromStr][0] trait lets us turn a string into a value of the type.
So when we find an equal sign, we can automatically turn it into the
`Op::Ex` operator.
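A minimal sketch of such an implementation, limited to two operators.  `Op::Ex` for `=` comes from the text above; mapping `^` to `Op::Compatible` and the wording of the error message are our assumptions.

```rust
use std::str::FromStr;

#[derive(Debug, PartialEq)]
pub enum Op {
    Ex,         // exact match, written "="
    Compatible, // assumed to be written "^"
    // The remaining operators are omitted from this sketch.
}

impl FromStr for Op {
    // On failure we return an error string, as elsewhere in the crate.
    type Err = String;

    fn from_str(s: &str) -> Result<Op, String> {
        match s {
            "=" => Ok(Op::Ex),
            "^" => Ok(Op::Compatible),
            _ => Err(format!("Could not parse Op: {}", s)),
        }
    }
}
```

Once the trait is in place, `"=".parse::<Op>()` works just like parsing an integer from a string.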

[0]: https://doc.rust-lang.org/std/str/trait.FromStr.html

range.2.summary

542cf691 · Merge branch 'range.2.final' into range.final · Feb 10, 2017

Add enums and structs for range parsing

Let's define all the nifty types we'll need!

src/range.rs
------------

We've described some of these back in the introduction to range parsing,
but a little more won't hurt.

`VersionReq` contains one or more predicates.  This is what we need to
supply to the semver crate.

`Predicate` is similar to the `Version` struct we used in the version
parsing module, but the types are slightly different. We have added the
Op field and the minor and patch versions are now optional.

`Op` stands for operator.  The operator defines how to match a specific
version to this predicate.

`WildcardVersion` is used for the `Wildcard` operator.  There might be a
wildcard at the major, minor, or patch version.  Since those fields are
numeric types, we don't assign `Wildcard` to the field directly; instead,
we list it as the operator.

We'll also pull in `version::Identifier`, for use with the prerelease
strings.  I wonder if that should have been `common::Identifier`. Hmm.
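Put together, the types described above might look roughly like this.  The field names, the integer width, and the `WildcardVersion` variant names are assumptions; `Identifier` is repeated locally so the sketch stands alone, and only the operators named in this tutorial are listed.

```rust
#[derive(Debug, PartialEq)]
pub enum Identifier {
    Numeric(u64),
    AlphaNumeric(String),
}

/// Which component of the version held the wildcard.
#[derive(Debug, PartialEq)]
pub enum WildcardVersion {
    Major,
    Minor,
    Patch,
}

/// How a version is compared against a predicate.
#[derive(Debug, PartialEq)]
pub enum Op {
    Ex,
    Compatible,
    Wildcard(WildcardVersion),
    // Other comparison operators are omitted from this sketch.
}

#[derive(Debug, PartialEq)]
pub struct Predicate {
    pub op: Op,
    pub major: u64,
    // A range may omit these, so they are optional.
    pub minor: Option<u64>,
    pub patch: Option<u64>,
    pub pre: Vec<Identifier>,
}

/// What we hand back to the semver crate.
#[derive(Debug, PartialEq)]
pub struct VersionReq {
    pub predicates: Vec<Predicate>,
}
```

A range such as `1.*` would then be a single predicate with `major` set to 1, no minor or patch, and the operator `Op::Wildcard(WildcardVersion::Minor)`.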

version.5.summary

7e2655c1 · Merge branch 'version.5.final' into version.final · Feb 09, 2017

Properly parse prerelease and build tags

Time to fix the types on our prerelease and build tags.  Instead of just
reading the whole thing into a string, we will break them apart on
periods into either alphanumeric or numeric identifiers.

src/version.rs
--------------

Semver's prerelease and build tag [specification][0] describes tags as
either alphanumeric or numeric.  Instead of using the type
`Option<Vec<String>>` for the `pre` and `build` fields, let's create the
`Identifier` enum.  Identifiers may be either `AlphaNumeric` or
`Numeric`.

Note that we removed the `Option` around the type of `pre` and `build`.
Our tests expect just `Vec<Identifier>`, so we will return an empty
vector if nothing is present, rather than a `None`.  The
`unwrap_or(vec![])` on the lines assigning `pre` and `build` assigns
them empty vectors when no match was found.

Since we have to parse out the metadata twice, we'll use the
`parse_meta` function.  The operation is fairly straightforward: just
split on periods.  If all the characters in a piece are digits, call it
a Numeric; otherwise, call it AlphaNumeric.
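A std-only sketch of that idea is shown below.  The `u64` payload, the `meta_or_empty` helper, and the choice to treat an overflowing digit string as `AlphaNumeric` are assumptions; the tutorial's own version works on regex captures.

```rust
#[derive(Debug, PartialEq)]
pub enum Identifier {
    Numeric(u64),
    AlphaNumeric(String),
}

/// Splits a tag on periods and classifies each piece.
pub fn parse_meta(s: &str) -> Vec<Identifier> {
    s.split('.')
        .map(|part| {
            // Check the digits ourselves: `parse::<u64>` alone would
            // also accept a leading plus sign.
            let all_digits = !part.is_empty() && part.chars().all(|c| c.is_ascii_digit());
            match part.parse::<u64>() {
                Ok(number) if all_digits => Identifier::Numeric(number),
                _ => Identifier::AlphaNumeric(part.to_string()),
            }
        })
        .collect()
}

/// Hypothetical helper: an absent tag becomes an empty vector rather
/// than `None`.
pub fn meta_or_empty(capture: Option<&str>) -> Vec<Identifier> {
    capture.map(parse_meta).unwrap_or(vec![])
}
```

Because the same function handles both tags, `pre` and `build` end up with identical types.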

[0]: http://semver.org/#spec-item-9

version.4.summary

928f58e4 · Merge branch 'version.4.final' into version.final · Feb 09, 2017

Add support for build tags

This should look familiar.

The [build tag][0] is denoted by a plus sign followed by a series of
dot-separated identifiers.  We'll use the exact same pattern we followed
to add prerelease tag support.
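To make the layout of the string concrete, here is a small sketch that pulls the two tags apart with plain string splitting.  The tutorial does this with regex capture groups; `split_tags` is a hypothetical helper, and `split_once` needs Rust 1.52 or newer.

```rust
/// Returns (core version, prerelease tag, build tag).
pub fn split_tags(version: &str) -> (&str, Option<&str>, Option<&str>) {
    // The build tag starts at the first plus sign.
    let (rest, build) = match version.split_once('+') {
        Some((rest, build)) => (rest, Some(build)),
        None => (version, None),
    };
    // The prerelease tag starts at the first hyphen of what remains,
    // so a hyphen inside the build tag is left alone.
    let (core, pre) = match rest.split_once('-') {
        Some((core, pre)) => (core, Some(pre)),
        None => (rest, None),
    };
    (core, pre, build)
}
```

Splitting off the build tag first matters: build identifiers may contain hyphens, and those must not be mistaken for the start of a prerelease tag.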

[0]: http://semver.org/#spec-item-10

version.3.summary

6460af5c · Merge branch 'version.3.final' into version.final · Feb 09, 2017

Add support for prerelease tags

Semver allows [prerelease tags][0], denoted by a hyphen and a series of
dot-separated identifiers appended immediately after the patch version.
Let's add support for them!  Or at least partial support.  This might
take a couple steps.

src/version.rs
--------------

The allowed characters in the prerelease label are alphanumerics,
hyphens, and periods.  We'll call that `letters_numbers_dash_dot` and
tack on an optional prerelease version in our regex.  Since that line
was getting a bit long, we'll add *x mode*, which allows us to describe
the regex over multiple lines, with comments for each important bit.

We also have to add the `pre` field to the `Version` struct and capture
it from the regex.  For now, we're just reading the whole string into an
optional single-element vector.  Later on, we will have to break it
apart into individual dot-separated identifiers.
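As a std-only illustration of this intermediate step, the sketch below checks the allowed characters by hand and wraps the whole tag in a single-element vector.  Both function names are hypothetical; in the tutorial the character class lives inside the regex.

```rust
/// True when `s` is non-empty and contains only ASCII letters, digits,
/// hyphens, and periods.
fn is_letters_numbers_dash_dot(s: &str) -> bool {
    !s.is_empty()
        && s.chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '.')
}

/// For now the whole tag is kept as one string: `None` when the tag is
/// absent, otherwise a single-element vector.
fn capture_pre(capture: Option<&str>) -> Option<Vec<String>> {
    capture.map(|pre| vec![pre.to_string()])
}
```

So `alpha.1` is stored as one element for now, not yet as the two identifiers it really contains.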

[0]: http://semver.org/#spec-item-9

version.2.summary

c4b40838 · Merge branch 'version.2.final' into version.final · Feb 09, 2017

Disallow leading zeros and return Result

When we started applying our test suite to the result of the previous
step, we found an error.  Time to fix it!

src/lib.rs
----------

Remember the `lazy_static` dependency we listed in `Cargo.toml`?  The
[lazy static][0] crate gives us a macro that allows us to declare
statics that are initialized lazily at runtime, the first time they are
used.

*What?*

A static in Rust is like a global constant.  Statics live for the
entire lifetime of your program at a fixed memory location.  Normally
they are limited to simple values, but with the `lazy_static!` macro, we
can create statics that require function calls or heap allocations.
We'll use it to make a static `Regex` that matches semver version strings.
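To show the idea without pulling in the crate, here is a sketch that uses the standard library's `OnceLock` (Rust 1.70 or newer) as a stand-in for `lazy_static!`.  It is not the macro the tutorial uses, and the pattern text is an assumed example; the point is only that the `String` is built by a function call on first use and then shared.

```rust
use std::sync::OnceLock;

/// Returns a pattern string that is built once, on first use.
fn version_pattern() -> &'static str {
    // A plain `static` could not hold the result of `format!`, since
    // that needs a heap allocation at runtime.
    static PATTERN: OnceLock<String> = OnceLock::new();

    PATTERN.get_or_init(|| {
        // Assumed pattern text, for illustration only.
        let numeric_identifier = "0|[1-9][0-9]*";
        format!(r"^({0})\.({0})\.({0})$", numeric_identifier)
    })
}
```

Every call after the first returns a reference to the same `String`, so the setup cost is paid only once.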

src/version.rs
--------------

There are two separate things happening at once here.  We made some
upgrades to our regex, then changed our return type.

First, we moved the regex creation up out of `parse_version` into the
`lazy_static!` macro.  The result of the macro is the static called
`REGEX` which we can use to match version strings.

The regex declaration went from one line to over a dozen, and we
switched from matching on `\d+` to matching on the more sophisticated
`numeric_identifier` pattern.  This pattern allows a plain zero or a
number starting with any other digit, but not a number with a leading
zero.
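The rule is easy to state as a plain function.  The sketch below is a hand-written equivalent of a pattern like `0|[1-9][0-9]*`; that regex text is our assumption, since the tutorial's exact pattern is not reproduced here.

```rust
/// True for "0" and for digit strings that do not start with a zero.
fn is_numeric_identifier(s: &str) -> bool {
    let all_digits = !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit());
    // Either exactly "0", or digits whose first character is not '0'.
    all_digits && (s == "0" || !s.starts_with('0'))
}
```

So `0` and `10` pass, while `01` is rejected because of its leading zero.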

Back in `parse_version`, we have changed from an `unwrap()` on
`REGEX.captures()` to a match.  The match lets us return an error string
when the supplied version number is not matched by the regex.  In the
version of our program from the previous step, any error in matching the
regex would have caused a panic.

Since we can now return either a `Version` on success or a `String` on
an error, we have changed the return type from `Version` to
`Result<Version, String>`.  When our result is an error, we `return
Err(...)`, and when the result is a Version, we `return Ok(Version...)`.
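The change in return type can be sketched without the regex, using a hand-rolled split as a stand-in for the match on `REGEX.captures()`.  The `component` helper and the error message are invented for illustration, and this sketch ignores the leading-zero rule.

```rust
#[derive(Debug, PartialEq)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
}

/// Hypothetical helper: a digits-only piece that fits in a u64.
fn component(text: &str) -> Option<u64> {
    if !text.is_empty() && text.bytes().all(|b| b.is_ascii_digit()) {
        text.parse().ok()
    } else {
        None
    }
}

pub fn parse_version(version: &str) -> Result<Version, String> {
    let mut parts = version.split('.');
    // Three numeric pieces, then nothing else.
    let pieces = (
        parts.next().and_then(component),
        parts.next().and_then(component),
        parts.next().and_then(component),
        parts.next(),
    );

    // A match instead of `unwrap()`: bad input becomes an `Err`, not a
    // panic.
    match pieces {
        (Some(major), Some(minor), Some(patch), None) => Ok(Version {
            major,
            minor,
            patch,
        }),
        _ => Err(format!("Version did not parse properly: {}", version)),
    }
}
```

Callers now decide what to do with a bad version string, where before the library would have crashed the program for them.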

*How did you pass any tests if you had the wrong return type at the
previous step?*

This is hard; we'll get there eventually!

[0]: https://crates.io/crates/lazy_static

version.1.summary

a108f6d4 · Merge branch 'version.1.final' into version.final · Feb 09, 2017

Humble beginnings

Time to write some Rust code!  It's all very basic for now, but we have
to start somewhere.

src/lib.rs
----------

Since this is a library crate, we list our public modules in
`src/lib.rs`.  For now, that's just our `version` module.  While we're
at it, let's pull in the `regex` crate we listed as a dependency.  If we
list it here, we can use it from any future modules, like the range
parsing one we will make in the next step.

src/version.rs
--------------

There are three main components here.

On the first line, we bring the `Regex` type into our local scope so
we can refer to it with a short name.

Next, we introduce a basic `Version` struct.  If you're reading
carefully, you'll see we don't even have the `pre` and `build` fields
yet.  Let's just start here and add those in later.  We don't have to
pass all the tests until we're done!

Note that `Version` and its fields are all marked public.  Unlike some
other languages, in Rust you must mark each field of a struct public if
you want to expose it.  Our users will be able to use the struct and all
its fields.

The `parse_version` function uses `Regex` to extract the major, minor, and
patch levels from a version string.  This isn't robust enough to handle
all of semver yet, but we have a mostly working module in under 20 lines
of code.

Are you worried about all those calls to `unwrap()`? You should be! If
an error occurs, the `unwrap()` will panic and crash our program.  We'll
have to implement error handling at some point, but for now let's just
keep things simple.

*But the function is supposed to be `version::parse`, you called it
`parse_version`!*

Oops, we'll have to fix that in a future step.

start

093f4443 · Base commit · Feb 09, 2017

test.16

f6212a69 · Add test · Feb 09, 2017