-
misc.summary
Set up basic project structure and metadata Welcome to the semver-parser tutorial! *This tutorial is an example of a "literate git" project. For an explanation of the motivations for this project, view the report [here][3].* Our goal is to build a Rust library that can parse [semver][0] version strings. The maintainer of the [semver crate][1] has decided to split off the parsing operations to a separate crate, and he assigned us to implement it. We'll document both the steps we take to build the crate and the reasons (or lack thereof) behind our implementation choices. When someone comes along later with an idea for improving our program, they can read our tutorial for more information about the project's structure and history. There are two main public functions we need to expose to be able to plug in to the semver crate. The `version::parse` function transforms a version string into a struct with individual fields for each part of the version number. The `range::parse` function takes a string that specifies a range of matching versions and returns a struct containing one or more predicates which can be compared to individual versions. Lucky for us, there is already a large set of test cases that we can use to verify our results. Since the function names and return types are already mostly taken care of, we can just focus on passing those tests. Ah, the life of a junior programmer! Before we start putting together our functions, let's do some basic setup. We're going to use the [Cargo][2] package manager and build system to help streamline development, since the semver crate is already using it. Check out the initial project structure below, then click the arrow above to move on to the next step. .gitignore ---------- Cargo automatically creates a basic .gitignore file for us. The target directory is where build artifacts land, so we don't want to be tracking that. Because this is a library, we also ignore the Cargo.lock file. 
If we were building an executable application, we would not ignore it. Cargo.toml ---------- The Cargo.toml file contains metadata about our crate. This includes links to related resources, dependency lists, license information, and many other possible fields. Since we have great foresight, we'll just fill this all in right now and forget about it. If we didn't have a crystal ball, we'd be updating this along the way. [0]: http://semver.org/ [1]: https://crates.io/crates/semver [2]: http://doc.crates.io/index.html [3]: https://www.sabbey.net/litgit
-
version.summary
Implement `version::parse` The `version::parse` function takes a semver version string as input and returns a `Result` containing either a `Version` struct or an error string. The `Version` struct contains a field for each possible component of a semver version string. The `major`, `minor`, and `patch` fields are all required integers. The `pre` and `build` fields consist of zero or more alphanumeric or numeric identifiers. This is a bit too much code to explain at once, so click the `+` symbol on the left for a step-by-step explanation. Once you're done, click the arrow at the top right to continue on to the range parsing function. *What about that test suite you mentioned?* There are a lot of tests! To keep things clean, we'll show them all in the final step of the tutorial.
-
range.1.summary
Start out with the range-matching regex This is going to be a big file. Before we get a working version, let's introduce the pieces in smaller chunks so we can explain each part. src/lib.rs ---------- We need to add the range module to our public interface. src/range.rs ------------ Our use of `lazy_static!` is similar to the way we used it in version parsing. The main differences are that we now look for an operation before the major version, and each of the major, minor, and patch versions can be either a number or a wildcard. There is no matching for build tags, either. Apparently build tags are not to be used on range specifications...or are they?
-
range.summary
Implement `range::parse` The `range::parse` function takes a string containing one or more range specifications and returns a `VersionReq` with the parsed representation. The `VersionReq` contains a vector of `Predicate`. For a semver version to match the range, it must match each predicate. We don't have to do the matching; that's handled by the semver crate. Here, we just need to parse the string into the individual predicates and return the result. `Predicate` is similar to the `Version` struct we worked with in the version parsing module. It contains the major, minor, patch, and pre fields, but their types are a little different. Minor and patch versions are now optional, since a range can omit them. There is also an `Op` field. `Op` stands for operator. Ranges look like a version string with an optional operator in front. The operators are defined in the [Cargo documentation][0]. One of the major, minor, or patch versions can also be a wildcard. If a wildcard is used, the `Wildcard` operator is chosen. Let's go through the creation of `range::parse` step-by-step. Similar to our tutorial for `version::parse`, we will start with a partial implementation and fill it out as we move along. By the end, we will be passing our entire test suite. Click the plus sign below to view the step-by-step guide. *But where are those tests!?* They're all listed in the final step of this guide. There are so many that they clutter up the diffs and make the guide hard to follow. [0]: http://doc.crates.io/specifying-dependencies.html#specifying-dependencies-from-cratesio
-
version.7.summary
Finish up, move parse_meta to common.rs We're almost done! Just some little cleanup things remain. src/version.rs -------------- Turns out our tests and the semver package we're writing this for expect the function to be called `parse` rather than `parse_version`. We'll fix that now. Also, it seems likely that the `parse_meta` function will be useful in the range parsing module we will be making in the next step. Let's move that to a new file, common.rs. Now that `parse_meta` lives in another file, we will `use common` and call it with `common::parse_meta`. src/lib.rs ---------- Add in the (private) common module. We don't need to add this to our public interface. src/common.rs ------------- Let's make a couple changes to `parse_meta` while we're at it. First, we will change the argument name from `pre` to `s`, since it is used for both prerelease and build tags. We'll also split out the alphanumeric check to a separate function. Finally, we need to accept a plain number 0 as a Numeric, so a slight adjustment is needed on the regex here.
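The module layout described above can be sketched in a single file with inline modules. This is only an illustration under stated assumptions: the helper names `parse_tag` and `is_alpha_numeric` are hypothetical, the `u64` payload is assumed, and plain digit checks stand in for the regex so the snippet compiles without external crates.

```rust
pub mod version {
    use super::common;

    #[derive(Debug, PartialEq)]
    pub enum Identifier {
        Numeric(u64), // u64 payload is an assumption
        AlphaNumeric(String),
    }

    /// Hypothetical helper showing the call site after the move.
    pub fn parse_tag(tag: &str) -> Vec<Identifier> {
        common::parse_meta(tag)
    }
}

// Private module: not part of the crate's public interface.
mod common {
    use super::version::Identifier;

    /// Hypothetical name for the split-out check: true unless the piece is all digits.
    pub fn is_alpha_numeric(s: &str) -> bool {
        s.is_empty() || !s.bytes().all(|b| b.is_ascii_digit())
    }

    /// `s` may be either a prerelease tag or a build tag.
    pub fn parse_meta(s: &str) -> Vec<Identifier> {
        s.split('.')
            .map(|part| {
                if is_alpha_numeric(part) {
                    Identifier::AlphaNumeric(part.to_string())
                } else {
                    // A plain `0` lands here and becomes `Numeric(0)`.
                    // Digit runs too large for u64 fall back to AlphaNumeric.
                    part.parse()
                        .map(Identifier::Numeric)
                        .unwrap_or_else(|_| Identifier::AlphaNumeric(part.to_string()))
                }
            })
            .collect()
    }
}
```

Because `common` is declared without `pub`, code outside the crate can reach `parse_meta` only indirectly, through functions in `version`.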
-
version.6.summary
Add traits to structs and enums in version.rs We can make `Identifier` and `Version` more useful by adding some traits. Traits act like interfaces; the compiler knows that a type with a trait implemented on it can perform certain operations. src/version.rs -------------- We can add the [Clone][0], [Debug][1], [PartialEq][2], and [Eq][3] traits by simply attaching a `derive` attribute. Rust automatically generates the code for these traits for us. To add the [Display][4] trait, we need to implement it ourselves. We just need to fill out the `fmt` function for each. The implementation for `Version` makes use of the [write! macro][5]. [0]: https://doc.rust-lang.org/core/clone/trait.Clone.html [1]: https://doc.rust-lang.org/core/fmt/trait.Debug.html [2]: https://doc.rust-lang.org/core/cmp/trait.PartialEq.html [3]: https://doc.rust-lang.org/core/cmp/trait.Eq.html [4]: https://doc.rust-lang.org/std/fmt/trait.Display.html [5]: https://doc.rust-lang.org/std/fmt/#write
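As a rough sketch of what the derived and hand-written traits might look like together, the snippet below derives `Clone`, `Debug`, `PartialEq`, and `Eq`, then implements `Display` by hand. The field types, the `join` helper, and the exact output format (`major.minor.patch-pre+build`) are assumptions made for this example.

```rust
use std::fmt;

#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Identifier {
    Numeric(u64),
    AlphaNumeric(String),
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
    pub pre: Vec<Identifier>,
    pub build: Vec<Identifier>,
}

impl fmt::Display for Identifier {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            Identifier::Numeric(n) => write!(f, "{}", n),
            Identifier::AlphaNumeric(s) => write!(f, "{}", s),
        }
    }
}

/// Hypothetical helper: join identifiers with periods, e.g. `alpha.1`.
fn join(ids: &[Identifier]) -> String {
    ids.iter()
        .map(|id| id.to_string())
        .collect::<Vec<_>>()
        .join(".")
}

impl fmt::Display for Version {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{}.{}.{}", self.major, self.minor, self.patch)?;
        if !self.pre.is_empty() {
            write!(f, "-{}", join(&self.pre))?;
        }
        if !self.build.is_empty() {
            write!(f, "+{}", join(&self.build))?;
        }
        Ok(())
    }
}
```

Implementing `Display` also provides `to_string()` for free, which is why `join` can call it on each identifier.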
-
test.summary
Add tests Here they are! All the tests. We're done implementing everything. All the tests pass. We made a Rust crate!
-
range.7.summary
Add build tag to regex As hinted earlier, apparently we do need to allow for a build string in the range specification. src/range.rs ------------ If there was a build string in the predicate, we'll at least accept it. So far we aren't doing anything with this capture group, but it's there.
-
range.6.summary
Add support for multiple predicates We're supposed to be able to handle multiple predicates, but so far our code only works with one. src/range.rs ------------ Let's split the parse function up. Most of what we already had will be called `parse_predicate`. It works on one predicate at a time. We'll use `parse` to split apart the string on commas and send each portion to `parse_predicate`. A quick bit of error checking verifies that we have at least one predicate, then we can return our `VersionReq`.
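The split between `parse` and `parse_predicate` could look roughly like the sketch below. The `Predicate` here is a placeholder that only stores the trimmed text, and the error messages are invented for illustration; the real `parse_predicate` would run the regex instead.

```rust
/// Placeholder predicate: it just keeps the trimmed text of one range.
#[derive(Debug, PartialEq)]
pub struct Predicate {
    pub text: String,
}

#[derive(Debug, PartialEq)]
pub struct VersionReq {
    pub predicates: Vec<Predicate>,
}

/// Stand-in for the single-predicate parser.
fn parse_predicate(range: &str) -> Result<Predicate, String> {
    let trimmed = range.trim();
    if trimmed.is_empty() {
        return Err(String::from("Empty predicate"));
    }
    Ok(Predicate { text: trimmed.to_string() })
}

/// Split on commas and hand each piece to `parse_predicate`.
pub fn parse(ranges: &str) -> Result<VersionReq, String> {
    let mut predicates = Vec::new();
    for range in ranges.split(',') {
        // The first failing predicate aborts the whole parse.
        predicates.push(parse_predicate(range)?);
    }
    // `split` always yields at least one piece, so in this sketch the
    // check below is purely defensive.
    if predicates.is_empty() {
        return Err(String::from("VersionReq did not parse properly"));
    }
    Ok(VersionReq { predicates })
}
```

With this shape, a string such as `>=1.2.3, <2.0.0` produces two predicates, while a stray double comma is rejected by the placeholder parser.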
-
range.5.summary
Add major, minor, patch, and pre support So much for splitting this into multiple steps! Turns out, there is a lot of repetition going on here. We can describe what's going on in a lot less space than the large diff takes up. src/range.rs ------------ First, we run the regex on our string and get the `captures` variable. Then we get a variable out of each field we captured. Finally, we build a `Predicate` with them and return it. In gathering the operation, the `map(str::parse)` call runs the `FromStr` code we implemented in a previous step. If no operation is listed, we use the default, `Op::Compatible`. Parsing the major version number might result in an error such as integer overflow, so we need to be a little careful with a match instead of just unwrapping. The minor and patch fields go through very similar steps. They are both optional and could also encounter errors, so multiple layers of match are used. The innermost match checks if the version was a wildcard. If it was, we override the `Op` that was set, instead changing it to the applicable `WildcardVersion`.
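To make the layered matching for an optional field more concrete, here is a small sketch for the minor version only. The helper `parse_minor`, the wildcard spellings (`*`, `x`, `X`), and the trimmed-down `Op` enum are assumptions for illustration, not the crate's actual code.

```rust
#[derive(Debug, PartialEq)]
pub enum WildcardVersion {
    Minor,
    Patch,
}

#[derive(Debug, PartialEq)]
pub enum Op {
    Compatible, // the default when no operator is written
    Wildcard(WildcardVersion),
}

/// Hypothetical helper: interpret the optional minor capture.
/// Returns the parsed number (if any) and the possibly overridden operator.
pub fn parse_minor(capture: Option<&str>, op: Op) -> Result<(Option<u64>, Op), String> {
    match capture {
        // The field was omitted entirely, so keep the operator as it was.
        None => Ok((None, op)),
        Some(text) => match text {
            // A wildcard replaces whatever operator was set earlier.
            "*" | "x" | "X" => Ok((None, Op::Wildcard(WildcardVersion::Minor))),
            digits => match digits.parse::<u64>() {
                Ok(n) => Ok((Some(n), op)),
                // For example, integer overflow on a very long number.
                Err(e) => Err(format!("Error parsing minor version number: {}", e)),
            },
        },
    }
}
```

The patch field would follow the same pattern, substituting `WildcardVersion::Patch` in the wildcard arm.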
-
range.4.summary
Add the parsing function, with just major wildcard support Finally, it's here! The function that does the real work. We'll put this together in a couple steps to keep things simple. src/range.rs ------------ So, we parse the range string into a `Result<VersionReq, String>`. This is similar to the signature of `version::parse`. If the string was null, we'll just respond with an error. The only kind of version we'll accept for now is a wildcard in the major version number. This can be represented in four different ways. We just create a single-element vector with a `Predicate` inside and return it.
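A minimal sketch of this first version of the function follows. Several details are assumptions: "null" is read as a lone NUL character, the four wildcard spellings are taken to be the empty string, `*`, `x`, and `X`, the major number of a wildcard predicate is set to 0, and the `pre` field and the other operators are omitted for brevity.

```rust
#[derive(Debug, PartialEq)]
pub enum WildcardVersion {
    Major,
    Minor,
    Patch,
}

#[derive(Debug, PartialEq)]
pub enum Op {
    Wildcard(WildcardVersion), // other operators omitted in this sketch
}

#[derive(Debug, PartialEq)]
pub struct Predicate {
    pub op: Op,
    pub major: u64,
    pub minor: Option<u64>,
    pub patch: Option<u64>,
}

#[derive(Debug, PartialEq)]
pub struct VersionReq {
    pub predicates: Vec<Predicate>,
}

pub fn parse(ranges: &str) -> Result<VersionReq, String> {
    // Assumption: "null" means a string holding only the NUL character.
    if ranges == "\0" {
        return Err(String::from("Null is not a valid VersionReq"));
    }
    match ranges {
        // Assumption: these are the four spellings of a major wildcard.
        "" | "*" | "x" | "X" => Ok(VersionReq {
            predicates: vec![Predicate {
                op: Op::Wildcard(WildcardVersion::Major),
                major: 0, // placeholder value; the wildcard makes it irrelevant
                minor: None,
                patch: None,
            }],
        }),
        // Everything else is deferred to later steps.
        other => Err(format!("Range not supported yet: {}", other)),
    }
}
```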
-
range.3.summary
Add traits for range parsing Implementing traits will give our new structs and enums some extra behavior. Most of them come along automatically! src/range.rs ------------ We can derive Debug and PartialEq with just the derive annotation. The [FromStr][0] trait lets us turn a string into a value of the type. So when we find an equal sign, we can automatically turn it into the `Op::Ex` operator. [0]: https://doc.rust-lang.org/std/str/trait.FromStr.html
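For illustration, a `FromStr` implementation for the operator enum might look like the sketch below. Only `Op::Ex` and `Op::Compatible` are named in this tutorial's text; the remaining variant names, the operator spellings, and the error message are assumptions, and the `Wildcard` variant is left out to keep the example short.

```rust
use std::str::FromStr;

#[derive(Debug, PartialEq)]
pub enum Op {
    Ex,         // =
    Gt,         // >
    GtEq,       // >=
    Lt,         // <
    LtEq,       // <=
    Tilde,      // ~
    Compatible, // ^
}

impl FromStr for Op {
    type Err = String;

    fn from_str(s: &str) -> Result<Op, String> {
        match s {
            "=" => Ok(Op::Ex),
            ">" => Ok(Op::Gt),
            ">=" => Ok(Op::GtEq),
            "<" => Ok(Op::Lt),
            "<=" => Ok(Op::LtEq),
            "~" => Ok(Op::Tilde),
            "^" => Ok(Op::Compatible),
            _ => Err(format!("Could not parse Op: {}", s)),
        }
    }
}
```

Once `FromStr` is implemented, `str::parse` works for the type, so `"=".parse::<Op>()` yields `Ok(Op::Ex)`.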
-
range.2.summary
Add enums and structs for range parsing Let's define all the nifty types we'll need! src/range.rs ------------ We've described some of these back in the introduction to range parsing, but a little more won't hurt. `VersionReq` contains one or more predicates. This is what we need to supply to the semver crate. `Predicate` is similar to the `Version` struct we used in the version parsing module, but the types are slightly different. We have added the `Op` field, and the minor and patch versions are now optional. `Op` stands for operator. The operator defines how to match a specific version to this predicate. `WildcardVersion` is used for the `Wildcard` operator. There might be a wildcard at the major, minor, or patch version. Since those fields are numeric types, we can't assign a wildcard to the field directly; instead, we record it in the operator. We'll also pull in `version::Identifier`, for use with the prerelease strings. I wonder if that should have been `common::Identifier`. Hmm.
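The types described above might be laid out roughly as follows. This is a sketch under assumptions: the `predicates` field name, the `u64` integer type, and every `Op` variant name other than `Ex`, `Compatible`, and `Wildcard` are guesses, and `Identifier` is repeated locally so the snippet compiles on its own.

```rust
/// Stand-in for `version::Identifier`, repeated so the sketch compiles alone.
#[derive(Debug, PartialEq)]
pub enum Identifier {
    Numeric(u64),
    AlphaNumeric(String),
}

/// What gets handed back to the semver crate.
pub struct VersionReq {
    pub predicates: Vec<Predicate>,
}

/// Which position the wildcard appeared in.
pub enum WildcardVersion {
    Major,
    Minor,
    Patch,
}

/// How a version is compared against a predicate.
pub enum Op {
    Ex,
    Gt,
    GtEq,
    Lt,
    LtEq,
    Tilde,
    Compatible,
    Wildcard(WildcardVersion),
}

pub struct Predicate {
    pub op: Op,
    pub major: u64,
    pub minor: Option<u64>, // optional: a range may omit it
    pub patch: Option<u64>, // optional: a range may omit it
    pub pre: Vec<Identifier>,
}
```

Under these assumptions, a range like `1.*` could be represented as a `Predicate` with `op: Op::Wildcard(WildcardVersion::Minor)`, `major: 1`, and `None` for both `minor` and `patch`.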
-
version.5.summary
Properly parse prerelease and build tags Time to fix the types on our prerelease and build tags. Instead of just reading the whole thing into a string, we will break them apart on periods into either alphanumeric or numeric identifiers. src/version.rs -------------- Semver's prerelease and build tag [specification][0] describes tags as either alphanumeric or numeric. Instead of using the type `Option<Vec<String>>` for the `pre` and `build` fields, let's create the `Identifier` enum. Identifiers may be either `AlphaNumeric` or `Numeric`. Note that we removed the `Option` around the type of `pre` and `build`. Our tests expect just `Vec<Identifier>`, so we will return an empty vector if nothing is present, rather than a `None`. The `unwrap_or(vec![])` on the lines assigning `pre` and `build` assigns them empty vectors when no match was found. Since we have to parse out the metadata twice, we'll use the `parse_meta` function. The operation is fairly straightforward. Just split on periods. If all the characters in a piece are digits, call it a `Numeric`; otherwise, call it an `AlphaNumeric`. [0]: http://semver.org/#spec-item-9
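A minimal sketch of the identifier parsing described above is shown below. The `u64` payload for `Numeric` and the helper `tag_or_empty` are assumptions; the helper only exists to show how an optional capture collapses to an empty vector.

```rust
/// A single dot-separated piece of a prerelease or build tag.
#[derive(Debug, PartialEq)]
pub enum Identifier {
    Numeric(u64), // u64 payload is an assumption
    AlphaNumeric(String),
}

/// Split a tag on periods and classify each piece.
pub fn parse_meta(pre: &str) -> Vec<Identifier> {
    pre.split('.')
        .map(|part| {
            // Check digits explicitly: `"+5".parse::<u64>()` would succeed,
            // but `+5` is not an all-digit identifier.
            let all_digits = !part.is_empty() && part.bytes().all(|b| b.is_ascii_digit());
            match part.parse::<u64>() {
                Ok(n) if all_digits => Identifier::Numeric(n),
                // Non-digit pieces, and digit runs that overflow u64,
                // are kept as text in this sketch.
                _ => Identifier::AlphaNumeric(part.to_string()),
            }
        })
        .collect()
}

/// Hypothetical helper: a missing capture becomes an empty vector, not `None`.
pub fn tag_or_empty(captured: Option<&str>) -> Vec<Identifier> {
    captured.map(parse_meta).unwrap_or(vec![])
}
```

For example, `parse_meta("alpha.1")` yields an `AlphaNumeric` identifier followed by a `Numeric` one, and `tag_or_empty(None)` yields an empty vector.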
-
version.4.summary
Add support for build tags This should look familiar. The [build tag][0] is denoted by a plus sign followed by a series of dot-separated identifiers. We'll use the exact same pattern we followed to add prerelease tag support. [0]: http://semver.org/#spec-item-10
-
version.3.summary
Add support for prerelease tags Semver allows [prerelease tags][0], denoted by appending a hyphen and a series of dot-separated identifiers immediately following the patch version. Let's add support for them! Or at least partial support. This might take a couple steps. src/version.rs -------------- The allowed characters in the prerelease label are alphanumerics, hyphens, and periods. We'll call that `letters_numbers_dash_dot` and tack on an optional prerelease version in our regex. Since that line was getting a bit long, we'll add *x mode*, which allows us to describe the regex over multiple lines, with comments for each important bit. We also have to add the `pre` field to the `Version` struct and capture it from the regex. For now, we're just reading the whole string into an optional single-element vector. Later on, we will have to break it apart into individual dot-separated identifiers. [0]: http://semver.org/#spec-item-9
-
version.2.summary
Disallow leading zeros and return Result When we started applying our test suite to the result of the previous step, we found an error. Time to fix it! src/lib.rs ---------- Remember the `lazy_static` dependency we listed in `Cargo.toml`? The [lazy static][0] crate gives us a macro that allows us to declare statics that are evaluated at runtime. *What?* A static in Rust is like a global constant. Statics live for the entire lifetime of your program at a fixed memory location. Normally they are limited to simple values, but with the `lazy_static!` macro, we can create statics that require function calls or heap allocations. We'll use it to make a static `Regex` that matches semver version strings. src/version.rs -------------- There are two separate things happening at once here. We made some upgrades to our regex, then changed our return type. First, we moved the regex creation up out of `parse_version` into the `lazy_static!` macro. The result of the macro is the static called `REGEX` which we can use to match version strings. The regex declaration went from one line to over a dozen, and we switched from matching on `\d+` to matching on the more sophisticated `numeric_identifier` pattern. This pattern allows a plain zero or a number starting with any other digit, but not a number with a leading zero. Back in `parse_version`, we have changed from an `unwrap()` on `REGEX.captures()` to a match. The match lets us return an error string when the supplied version number is not matched by the regex. In the version of our program from the previous step, any error in matching the regex would have caused a panic. Since we can now return either a `Version` on success or a `String` on an error, we have changed the return type from `Version` to `Result<Version, String>`. When our result is an error, we `return Err(...)`, and when the result is a Version, we `return Ok(Version...)`. 
*How did you pass any tests if you had the wrong return type at the previous step!* This is hard; we'll get there eventually! [0]: https://crates.io/crates/lazy_static
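To illustrate the leading-zero rule and the switch to `Result` without pulling in the `regex` or `lazy_static` crates, here is a hand-written sketch. It swaps the regex for plain string checks, so it is not the tutorial's implementation; the `u64` fields, the helper name `is_numeric_identifier`, and the error text are assumptions.

```rust
// The derives are only here so the example can be compared in assertions.
#[derive(Debug, PartialEq)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
}

/// Hand-written equivalent of the `numeric_identifier` rule:
/// a plain zero, or a run of digits that does not start with zero.
fn is_numeric_identifier(s: &str) -> bool {
    !s.is_empty()
        && s.bytes().all(|b| b.is_ascii_digit())
        && (s == "0" || !s.starts_with('0'))
}

/// Returns an error string instead of panicking on bad input.
pub fn parse_version(version: &str) -> Result<Version, String> {
    let parts: Vec<&str> = version.split('.').collect();
    if parts.len() != 3 || !parts.iter().all(|p| is_numeric_identifier(p)) {
        return Err(format!("Version did not parse properly: {}", version));
    }
    // Parsing can still fail on overflow, so map that error to a string too.
    let num = |p: &str| p.parse::<u64>().map_err(|e| e.to_string());
    Ok(Version {
        major: num(parts[0])?,
        minor: num(parts[1])?,
        patch: num(parts[2])?,
    })
}
```

With this shape, `parse_version("01.2.3")` returns an `Err` where the earlier `unwrap()`-based version would have either accepted the input or panicked.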
-
version.1.summary
Humble beginnings Time to write some Rust code! It's all very basic for now, but we have to start somewhere. src/lib.rs ---------- Since this is a library crate, we list our public modules in `src/lib.rs`. For now, that's just our `version` module. While we're at it, let's pull in the `regex` crate we listed as a dependency. If we list it here, we can use it from any future modules, like the range parsing one we will make in a later step. src/version.rs -------------- There are three main components here. On the first line, we bring the `Regex` type into our local scope so we can refer to it with a short name. Next, we introduce a basic `Version` struct. If you're reading carefully, you'll see we don't even have the `pre` and `build` fields yet. Let's just start here and add those in later. We don't have to pass all the tests until we're done! Note that `Version` and its fields are all marked public. Unlike some other languages, in Rust you must mark each field of a struct public if you want to expose it. Our users will be able to use the struct and all its fields. The `parse_version` function uses `Regex` to extract the major, minor, and patch levels from a version string. This isn't robust enough to handle all of semver yet, but we have a mostly working module in under 20 lines of code. Are you worried about all those calls to `unwrap()`? You should be! If an error occurs, the `unwrap()` will panic and crash our program. We'll have to implement error handling at some point, but for now let's just keep things simple. *But the function is supposed to be `version::parse`, you called it `parse_version`!* Oops, we'll have to fix that in a future step.
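For a feel of what this first step looks like, here is a small sketch that uses `split('.')` from the standard library in place of the `regex` crate, so it compiles on its own. The `u64` field type and the splitting approach are assumptions; only the public fields and the liberal use of `unwrap()` mirror the description above.

```rust
/// Sketch of the first-step struct: only major, minor, and patch for now.
/// The `u64` field type is an assumption made for this example.
pub struct Version {
    pub major: u64, // each field needs its own `pub` to be visible to users
    pub minor: u64,
    pub patch: u64,
}

/// Stand-in for the regex-based `parse_version`, using `split` instead.
/// Like the first step of the tutorial, it panics on malformed input.
pub fn parse_version(version: &str) -> Version {
    let mut parts = version.split('.');
    Version {
        // `next()` returns None when a component is missing, and `parse()`
        // returns Err on non-numeric text; either one makes `unwrap()` panic.
        major: parts.next().unwrap().parse().unwrap(),
        minor: parts.next().unwrap().parse().unwrap(),
        patch: parts.next().unwrap().parse().unwrap(),
    }
}
```

Calling `parse_version("1.2")` panics on the missing patch component, which is exactly the weakness the text warns about.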
-
start093f4443 · Base commit ·
-