Skip to content
Snippets Groups Projects

Ticket18571 029 03

Closed David Goulet requested to merge ticket18571_029_03 into master

Initial review of the encoding part of the descriptor. Decoding is a work in progress.

Merge request reports

Closed by avatar (Apr 15, 2025 9:38pm UTC)

Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
43 tor_assert(cert);
44 tor_assert(cert_str_out);
45
46 /* Get the encoded size and add the NULL byte. */
47 ed_cert_b64_len = base64_encode_size(cert->encoded_len,
48 BASE64_ENCODE_MULTILINE) + 1;
49 ed_cert_b64 = tor_malloc_zero(ed_cert_b64_len);
50
51 /* Base64 encode the encoded certificate. */
52 ret = base64_encode(ed_cert_b64, ed_cert_b64_len,
53 (const char *) cert->encoded, cert->encoded_len,
54 BASE64_ENCODE_MULTILINE);
55 if (ret < 0) {
56 log_err(LD_BUG, "Couldn't base64-encode descriptor signing key cert!");
57 goto err;
58 }
  • Oh man our base64 string API is really terrible. It's like a timebomb waiting for us to mess up.

    I don't like adding 1 to the value returned by a function that is supposed to give us the right length. I also don't like subtracting 1 when we want to check the retval of the base64_encode() function.

    Just in this patch, I see the following pattern three times:

    len = base64_encoded_size() + 1
    buf = tor_malloc_zero(len)
    ret = base64_encode()
    tor_assert(ret == len - 1)

    I wonder if we could functionify this somehow. I think this is especially worthy if we need to perform this pattern more times in the future.

  • See ticket #19531 about this.

  • asn
    asn @asn started a thread on the diff
  • 31 * section. */
    32 static const char *str_enc_hsdir_data = "hsdir-encrypted-data";
    33
    34 /* Encode an ed25519 certificate type and put the newly allocated string in
    35 * cert_str_out. Return 0 on success else a negative value. */
    36 STATIC int
    37 encode_cert(const tor_cert_t *cert, char **cert_str_out)
    38 {
    39 int ret;
    40 char *ed_cert_b64 = NULL;
    41 size_t ed_cert_b64_len;
    42
    43 tor_assert(cert);
    44 tor_assert(cert_str_out);
    45
    46 /* Get the encoded size and add the NULL byte. */
    • You probably mean NUL where you say NULL here. In the Tor codebase, we've been trying to consistently use NULL for the C pointer type, and NUL for the \x00 byte. It's a small distinction but important to get right.

      There are many occurences of this mixup on this patch.

      Edited by asn
    • Fixed them all in fixup commit.

  • asn
    asn @asn started a thread on the diff
  • 259 /* Build the secret input for the KDF computation. */
    260 build_secret_input(desc, secret_input, sizeof(secret_input));
    261
    262 xof = crypto_xof_new();
    263 /* Feed our KDF. [SHAKE it like a polaroid picture --Yawning]. */
    264 crypto_xof_add_bytes(xof, secret_input, sizeof(secret_input));
    265 crypto_xof_add_bytes(xof, salt, salt_len);
    266 crypto_xof_add_bytes(xof, (const uint8_t *) str_enc_hsdir_data,
    267 strlen(str_enc_hsdir_data));
    268 /* Eat from our KDF. */
    269 crypto_xof_squeeze_bytes(xof, key_out, key_out_len);
    270 crypto_xof_free(xof);
    271 memwipe(secret_input, 0, sizeof(secret_input));
    272 }
    273
    274 /* Given a buffer plaintext, pad it up to the encrypted section padding
  • asn
    asn @asn started a thread on the diff
  • 275 * requirement. Set the newly allocated string in padded_out and return the
    276 * length of the padded buffer. */
    277 static size_t
    278 build_plaintext_padding(const char *plaintext, size_t plaintext_len,
    279 uint8_t **padded_out)
    280 {
    281 size_t padded_len;
    282 uint8_t *padded;
    283
    284 tor_assert(plaintext);
    285 tor_assert(padded_out);
    286
    287 /* Compute the final length we need with padding. */
    288 padded_len = plaintext_len +
    289 (HS_DESC_ENCRYPTED_PADDING_PLAINTEXT_MULTIPLE -
    290 (plaintext_len % HS_DESC_ENCRYPTED_PADDING_PLAINTEXT_MULTIPLE));
    • Would be nice to abstract this calculation into its own function, or just unit test this function to get more confidence.

    • Fixed. I added a compute_padding_length() that does a cleaner version of this computation with some overflow check as well!

  • asn
    asn @asn started a thread on the diff
  • 327 encrypted = tor_malloc_zero(encrypted_len);
    328 /* This can't fail. */
    329 crypto_cipher_encrypt(cipher, (char *) encrypted,
    330 (const char *) padded_plaintext, encrypted_len);
    331 *encrypted_out = encrypted;
    332 /* Cleanup. */
    333 crypto_cipher_free(cipher);
    334 tor_free(padded_plaintext);
    335 return encrypted_len;
    336 }
    337
    338 /* Encrypt the given plaintext buffer and using the descriptor to get the
    339 * keys. Set encrypted_out with the encrypted data and return the length of
    340 * it. */
    341 static size_t
    342 encrypt_data(const hs_descriptor_t *desc, const char *plaintext,
  • asn
    asn @asn started a thread on the diff
  • 409 memcpy(final_blob + offset, mac, sizeof(mac));
    410 offset += sizeof(mac);
    411 /* Cleanup the buffers. */
    412 memwipe(salt, 0, sizeof(salt));
    413 memwipe(encrypted, 0, encrypted_len);
    414 tor_free(encrypted);
    415 /* Extra precaution. */
    416 tor_assert(offset == final_blob_len);
    417
    418 *encrypted_out = final_blob;
    419 return final_blob_len;
    420 }
    421
    422 /* Take care of encoding the encrypted data section from the plaintext
    423 * creation buffer to actually encrypting it with the descriptor's key. A
    424 * newly allocated NULL terminated string containing the encoded encrypted
  • asn
    asn @asn started a thread on the diff
  • 435 tor_assert(desc);
    436 tor_assert(encrypted_blob_out);
    437
    438 /* Build the start of the section prior to the introduction points. */
    439 {
    440 if (!desc->encrypted_data.create2_ntor) {
    441 log_err(LD_BUG, "HS desc doesn't have recognized handshake type.");
    442 goto err;
    443 }
    444 tor_asprintf(&line_str, "%s %d\n", str_create2_formats,
    445 ONION_HANDSHAKE_TYPE_NTOR);
    446 smartlist_add(lines, line_str);
    447
    448 if (desc->encrypted_data.auth_types &&
    449 smartlist_len(desc->encrypted_data.auth_types)) {
    450 /* Put the authentication-required line. */
  • asn
    asn @asn started a thread on the diff
  • 487 BASE64_ENCODE_MULTILINE);
    488 /* Return length doesn't count the NULL byte. */
    489 tor_assert(ret_len == (enc_b64_len - 1));
    490 tor_free(encrypted_blob);
    491 *encrypted_blob_out = enc_b64;
    492 }
    493 /* Success! */
    494 ret = 0;
    495
    496 err:
    497 SMARTLIST_FOREACH(lines, char *, l, tor_free(l));
    498 smartlist_free(lines);
    499 return ret;
    500 }
    501
    502 /* Encode the given descriptor desc and on success encoded_out points
  • asn
    asn @asn started a thread on the diff
  • 504 * descriptor.
    505 *
    506 * Return 0 on success and encoded_out is a valid pointer. On error, -1 is
    507 * returned and encoded_out is untouched. */
    508 int hs_desc_encode_descriptor(const hs_descriptor_t *desc,
    509 char **encoded_out)
    510 {
    511 int ret;
    512 char *encoded_str = NULL, *line_str;
    513 size_t encoded_len;
    514 smartlist_t *lines = smartlist_new();
    515
    516 tor_assert(desc);
    517 tor_assert(encoded_out);
    518
    519 /* Make sure we support the version of the descriptor format. */
    • Should we verify the correctness and consistency of hs_descriptor_t before encoding it?

      I see some verification happening here and there, but that makes me wonder why there is not more. For example, we check the descriptor version, the certificate type, the handshake type, etc. If we assume we have a verified descriptor when we get to this point, then we shouldn't need any verification right? Otherwise, if verification is needed, maybe we should make a function that verifies the descriptor completely?

      Or maybe the verification performed here and there is all the verification required?

      (e.g. It seems like there is no code that touches the subcredential field right now, but no one really notices even though it gets memcpy()ed. Should we?)

    • So the idea is that the service will call this function once the hs_descriptor_t has been populated. This is code that doesn't exists yet and that we'll add at the service implementation step. So, this function does basic verification indeed but maybe we should probably assume that the given descriptor has been validated prior. Maybe a validated flag would be useful in there and we assert on it thus allowing us to remove checks in there. We did something similar with the commit in prop250 code where it has to be valid when we encode it for publication.

      Subcreds are considered a bunch of bytes and we only use it in the kdf computation. It's assumed that it's fine and service did set it to 0 or with some valid stuff.

  • asn
    asn @asn started a thread on the diff
  • 524
    525 /* Build the non-encrypted values. */
    526 {
    527 char *encoded_cert;
    528 /* Encode certificate then create the first line of the descriptor. */
    529 if (desc->plaintext_data.signing_key_cert->cert_type
    530 != CERT_TYPE_HS_DESC_SIGN) {
    531 log_err(LD_BUG, "HS descriptor signing key has an unexpected cert type "
    532 "(%d)", (int) desc->plaintext_data.signing_key_cert->cert_type);
    533 ret = -1;
    534 goto err;
    535 }
    536 ret = encode_cert(desc->plaintext_data.signing_key_cert,
    537 &encoded_cert);
    538 if (ret < 0) {
    539 /* The function will print error logs. */
    • Hmm. I'm not sure if I like this pattern of "Return the retval of the function above" especially since we've been historically inconsistent about retvals in the tor codebase. I see this pattern a lot in this patch, but we don't actually do it in any other parts of tor AFAIK.

      I don't actually think there are any errors in this particular instance, but I would prefer to not mix the retvals of the called functions with the retval that we need to return from this function.

      Instead, I would prefer a more clean approach where we keep our own retval (by default set to -1), and if everything goes well we set it to 0 just before returning. If we need to check the return values of any called functions we should use a separate variable for those. As an example see rend_cache_store_v2_desc_as_service().

      Let me know if you think this is a silly idea.

    • Right I'm for setting explicitly ret to -1 on error which is what is documented to be returned on error. I'm not a big fan of setting a default value for ret because what I like is that if we forget to set ret, the compiler tells us but if it's defined in the first place, compiler misses.

      However, as long as we keep the pattern of only setting it to ret = 0 at the very end when all failing conditions pass, I like it. I think we shouldn't use the ret = -1 by default patterns if for instance we return in the middle of the function or we have different return code. When it's binary like this that is 0 or -1, I'm all for it.

      I've fixed most of the file to be consistent.

  • asn
    asn @asn started a thread on the diff
  • 530 != CERT_TYPE_HS_DESC_SIGN) {
    531 log_err(LD_BUG, "HS descriptor signing key has an unexpected cert type "
    532 "(%d)", (int) desc->plaintext_data.signing_key_cert->cert_type);
    533 ret = -1;
    534 goto err;
    535 }
    536 ret = encode_cert(desc->plaintext_data.signing_key_cert,
    537 &encoded_cert);
    538 if (ret < 0) {
    539 /* The function will print error logs. */
    540 goto err;
    541 }
    542 /* Create the hs descriptor line. */
    543 tor_asprintf(&line_str, "%s %" PRIu32, str_hs_desc,
    544 desc->plaintext_data.version);
    545 smartlist_add(lines, line_str);
    • OK, I think this might be a good time to let you know about smartlist_add_asprintf() :)

      I see the pattern tor_asprintf() ; smartlist_add() about 20 times in this patch, so using that specialized function might actually decrease the size of your patch by a a bit :portable_stereo:

    • Amazing, will fix all!

  • asn
    asn @asn started a thread on the diff
  • 57 /* Introduction point information located in a descriptor. */
    58 typedef struct hs_desc_intro_point_t {
    59 /* Link specifier(s) which details how to extend to the relay. This list
    60 * contains hs_desc_link_specifier_t object. It MUST have at least one. */
    61 smartlist_t *link_specifiers;
    62
    63 /* Authentication key used to establish the introduction point circuit and
    64 * cross-certifies the blinded public key for the replica. */
    65 tor_cert_t *auth_key_cert;
    66
    67 /* Encryption key used to encrypt request to hidden service. */
    68 curve25519_keypair_t enc_key;
    69
    70 /* Backward compat: RSA 1024 encryption key for legacy purposes.
    71 * Mutually exclusive with enc_key. */
    72 crypto_pk_t *enc_key_legacy;
    • Hmm, I see we used a union in the data structure above? Maybe we could put the two enc_keys in a union as well? IIRC they are mutually exclusive.

    • Great idea. I've created this so we access encryption keys with some semantic like: ip->enc_key.legacy or ip->enc_key.curve25519.

        union {
          /* Encryption key used to encrypt request to hidden service. */
          curve25519_keypair_t curve25519;
      
          /* Backward compat: RSA 1024 encryption key for legacy purposes.
           * Mutually exclusive with enc_key. */
          crypto_pk_t *legacy;
        } enc_key;

      and it also adds a key type in there so we know which one to use!

      Edited by David Goulet
  • asn
    asn @asn started a thread on the diff
  • 99 /* Certificate with the short-term ed22519 descriptor signing key for the
    100 * replica which is signed by the blinded public key for that replica. */
    101 tor_cert_t *signing_key_cert;
    102
    103 /* Signing keypair which is used to sign the descriptor. Same public key
    104 * as in the signing key certificate. */
    105 ed25519_keypair_t signing_kp;
    106
    107 /* Blinded keypair used for this descriptor derived from the master
    108 * identity key and generated for a specific replica number. */
    109 ed25519_keypair_t blinded_kp;
    110
    111 /* Revision counter is incremented at each upload, regardless of whether
    112 * the descriptor has changed. This avoids leaking whether the descriptor
    113 * has changed. Spec specifies this as a 4 bytes positive integer. */
    114 uint32_t revision_counter;
    • Hmm, where is it defined that the revision_counter is 4 bytes? Can't find it in prop224.

      If anything, I see it being applied with INT_8 which would hint to a uint64_t.

      I'd personally go with 64-bits as well, so that in the future we can do randomization tricks like the one that teor tried to pull off.

    • Agree. I really thought that was a 64 bit int! Good find.

  • Added prop224 label

  • David Goulet Status changed to closed

    Status changed to closed

  • Please register or sign in to reply
    Loading