MSMMS parsing buffer overflow
[AHA!](https://takeonme.org) has discovered an issue with Wireshark, and is issuing this disclosure in accordance with AHA!'s standard [disclosure policy](https://takeonme.org/cve.html) on 2023-05-16. CVE-2023-0667 has been assigned to this issue. Any questions about this disclosure should be directed to cve@takeonme.org.

# Executive Summary

Due to a failure to validate the length provided by an attacker-crafted [MSMMS](https://wiki.wireshark.org/MSMMS.md) packet, Wireshark version 4.0.5 and prior, by default, is susceptible to a heap-based buffer overflow, and possibly code execution in the context of the process running Wireshark. CVE-2023-0667 appears to be an instance of CWE-122.

# Technical Details

On line 391, a command ID is retrieved from the packet at offset 0x24 (36).

`/wireshark/epan/dissectors/packet-ms-mms.c`
```
388
389     /* Read command ID and direction now so can give common command header a
390        descriptive label */
391     command_id = tvb_get_letohs(tvb, 36);
392     command_dir = tvb_get_letohs(tvb, 36+2);
393
394
395     /*************************/
396     /* Common command header */
```

Then, on line 441, a length is retrieved from the MSMMS packet payload. In our crash file, this length value is 0x4.

`/wireshark/epan/dissectors/packet-ms-mms.c`
```
436     /* Timestamp */
437     proto_tree_add_item(msmms_common_command_tree, hf_msmms_command_timestamp, tvb, offset, 8, ENC_LITTLE_ENDIAN);
438     offset += 8;
439
440     /* Another length remaining field... */
441     length_remaining = tvb_get_letohl(tvb, offset);
442     proto_tree_add_item(msmms_common_command_tree, hf_msmms_command_length_remaining2, tvb, offset, 4, ENC_LITTLE_ENDIAN);
443     offset += 4;
444
```

Next, on line 471, the length is multiplied by 8 and then 8 is subtracted from the result, leaving a value of 0x18 (24).

`/wireshark/epan/dissectors/packet-ms-mms.c`
```
461     /* Show summary in info column */
462     col_append_fstr(pinfo->cinfo, COL_INFO,
463                     "seq=%03u: %s %s",
464                     sequence_number,
465                     (command_dir == TO_SERVER) ? "-->" : "<--",
466                     (command_dir == TO_SERVER) ?
467                         val_to_str_const(command_id, to_server_command_vals, "Unknown") :
468                         val_to_str_const(command_id, to_client_command_vals, "Unknown"));
469
470     /* Adjust length_remaining for command-specific details */
471     length_remaining = (length_remaining*8) - 8;
472
```

Following the `length_remaining` calculation, the `command_id` retrieved earlier is used to determine the command type, and on line 480 `dissect_client_transport_info` is called.

`/wireshark/epan/dissectors/packet-ms-mms.c`
```
473     /* Now parse any command-specific params */
474     if (command_dir == TO_SERVER)
475     {
476         /* Commands to server */
477         switch (command_id)
478         {
479             case SERVER_COMMAND_TRANSPORT_INFO:
480                 dissect_client_transport_info(tvb, pinfo, msmms_tree,
481                                               offset, length_remaining);
482                 break;
```

On entry to the `dissect_client_transport_info` function, `length_remaining` is 24 and the offset is 40.

`/wireshark/epan/dissectors/packet-ms-mms.c`
```
715 static void dissect_client_transport_info(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree,
716                                           guint offset, guint length_remaining)
717 {
718     char    *transport_info;
719     guint   ipaddr[4];
720     char    protocol[3+1] = "";
721     guint   port;
722     int     fields_matched;
723
```

On line 736, `length_remaining` (at this point still equal to 24) has 20 subtracted from it, leaving 4, which is then passed as the length value to the `tvb_get_string_enc` function, with the offset equal to 60.

`/wireshark/epan/dissectors/packet-ms-mms.c`
```
734
735     /* Extract and show the string in tree and info column */
736     transport_info = tvb_get_string_enc(pinfo->pool, tvb, offset, length_remaining - 20, ENC_UTF_16|ENC_LITTLE_ENDIAN);
737
738     proto_tree_add_string_format(tree, hf_msmms_command_client_transport_info, tvb,
739                                  offset, length_remaining-20,
740                                  transport_info, "Transport: (%s)", transport_info);
741
742     col_append_fstr(pinfo->cinfo, COL_INFO, " (%s)",
743                     format_text(pinfo->pool, (guchar*)transport_info, length_remaining - 20));
744
745
746     /* Try to extract details from this string */
747     fields_matched = sscanf(transport_info, "%*c%*c%u.%u.%u.%u%*c%3s%*c%u",
748                             &ipaddr[0], &ipaddr[1], &ipaddr[2], &ipaddr[3],
749                             protocol, &port);
```

Because the requested encoding is UTF-16, `tvb_get_string_enc` hands off to `tvb_get_utf_16_string`, which then calls `get_utf_16_string`, passing the length value (4).

`/wireshark/epan/tvbuff.c`
```
2840 static guint8 *
2841 tvb_get_utf_16_string(wmem_allocator_t *scope, tvbuff_t *tvb, const gint offset, gint length, const guint encoding)
2842 {
2843     const guint8 *ptr;
2844
2845     ptr = ensure_contiguous(tvb, offset, length);
2846     return get_utf_16_string(scope, ptr, length, encoding);
2847 }
```

Following the above, in `get_utf_16_string` on line 745, the `strbuf` variable is initialized through the `wmem_strbuf_new_sized` call.

`/wireshark/epan/charsets.c`
```
738 get_utf_16_string(wmem_allocator_t *scope, const guint8 *ptr, gint length, const guint encoding)
739 {
740     wmem_strbuf_t *strbuf;
741     gunichar2      uchar2, lead_surrogate;
742     gunichar       uchar;
743     gint           i;       /* Byte counter for string */
744
745     strbuf = wmem_strbuf_new_sized(scope, length+1);
746
747     for(i = 0; i + 1 < length; i += 2) {
748         if (encoding == ENC_BIG_ENDIAN)
749             uchar2 = pntoh16(ptr + i);
750         else
751             uchar2 = pletoh16(ptr + i);
752
```

In the `wmem_strbuf_new_sized` function, the `strbuf` is initially allocated with a size of 0x10.

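Before following the allocation itself, the length arithmetic that produces the 4-byte request can be recapped in a small stand-alone sketch. The helper names below are hypothetical and are not part of Wireshark; `guint` is modelled as `uint32_t`, which is an assumption about the platform.

```c
#include <stdint.h>

/* Hypothetical helper (not Wireshark code): mirrors line 471 of
 * packet-ms-mms.c, length_remaining = (length_remaining*8) - 8,
 * using 32-bit unsigned arithmetic as a stand-in for guint. */
static uint32_t msmms_adjusted_length(uint32_t length_field)
{
    return length_field * 8u - 8u;
}

/* Hypothetical helper (not Wireshark code): mirrors the
 * `length_remaining - 20` expression that line 736 passes to
 * tvb_get_string_enc() as the string length in bytes. */
static uint32_t transport_info_string_length(uint32_t length_remaining)
{
    return length_remaining - 20u;
}
```

With the crash file's length field of 4, `msmms_adjusted_length(4)` evaluates to 24 and `transport_info_string_length(24)` evaluates to 4, matching the values quoted in the walkthrough.
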
`wireshark/wsutil/wmem/wmem_strbuf.c`
```
30 wmem_strbuf_t *
31 wmem_strbuf_new_sized(wmem_allocator_t *allocator,
32                       size_t alloc_size)
33 {
34     wmem_strbuf_t *strbuf;
35
36     strbuf = wmem_new(allocator, wmem_strbuf_t);
37
38     strbuf->allocator = allocator;
39     strbuf->len = 0;
40     strbuf->alloc_size = alloc_size ? alloc_size : DEFAULT_MINIMUM_SIZE;
41
42     strbuf->str = (gchar *)wmem_alloc(strbuf->allocator, strbuf->alloc_size);
43     strbuf->str[0] = '\0';
44
45     return strbuf;
46 }
```

Then, back in the `get_utf_16_string` function, we enter the `for` loop, which determines the final length of `strbuf`. In the attached crash, this loop iterates twice, both times going through line 777 and hitting the `wmem_strbuf_append_unichar` function.

`wireshark/epan/charsets.c`
```
747     for(i = 0; i + 1 < length; i += 2) {
748         if (encoding == ENC_BIG_ENDIAN)
749             uchar2 = pntoh16(ptr + i);
750         else
751             uchar2 = pletoh16(ptr + i);
752
753         if (IS_LEAD_SURROGATE(uchar2)) {
754             /*
755              * Lead surrogate.  Must be followed by
756              * a trail surrogate.
757              */
758             i += 2;
759             if (i + 1 >= length) {
760                 /*
761                  * Oops, string ends with a lead surrogate.
762                  *
763                  * Insert a REPLACEMENT CHARACTER to mark the error,
764                  * and quit.
765                  */
766                 wmem_strbuf_append_unichar(strbuf, UNREPL);
767                 break;
768             }
769             lead_surrogate = uchar2;
770             if (encoding == ENC_BIG_ENDIAN)
771                 uchar2 = pntoh16(ptr + i);
772             else
773                 uchar2 = pletoh16(ptr + i);
774             if (IS_TRAIL_SURROGATE(uchar2)) {
775                 /* Trail surrogate. */
776                 uchar = SURROGATE_VALUE(lead_surrogate, uchar2);
777                 wmem_strbuf_append_unichar(strbuf, uchar);
778             } else {
```

Each call to `wmem_strbuf_append_unichar` increments the `strbuf->len` value (leaving a length of 2).

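The reason a 4-byte UTF-16 input can leave `strbuf->len` at only 2 is that the decoded string is stored as UTF-8, whose byte count is independent of the UTF-16 byte count. The following is a minimal sketch of that relationship, not the Wireshark implementation: the function names are hypothetical, the input bytes in the usage note are illustrative rather than taken from the crash file, and surrogate pairs are deliberately ignored (the real `get_utf_16_string` handles them).

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified model: number of bytes UTF-8 needs for a valid Unicode
 * code point (0 .. 0x10FFFF). Hypothetical helper, not a GLib API. */
static size_t utf8_encoded_length(uint32_t code_point)
{
    if (code_point < 0x80u)    return 1;
    if (code_point < 0x800u)   return 2;
    if (code_point < 0x10000u) return 3;
    return 4;
}

/* Total UTF-8 length of a little-endian UTF-16 byte string.
 * Assumption: every 16-bit unit is a non-surrogate BMP character,
 * so each unit is treated as one code point. A trailing odd byte
 * is ignored in this sketch. */
static size_t utf8_length_of_utf16le_bmp(const uint8_t *bytes, size_t byte_len)
{
    size_t total = 0;
    for (size_t i = 0; i + 1 < byte_len; i += 2) {
        uint32_t unit = (uint32_t)bytes[i] | ((uint32_t)bytes[i + 1] << 8);
        total += utf8_encoded_length(unit);
    }
    return total;
}
```

For example, the four illustrative bytes `41 00 42 00` ("AB" in UTF-16LE) give a UTF-8 length of 2, so any later code that still uses the original byte count of 4 as the length of the decoded string is working with a stale value.
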
`wireshark/wsutil/wmem/wmem_strbuf.c`
```
234 wmem_strbuf_append_unichar(wmem_strbuf_t *strbuf, const gunichar c)
235 {
236     gchar buf[6];
237     size_t charlen;
238
239     charlen = g_unichar_to_utf8(c, buf);
240
241     wmem_strbuf_grow(strbuf, charlen);
242
243     memcpy(&strbuf->str[strbuf->len], buf, charlen);
244     strbuf->len += charlen;
245     strbuf->str[strbuf->len] = '\0';
246 }
```

Finally, after the loop within `get_utf_16_string`, the `wmem_strbuf_finalize` function is called.

`/wireshark/epan/charsets.c`
```
804     }
805
806     /*
807      * If i < length, this means we were handed an odd number of bytes,
808      * so we're not a valid UTF-16 string; insert a REPLACEMENT CHARACTER
809      * to mark the error.
810      */
811     if (i < length)
812         wmem_strbuf_append_unichar(strbuf, UNREPL);
813     return (guint8 *) wmem_strbuf_finalize(strbuf);
814 }
```

The `wmem_strbuf_finalize()` function then calls `wmem_realloc()`, setting the size of the `strbuf->str` buffer to `strbuf->len`+1 (equal to 3).

`wireshark/wsutil/wmem/wmem_strbuf.c`
```
382 char *
383 wmem_strbuf_finalize(wmem_strbuf_t *strbuf)
384 {
385     if (strbuf == NULL)
386         return NULL;
387
388     char *ret = (char *)wmem_realloc(strbuf->allocator, strbuf->str, strbuf->len+1);
389
390     wmem_free(strbuf->allocator, strbuf);
391
392     return ret;
393 }
```

Following the above, a pointer to the buffer is returned to the `dissect_client_transport_info` function, where it is later used with a length of 4 bytes, allowing a 1-byte read past the end of the 3-byte buffer within the `format_text` function.

Base64'd blob of the proof of concept.
Note that this must be run with fuzzshark, as tshark will not crash:

```
Ex4JdM76C7DO+guwTU1TDXr/igDv/9+KigEAAA0BAAAEAAAAAgADAAAAAAAAAAABNAAYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA2wAa29vb29vb29vbEz4JAc4H
```

# Attacker Value

Passing the above blob to fuzzshark will trigger a heap overflow, and any crash in fuzzshark is necessarily a bug in Wireshark library code, which the Wireshark GUI application and TShark also use, as they all hit the same code paths. Therefore, we're confident that a specially crafted MSMMS packet that implements this crash behavior is almost certainly exploitable, but we would like to have confirmation from the vendor on the exploitability before assigning this CVE ID.

# Credit

This issue is being disclosed through the AHA! CNA and is credited to: zenofex and WanderingGlitch

# Timeline

* 2023-04-27 (Wed): Initial findings presented at the regularly scheduled AHA! meeting.
* 2023-05-17 (Wed): PoC validated and analysis completed for disclosure.
* 2023-05-18 (Thu): Disclosed to the vendor via email at security@wireshark.org.
* ... time marches on ....
* 2023-07-17 (Mon): (Planned) Public disclosure at https://takeonme.org/cve/CVE-2023-0667.html

----

Reproducers:

[cve-2023-0667.fuzz](/uploads/cf068db5c85322d4423f8a7a060c3d70/cve-2023-0667.fuzz)

[cve-2023-0667.pcap](/uploads/5a66d90a4fa1de851183aba750714b5c/cve-2023-0667.pcap)