Ui: Regex based textfile / hexdump importing

Paul Weiss requested to merge paulniklasweiss/wireshark:regex_text_import into master

An addition to the "Import from hexdump" feature that imports arbitrary dump formats, as long as they can be specified using a regex and the encoding is supported. I did my best not to change the original behaviour, but I cannot really test this since I can't find a reference for that format.

Besides the data field of each packet, the regex parser supports the capture groups time, seqno and dir, and sets the corresponding values from the strings these groups capture. The encodings supported right now are plain binary, octal, hexadecimal and base64. Plain here means no prefix like 0x. Whitespace and ':' are ignored.
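To make the encoding rules concrete, here is a minimal Python sketch of how a captured data field could be decoded. It is not the code from this merge request: the helper name `decode_field`, the encoding labels, and the assumption that octal uses three digits per byte and binary eight are all illustrative guesses.

```python
import base64
import re

# Hypothetical helper, not Wireshark's API: decode one captured data field.
_IGNORED = re.compile(r"[\s:]+")  # whitespace and ':' are skipped

# Assumed (base, digits per byte, alphabet) per plain encoding.
# The octal grouping of three digits per byte is a guess.
_PLAIN = {
    "BIN": (2, 8, set("01")),
    "OCT": (8, 3, set("01234567")),
    "HEX": (16, 2, set("0123456789abcdefABCDEF")),
}


def decode_field(text: str, encoding: str) -> bytes:
    """Strip ignored characters, then decode the rest into raw bytes."""
    cleaned = _IGNORED.sub("", text)
    if encoding == "BASE64":
        return base64.b64decode(cleaned, validate=True)
    base, width, alphabet = _PLAIN[encoding]
    if not set(cleaned) <= alphabet:
        # This also rejects prefixes such as 0x, since 'x' is not a digit.
        raise ValueError("invalid character in data field")
    if len(cleaned) % width:
        raise ValueError("incomplete byte at end of data")
    # bytes() raises ValueError if a group exceeds 255 (possible for octal).
    return bytes(int(cleaned[i:i + width], base)
                 for i in range(0, len(cleaned), width))
```

For example, `decode_field("a1 21:6c", "HEX")` yields the three bytes `a1 21 6c`, because the space and the colon are dropped before decoding.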

The only (intentional) modification to the original parser is that it now also supports %f for fractions of a second in the strptime(3) format, and no longer just uses whatever follows the timestamp.
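As an illustration of what a %f field does, Python's `datetime.strptime` happens to accept the same directive. This only demonstrates the format string, not the C parser changed in this merge request.

```python
from datetime import datetime

# %f consumes the fractional-second digits that follow the '.'.
stamp = datetime.strptime("0:00:00.265620", "%H:%M:%S.%f")

# The fraction ends up in the microsecond field of the parsed value.
print(stamp.second, stamp.microsecond)
```

With %f in the format, the fractional part is parsed as part of the timestamp itself rather than being taken from whatever text happens to follow it.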

The only change affecting other sources is the compiler option -Wno-override-init, enabled for C files in the top makefile. This is necessary to initialize the encoding translation tables to INVALID_VALUE (!= 0) and then overwrite the valid entries with the appropriate values. The alternative is manually counting the used values and inserting about 200 INVALID_VALUEs between them. This should probably only affect the file text_import.c, but I don't know how to tell CMake this.
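The "fill everything with INVALID_VALUE, then overwrite the valid entries" pattern can be sketched in Python as follows. This is only an analogy for the C tables: the sentinel value 0xFF and the name `hex_table` are assumptions, and the Python version needs no compiler flag because the overwrite is an ordinary assignment.

```python
INVALID_VALUE = 0xFF  # assumed sentinel; must not collide with a digit value

# Step 1: every one of the 256 possible input bytes starts out invalid.
hex_table = [INVALID_VALUE] * 256

# Step 2: overwrite only the entries that are valid hexadecimal digits.
for value, char in enumerate("0123456789abcdef"):
    hex_table[ord(char)] = value
    hex_table[ord(char.upper())] = value
```

A lookup such as `hex_table[ord("a")]` then returns 10, while any other character maps to the sentinel. In C, the second step re-initializes entries that the first step already set, which is the situation the override-init warning reports.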

As an example of how this can be used, assume a script dumped the following traffic:

> 0:00:00.265620 a130368b00000008006a88f86a0beaffe6f0b7faae9c295c9bb3f3ffeff0aefa36b2ff03c79f0000240c002002d005f97630
> 0:00:00.280836 a1216c8b00000000000089086b0b82020407
< 0:00:00.295459 a201080000000000000000080000000000000000000000000000000000000000000000000000000000000000000000000000
> 0:00:00.296982 a1303c8b00000008007088286b0bc1ffcbf0f9f
> 0:00:00.305644 a121718b0000000000008ba86a0b800800000000000000000000000000000000000000000000000000000000000000000000
< 0:00:00.319061 a2010900000000000000001000600000
> 0:00:00.330937 a130428b00000008007589186b0bb9ffd9f0fdfa3eb4295e99f3aaffd2f005fb54b297fdb81f0000220c0020ff6700828860
> 0:00:00.356037 a121788b0000000000008a186b0a901000600000100000584357313031313538323138313000000000000000000000000000

By specifying the format using a regex, a timestamp format, the direction characters and the encoding:

regex: ^(?<dir>[<>])\s(?<time>\d+:\d\d:\d\d.\d+)\s(?<data>[0-9a-fA-F]+)$
timestamp: %H:%M:%S.%f
dir in: <
dir out: >
encoding: HEX

This dump can be imported into Wireshark.
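The settings above can be tried outside Wireshark with a short Python sketch. It is an approximation under stated assumptions: Python's `re` module spells named groups `(?P<name>...)`, the data group is assumed to match one or more hex digits, and `parse_line` with its dictionary result is a hypothetical helper, not part of the importer.

```python
import re
from datetime import datetime

# The example pattern, rewritten with Python's (?P<name>...) group syntax.
LINE_RE = re.compile(
    r"^(?P<dir>[<>])\s(?P<time>\d+:\d\d:\d\d.\d+)\s(?P<data>[0-9a-fA-F]+)$"
)
TIME_FORMAT = "%H:%M:%S.%f"
DIR_IN = "<"  # "dir in" character from the settings


def parse_line(line):
    """Return direction, time and payload of one dump line, or None."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    stamp = datetime.strptime(match.group("time"), TIME_FORMAT)
    return {
        "inbound": match.group("dir") == DIR_IN,
        "time": stamp.time(),
        # HEX encoding: two digits per byte; an odd count raises ValueError.
        "data": bytes.fromhex(match.group("data")),
    }


packet = parse_line("> 0:00:00.280836 a1216c8b00000000000089086b0b82020407")
print(packet["inbound"], packet["time"], len(packet["data"]))
```

Each matching line becomes one packet with a direction flag, a timestamp and the decoded payload; lines that do not match the pattern are skipped by returning `None`.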

Edited by Paul Weiss
