- 02 Jul, 2022 1 commit
-
-
João Valverde authored
This removes unparsed name resolution during the semantic check because it feels like a hack to work around limitations in the language syntax that should be solved at the lexical level instead. We were interpreting unparsed values differently on the LHS and RHS. Now an unparsed value is always a field if it matches a registered field name (this matches the implementation in 3.6 and before). This requires slightly tightening the allowed filter names for protocols to avoid some common and potentially weird conflicts. Incidentally, this extends the set grammar to accept all entities. That is experimental and may be reverted in the future.
-
- 25 Jun, 2022 1 commit
-
-
João Valverde authored
This adds support for using the layers filter with field references.

Before:
    $ dftest 'ip.src != ${ip.src#2}'
    dftest: invalid character in macro name

After:
    $ dftest 'ip.src != ${ip.src#2}'
    Filter: ip.src != ${ip.src#2}

    Syntax tree:
    0 TEST_ALL_NE:
      1 FIELD(ip.src <FT_IPv4>)
      1 REFERENCE(ip.src#[2:1] <FT_IPv4>)

    Instructions:
    00000 READ_TREE ip.src <FT_IPv4> -> reg#0
    00001 IF_FALSE_GOTO 5
    00002 READ_REFERENCE_R ${ip.src <FT_IPv4>} #[2:1] -> reg#1
    00003 IF_FALSE_GOTO 5
    00004 ALL_NE reg#0 != reg#1
    00005 RETURN

This requires adding another level of complexity to references. When loading references we need to copy the 'proto_layer_num' and add the logic to filter on that. The "layer" sttype is removed and replaced by a new field sttype with support for a range. This is a nice cleanup for the semantic check and general simplification. The grammar is better too with this design. Range sttype is renam...
-
- 18 Apr, 2022 1 commit
-
-
João Valverde authored
This allows writing moderately complex expressions, for example a float epsilon test (#16483):

    Filter: {abs(_ws.ftypes.double - 1) / max(abs(_ws.ftypes.double), abs(1))} < 0.01

    Syntax tree:
    0 TEST_LT:
      1 OP_DIVIDE:
        2 FUNCTION(abs#1):
          3 OP_SUBTRACT:
            4 FIELD(_ws.ftypes.double)
            4 FVALUE(1 <FT_DOUBLE>)
        2 FUNCTION(max#2):
          3 FUNCTION(abs#1):
            4 FIELD(_ws.ftypes.double)
          3 FUNCTION(abs#1):
            4 FVALUE(1 <FT_DOUBLE>)
      1 FVALUE(0.01 <FT_DOUBLE>)

    Instructions:
    00000 READ_TREE _ws.ftypes.double -> reg#1
    00001 IF_FALSE_GOTO 3
    00002 SUBRACT reg#1 - 1 <FT_DOUBLE> -> reg#2
    00003 STACK_PUSH reg#2
    00004 CALL_FUNCTION abs(reg#2) -> reg#0
    00005 STACK_POP 1
    00006 IF_FALSE_GOTO 24
    00007 READ_TREE _ws.ftypes.double -> reg#1
    00008 IF_FALSE_GOTO 9
    00009 STACK_PUSH reg#1
    00010 CALL_FUNCTION abs(reg#1) -> reg#4
    00011 STACK_POP 1
    00012 IF_FALSE_GOTO 13
    00013 STACK_PUSH reg#4
    00014 STACK_PUSH 1 <FT_DOU...
-
- 14 Apr, 2022 1 commit
-
-
Changes the function calling convention to pass the first register number plus the number of registers that follow it sequentially. This allows functions with any number of arguments. Functions can still only return one value. Adds max() and min() functions to select the maximum/minimum value from any number of arguments, all of the same type. The functions accept literals too. The return type is the same as that of the first argument (which cannot be a literal).
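A rough illustration of the calling convention described above, written as a Python sketch rather than the actual DFVM code. The `call_function` and `df_max` helpers, the register dictionary, and the one-value-per-register model are all assumptions made for illustration.

```python
def call_function(func, registers, first_reg, nargs):
    """Call func with the values held in nargs consecutive registers.

    Only the first register number and the argument count are passed;
    the remaining registers are found by counting up from first_reg.
    """
    args = [registers[first_reg + i] for i in range(nargs)]
    return func(*args)


def df_max(*args):
    # Hypothetical stand-in for max(): every argument must share one type.
    if len({type(a) for a in args}) != 1:
        raise TypeError("all arguments must be of the same type")
    return max(args)


# Three arguments stored in registers 2, 3 and 4.
registers = {2: 10, 3: 42, 4: 7}
result = call_function(df_max, registers, first_reg=2, nargs=3)
print(result)
```

Because the callee only sees a start index and a count, adding a variadic function does not require a new instruction format in this model.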
-
- 12 Apr, 2022 1 commit
-
-
Instead of using a heuristic to decide whether the form ${...} is a macro or not, try to resolve the name to a registered protocol field and use that instead. This somewhat increases the surface for clobbering existing macro names with new field registrations, but we'll cross that bridge when we get to it. Rejecting protocol field types reduces this probability again, but a user who mistakenly tries to reference a protocol may not find it intuitive that the name is parsed as a macro. The reasons for rejecting FT_PROTOCOL types as not interesting field references are not very strong but it seems reasonable.

    $ dftest 'frame.number != ${frame.number}'
    Filter: frame.number != ${frame.number}

    Instructions:
    00000 READ_TREE frame.number -> reg#0
    00001 IF_FALSE_GOTO 5
    00002 READ_REFERENCE ${frame.number} -> reg#1
    00003 IF_FALSE_GOTO 5
    00004 ALL_NE reg#0 != reg#1
    00005 RETURN

    $ dftest 'frame != ${frame}'
    dftest: macro 'frame' does not exist
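The resolution order described above can be sketched as follows. This is a simplified Python model, not the real implementation: the `REGISTERED_FIELDS` and `MACROS` tables and the `resolve_sigil` function are hypothetical names, and the field types are plain strings.

```python
FT_PROTOCOL = "FT_PROTOCOL"
FT_UINT32 = "FT_UINT32"

# Hypothetical registry mapping field names to field types.
REGISTERED_FIELDS = {"frame": FT_PROTOCOL, "frame.number": FT_UINT32}
# Hypothetical macro table; the body is a placeholder.
MACROS = {"private_ipv4": "<macro body>"}


def resolve_sigil(name):
    """Decide what ${name} means: a field reference or a macro."""
    ftype = REGISTERED_FIELDS.get(name)
    # A registered field wins, unless it is a protocol (FT_PROTOCOL).
    if ftype is not None and ftype != FT_PROTOCOL:
        return ("reference", name)
    if name in MACROS:
        return ("macro", name)
    raise LookupError(f"macro '{name}' does not exist")


print(resolve_sigil("frame.number"))
print(resolve_sigil("private_ipv4"))
```

In this model `${frame}` falls through to the macro lookup and fails there, which mirrors the error message in the second dftest run.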
-
- 11 Apr, 2022 1 commit
-
-
If we don't have an offset, don't print anything with underline. Also, filters that use macros are now underlined correctly.

    $ tshark -Y 'ip and ${private_ipv4:ip.sr}' -r /dev/null
    tshark: Left side of "==" expression must be a field or function, not "ip.sr".
        ip and ip.sr == 192.168.0.0/16 or ip.sr == 172.16.0.0/12 or ip.sr == 10.0.0.0/8
               ^~~~~
-
- 10 Apr, 2022 2 commits
-
-
João Valverde authored
Add location tracking as a column offset and length from offset to the scanner. Our input is a single line only so we don't need to track line offset. Record that information in the syntax tree. Return the error location in dfilter_compile(). Use it in dftest to mark the location of the error in the filter string. Later it would be nice to use the location in the GUI as well.

    $ dftest "ip.proto == aaaaaa and tcp.port == 123"
    Filter: ip.proto == aaaaaa and tcp.port == 123
    dftest: "aaaaaa" cannot be found among the possible values for ip.proto.
        ip.proto == aaaaaa and tcp.port == 123
                    ^~~~~~
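As a small illustration of how a column offset and length are enough to mark an error in a single-line filter, here is a Python sketch. The `mark_error` function and its parameters are hypothetical and only mimic the shape of the dftest output above.

```python
def mark_error(filter_text, col_start, col_len, indent="    "):
    """Return the filter followed by a caret/tilde marker under the error span."""
    if col_len < 1:
        # No usable location: print the filter without any underline.
        return f"{indent}{filter_text}"
    marker = " " * col_start + "^" + "~" * (col_len - 1)
    return f"{indent}{filter_text}\n{indent}{marker}"


text = "ip.proto == aaaaaa and tcp.port == 123"
start = text.index("aaaaaa")  # column offset of the offending token
print(mark_error(text, start, len("aaaaaa")))
```

Since the input is one line, no line number is needed; the marker is just padding followed by one caret and `col_len - 1` tildes.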
-
João Valverde authored
Revert to passing a syntax node from the lexical scanner to the grammar parser. Using a union has no discernible advantage and requires duplicating a lot of properties of syntax nodes.
-
- 06 Apr, 2022 2 commits
-
-
João Valverde authored
-
João Valverde authored
-
- 05 Apr, 2022 1 commit
-
-
João Valverde authored
Add argument to dfilter_compile_real() to save syntax tree text representation. Use it with dftest to print syntax tree. Misc debug output format improvements.
-
- 31 Mar, 2022 1 commit
-
-
João Valverde authored
-
- 30 Mar, 2022 1 commit
-
-
João Valverde authored
By the time we are using the reference fvalue the tree may have gone away and with it the fvalue. We need to duplicate the reference fvalues and take ownership of the memory.
-
- 29 Mar, 2022 1 commit
-
-
This replaces the current macro reference system with a completely different implementation. Instead of a macro, a reference is a syntax element. A reference is a constant that can be filled in the dfilter code after compilation from an existing protocol tree. It is best understood as a field value that can be read from a fixed tree that is not the frame being filtered. Usually this fixed tree is the currently selected frame when the filter is applied. This allows comparing fields in the filtered frame with fields in the selected frame. Because the field reference syntax uses the same sigil notation as a macro we have to use a heuristic to distinguish them: if the name has a dot it is a field reference, otherwise it is a macro name. The reference is syntactically validated at compile time. There are two main advantages to this implementation (and a couple of minor ones): The protocol tree for each selected frame is only walked if we have a display filter and if t...
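The idea of a reference as a constant that is filled in after compilation can be modelled roughly like this. The sketch is Python, not Wireshark code: the `CompiledFilter` class, its methods, the dict-based frames, and the exact matching rule are assumptions for illustration.

```python
class CompiledFilter:
    """Toy model of a compiled `field != ${field}` filter."""

    def __init__(self, field):
        self.field = field
        self.reference = None  # left empty at compile time

    def load_references(self, selected_frame):
        # Copy the values so they stay valid after the selected
        # frame's tree has been freed.
        self.reference = list(selected_frame.get(self.field, []))

    def apply(self, frame):
        values = frame.get(self.field, [])
        if not values or not self.reference:
            return False
        # Assumed semantics: every value differs from every reference value.
        return all(v != r for v in values for r in self.reference)


df = CompiledFilter("ip.src")                 # compile once
df.load_references({"ip.src": ["10.0.0.1"]})  # selected frame
print(df.apply({"ip.src": ["10.0.0.2"]}))
print(df.apply({"ip.src": ["10.0.0.1"]}))
```

The filter is compiled once and only the reference values change when a different frame is selected.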
-
- 28 Mar, 2022 3 commits
-
-
João Valverde authored
-
João Valverde authored
-
This change implements a unary minus operator.

    Filter: tcp.window_size_scalefactor == -tcp.dstport

    Instructions:
    00000 READ_TREE tcp.window_size_scalefactor -> reg#0
    00001 IF_FALSE_GOTO 6
    00002 READ_TREE tcp.dstport -> reg#1
    00003 IF_FALSE_GOTO 6
    00004 MK_MINUS -reg#1 -> reg#2
    00005 ANY_EQ reg#0 == reg#2
    00006 RETURN

It is supported for integer types, floats and relative time values. The unsigned integer types are promoted to a 32-bit signed integer. Unary plus is implemented as a no-op. The plus sign is simply ignored. Constant arithmetic expressions are computed during compilation. Overflow with constants is a compile-time error. Overflow with variables is a run-time error and is silently ignored; only a debug message is printed to the console. Related to #15504.
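The difference between constant and variable overflow described above might look like this in a simplified Python model. The function names are hypothetical, and returning `None` for the run-time case is an assumption about what "silently ignored" means here.

```python
import logging

INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def negate_int32(value):
    """Negate an integer that has been promoted to a 32-bit signed type."""
    result = -value
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowError(f"-({value}) does not fit in a 32-bit signed integer")
    return result


def fold_constant_minus(constant):
    # Compile time: the overflow propagates and aborts compilation.
    return negate_int32(constant)


def eval_minus(register_value):
    # Run time: log a debug message and produce no value.
    try:
        return negate_int32(register_value)
    except OverflowError as err:
        logging.debug("unary minus overflow: %s", err)
        return None


print(fold_constant_minus(80))
print(eval_minus(2**32 - 1))
```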
-
- 22 Mar, 2022 1 commit
-
-
Add support for masking of bits. Previously the bitwise operator could only test bits; it did not support clearing bits. This allows testing whether any combination of bits is set or unset more naturally with a single test. Previously this was only possible by combining several bitwise predicates. Bitwise is implemented as a test node, even though it is not one. Maybe the test node should be renamed to something else. Fixes #17246.
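To see why masking allows a single test for mixed set/unset bits, consider the TCP SYN (0x02) and ACK (0x10) flags. The Python sketch below shows the arithmetic only; the function name is hypothetical and no particular filter syntax is implied.

```python
TCP_SYN = 0x02
TCP_ACK = 0x10


def syn_without_ack(flags):
    # One masked comparison: keep only the SYN and ACK bits, then
    # require that SYN is set and ACK is clear.
    return (flags & (TCP_SYN | TCP_ACK)) == TCP_SYN


for flags in (0x02, 0x12, 0x10):
    print(hex(flags), syn_without_ack(flags))
```

Without masking, the same condition needs two predicates combined with a logical operator: one that tests SYN and one that negates a test on ACK.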
-
- 21 Mar, 2022 3 commits
-
-
-
João Valverde authored
-
João Valverde authored
Replace calls to list append with list prepend where applicable.
-
- 05 Mar, 2022 2 commits
-
-
For an expression starting with a colon (a literal) try to parse the value with and without colon. This avoids excluding some valid representations like the IPv6 address "::1".
-
The syntax for protocols and some literals like numbers and bytes/addresses can be ambiguous. Some protocols can be parsed as a literal, for example the protocol "fc" (Fibre Channel) can be parsed as 0xFC. If a numeric protocol is registered that will also take precedence over any literal, according to the current rules, thereby breaking numerical comparisons to that number. The same goes for a hypothetical protocol named "true", etc.

To allow the user to disambiguate this meaning, introduce new syntax. Any value prefixed with ':' or enclosed in <,> will be treated as a literal value only. The value :fc or <fc> will always mean 0xFC, under any context. Never a protocol whose filter name is "fc".

Likewise any value prefixed with a dot will always be parsed as an identifier (protocol or protocol field) in the language. Never any literal value parsed from the token "fc".

This allows the user to be explicit about the meaning, and between the two explicit methods plus the ambiguous one it doesn't completely break any one meaning.

The difference can be seen in the following two programs:

    Filter: frame == fc

    Constants:

    Instructions:
    00000 READ_TREE frame -> reg#0
    00001 IF-FALSE-GOTO 5
    00002 READ_TREE fc -> reg#1
    00003 IF-FALSE-GOTO 5
    00004 ANY_EQ reg#0 == reg#1
    00005 RETURN

    --------

    Filter: frame == :fc

    Constants:
    00000 PUT_FVALUE fc <FT_PROTOCOL> -> reg#1

    Instructions:
    00000 READ_TREE frame -> reg#0
    00001 IF-FALSE-GOTO 3
    00002 ANY_EQ reg#0 == reg#1
    00003 RETURN

The filter "frame == fc" is the same as "frame == .fc", according to the current heuristic, except the first form will try to parse it as a literal if the name does not correspond to any registered protocol.

By treating a leading dot as a name in the language we necessarily disallow writing floats with a leading dot. We will also disallow writing with an ending dot when using unparsed values. This is a backward incompatibility but has the happy side effect of making the expression {1...2} unambiguous.

This could either mean "1 .. .2" or "1. .. 2". If we require a leading and ending digit then the meaning is clear:

    1.0..0.2 -> 1.0 .. 0.2

Fixes #17731.
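The three spellings can be summarised with a small classifier. This Python sketch only models the prefix rules from the message above; the `classify` function and the `PROTOCOLS` set are hypothetical.

```python
# Hypothetical set of registered protocol filter names.
PROTOCOLS = {"fc", "frame"}


def classify(token):
    """Return ("literal" | "identifier", name) for one filter token."""
    if token.startswith(":"):
        return ("literal", token[1:])        # :fc is always a literal
    if len(token) > 2 and token.startswith("<") and token.endswith(">"):
        return ("literal", token[1:-1])      # <fc> is always a literal
    if token.startswith("."):
        return ("identifier", token[1:])     # .fc is always a name
    # Ambiguous form: a registered name wins, otherwise try a literal.
    if token in PROTOCOLS:
        return ("identifier", token)
    return ("literal", token)


for tok in ("fc", ":fc", "<fc>", ".fc", "fd"):
    print(tok, classify(tok))
```

Only the bare form depends on what happens to be registered; the prefixed forms give the same answer regardless of the protocol table.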
-
- 22 Dec, 2021 1 commit
-
-
To complete the set of equality operators add an "all equal" operator that matches a frame if all fields match the condition. The symbol chosen for "all_eq" is "===".
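The difference between the existing "any" comparison and the new "all" comparison shows up when a field occurs more than once in a frame. A minimal Python sketch follows; the function names are hypothetical, and treating a frame with no occurrences as a non-match is an assumption.

```python
def any_eq(values, target):
    """Match if at least one occurrence of the field equals target."""
    return any(v == target for v in values)


def all_eq(values, target):
    """Match only if every occurrence of the field equals target."""
    # Assumption: a frame without the field does not match.
    return bool(values) and all(v == target for v in values)


# A field with two occurrences in one frame, e.g. source and destination port.
ports = [80, 51234]
print(any_eq(ports, 80))  # True: one occurrence matches
print(all_eq(ports, 80))  # False: the other occurrence does not
```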
-
- 19 Dec, 2021 1 commit
-
-
Replace:

    g_snprintf() -> snprintf()
    g_vsnprintf() -> vsnprintf()
    g_strdup_printf() -> ws_strdup_printf()
    g_strdup_vprintf() -> ws_strdup_vprintf()

This is more portable, user-friendly and faster on platforms where GLib does not like the native I/O. Adjust the format string to use macros from inttypes.h.
-
- 13 Dec, 2021 1 commit
-
-
TEST_EQ and TEST_NE are unused. Replace them with the correct values and add the missing token to the string representations.
-
- 12 Dec, 2021 1 commit
-
-
João Valverde authored
-
- 01 Dec, 2021 1 commit
-
-
João Valverde authored
Store the lval token value instead.
-
- 30 Nov, 2021 1 commit
-
-
João Valverde authored
Instead of requiring a special error function in the parser just set the syntax_error flag if an error occurs, in any stage of compilation. Outside of the parser loop it will not be used but that is fine.
-
- 16 Nov, 2021 1 commit
-
-
João Valverde authored
Add result output to console log, in addition to intermediate debug information. This allows tracing the result using the log only.
-
- 10 Nov, 2021 1 commit
-
-
João Valverde authored
Misc code cleanups. Add some extra stnode functions for increased type safety. Fix a constness issue with df_lval_value().
-
- 06 Nov, 2021 2 commits
-
-
- 31 Oct, 2021 1 commit
-
-
João Valverde authored
-
- 27 Oct, 2021 3 commits
-
-
João Valverde authored
Currently unused. This might still be useful to differentiate different spellings of the same token in user messages, like "==" and "eq", but currently we are not storing test tokens anyway, so just remove it. It makes everything simpler. If it's ever necessary it can be added back.
-
João Valverde authored
Minor change to decouple the AST data structures from the lexical scanner. We pass a structure to allow for some future enhancements.
-
Using a hand-written tokenizer is simpler than using flex start conditions. Do the validation in the drange node constructor. Add validation for malformed ranges with different endpoint signs.
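A hand-written tokenizer for a range can be very small. The Python sketch below parses a simplified `START-END` form and rejects endpoints with different signs; the `parse_range` function and the accepted syntax are assumptions and cover only a fraction of what the real range syntax allows.

```python
def parse_range(text):
    """Parse 'START-END', where each endpoint may carry a leading minus."""
    pos = 0

    def read_int():
        nonlocal pos
        begin = pos
        if pos < len(text) and text[pos] == "-":
            pos += 1  # optional sign
        digits_start = pos
        while pos < len(text) and text[pos].isdigit():
            pos += 1
        if pos == digits_start:
            raise ValueError(f"expected a number at offset {begin}")
        return int(text[begin:pos])

    start = read_int()
    if pos >= len(text) or text[pos] != "-":
        raise ValueError("expected '-' between endpoints")
    pos += 1  # skip the separator
    end = read_int()
    if pos != len(text):
        raise ValueError(f"unexpected input at offset {pos}")
    # Validation step: both endpoints must have the same sign.
    if (start < 0) != (end < 0):
        raise ValueError("range endpoints have different signs")
    return start, end


print(parse_range("1-5"))
print(parse_range("-5--1"))
```

Keeping the sign check separate from the scanning loop mirrors the split in the message: tokenize first, validate when the range is constructed.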
-
- 18 Oct, 2021 2 commits
-
-
-
Should be more obvious that this error is caused by a string syntax error and not something else.
-
- 17 Oct, 2021 1 commit
-
-
Matches is a special case that looks on the RHS and tries to convert every unparsed value to a string, regardless of the LHS type. This is not how types work in the display filter. Require double-quotes to avoid ambiguity, because matches doesn't follow normal Wireshark display filter type rules. It doesn't need nor benefit from the flexibility provided by unparsed strings in the syntax.

For matches the RHS is always a literal string, except if the RHS is also a field name, in which case it complains of an incompatible type. This is confusing. No type can be compatible because no type rules are ever considered. Every unparsed value is a text string, except that if it happens to coincide with a field name it also requires double-quoting or it throws a syntax error, just to be difficult. We could remove this odd quirk but requiring double-quotes for regular expressions is a better, more elegant fix.

Before:

    Filter: tcp matches "udp"

    Constants:
    00000 PUT_PCRE udp -> reg#1

    Instructions:
    00000 READ_TREE tcp -> reg#0
    00001 IF-FALSE-GOTO 3
    00002 ANY_MATCHES reg#0 matches reg#1
    00003 RETURN

    Filter: tcp matches udp

    Constants:
    00000 PUT_PCRE udp -> reg#1

    Instructions:
    00000 READ_TREE tcp -> reg#0
    00001 IF-FALSE-GOTO 3
    00002 ANY_MATCHES reg#0 matches reg#1
    00003 RETURN

    Filter: tcp matches udp.srcport
    dftest: tcp and udp.srcport are not of compatible types.

    Filter: tcp matches udp.srcportt

    Constants:
    00000 PUT_PCRE udp.srcportt -> reg#1

    Instructions:
    00000 READ_TREE tcp -> reg#0
    00001 IF-FALSE-GOTO 3
    00002 ANY_MATCHES reg#0 matches reg#1
    00003 RETURN

After:

    Filter: tcp matches "udp"

    Constants:
    00000 PUT_PCRE udp -> reg#1

    Instructions:
    00000 READ_TREE tcp -> reg#0
    00001 IF-FALSE-GOTO 3
    00002 ANY_MATCHES reg#0 matches reg#1
    00003 RETURN

    Filter: tcp matches udp
    dftest: "udp" was unexpected in this context.

    Filter: tcp matches udp.srcport
    dftest: "udp.srcport" was unexpected in this context.

    Filter: tcp matches udp.srcportt
    dftest: "udp.srcportt" was unexpected in this context.

The error message could still be improved.
-