[PATCH nft] parser_bison: on syntax errors, output expected tokens

Linux Netfilter development
 help / color / mirror / Atom feed

* [PATCH nft] parser_bison: on syntax errors, output expected tokens
@ 2025-12-04 21:54 Jan Kończak
  2025-12-04 22:37 ` Florian Westphal
  2026-01-16 13:07 ` Florian Westphal
  0 siblings, 2 replies; 4+ messages in thread
From: Jan Kończak @ 2025-12-04 21:54 UTC (permalink / raw)
  To: netfilter-devel

Now, on syntax erros, e.g., 'nft create fable filter', the user sees:
   Error: syntax error, unexpected string
   create fable filter
          ^^^^^
The patch builds an error message that lists what the parser expects
to see, in that case it would print:
   Error: syntax error, unexpected string
   expected any of: synproxy, table, chain, set, element, map,
   flowtable, ct, counter, limit, quota, secmark
   create fable filter
          ^^^^^
The obvious purpose of this is to help people who learn nft syntax.
The messages are still not as explanatory as one wishes, for it may
list parser token names such as 'string', but it's still better
than no hints at all.

Heed that the list of possible items on the parser's side is not
always consistent with expectations.
For instance, lexer/parser recognizes 'l4proto' in this command:
   nft add rule ip F I meta l4proto tcp
as a generic '%token <string> STRING', while 'iifname' in 
   nft add rule ip F I meta iifname eth0
is recognized as a '%token IIFNAME'
In such case the parser is only able to say that right after 'meta'
it expects 'iifname' or 'string', rather than 'iifname' and 'l4proto'.

(Notice that the help which is already present in nft is also off,
e.g., 'nft add rule ip F I meta bogus bogus' constructs in meta.c
a list of all possible keywords that does not list all possible
keywords, for 'ibriport' gets recognized, but is not listed there.)


Signed-off-by: Jan Kończak <jan.konczak@cs.put.poznan.pl>
---
 src/parser_bison.y | 68 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 67 insertions(+), 1 deletion(-)

diff --git a/src/parser_bison.y b/src/parser_bison.y
index 3ceef794..55e95028 100644
--- a/src/parser_bison.y
+++ b/src/parser_bison.y
@@ -221,7 +221,8 @@ int nft_lex(void *, void *, void *);
 %parse-param		{ void *scanner }
 %parse-param		{ struct parser_state *state }
 %lex-param		{ scanner }
-%define parse.error verbose
+%define parse.error custom
+%define parse.lac full
 %locations
 
 %initial-action {
@@ -6603,3 +6604,68 @@ exthdr_key		:	HBH	close_scope_hbh	{ $$ = IPPROTO_HOPOPTS; }
 			;
 
 %%
+
+static int
+yyreport_syntax_error(const yypcontext_t *yyctx, struct nft_ctx *nft,
+                      void *scanner, struct parser_state *state)
+{
+	struct location *loc = yypcontext_location(yyctx);
+	const char *badTok = yysymbol_name(yypcontext_token(yyctx));
+
+	char * msg;
+	int len = 1024;
+	int pos;
+	const char * const sep = ", ";
+
+	// get expected tokens
+	int expTokCnt = yypcontext_expected_tokens(yyctx, NULL, 0);
+	yysymbol_kind_t *expTokArr = malloc(sizeof(yysymbol_kind_t) * expTokCnt);
+	if (!expTokArr) return YYENOMEM;
+	yypcontext_expected_tokens(yyctx, expTokArr, expTokCnt);
+
+	// reserve space for the error message
+	msg = malloc(len);
+	if (!msg) { free(expTokArr); return YYENOMEM; }
+
+	// start building up the error message
+	pos = snprintf(msg, len, "syntax error, unexpected %s\n"
+	                         "expected any of: ", badTok);
+
+	// append expected tokens to the error message
+	for (int i = 0; i < expTokCnt; ++i) {
+		yysymbol_kind_t expTokKind = expTokArr[i];
+		const char * expTokName = yysymbol_name(expTokKind);
+
+		// tokens that name generic things shall be printed as <foo>; detect them
+		int isNotAKeyword = 0;
+		switch( expTokKind ){
+			case YYSYMBOL_NUM:      case YYSYMBOL_QUOTED_STRING:
+			case YYSYMBOL_STRING:   case YYSYMBOL_ASTERISK_STRING:
+				isNotAKeyword = 1;
+			default:
+		}
+
+		if ((size_t)len-pos-1 < strlen(expTokName)+strlen(sep)+isNotAKeyword*2) {
+			// need more space for the error message to fit all expected tokens
+			char * newMsg;
+			len += 1024;
+			newMsg = realloc(msg, len);
+			if (!newMsg) { free(msg); free(expTokArr); return YYENOMEM; }
+			msg = newMsg;
+		}
+
+		pos += snprintf(msg+pos, len-pos, "%s%s%s%s",
+		                                  (i ? sep : ""),
+		                                  (isNotAKeyword ? "<" : ""),
+		                                  expTokName,
+		                                  (isNotAKeyword ? ">" : ""));
+	}
+	free(expTokArr);
+	// notice no newline on the end of the error message; this is intended
+
+	// call the standard error function
+	yyerror(loc, nft, scanner, state, msg);
+
+	free(msg);
+	return 0;
+}
--



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH nft] parser_bison: on syntax errors, output expected tokens
  2025-12-04 21:54 [PATCH nft] parser_bison: on syntax errors, output expected tokens Jan Kończak
@ 2025-12-04 22:37 ` Florian Westphal
  2025-12-13 12:32   ` Jan Kończak
  2026-01-16 13:07 ` Florian Westphal
  1 sibling, 1 reply; 4+ messages in thread
From: Florian Westphal @ 2025-12-04 22:37 UTC (permalink / raw)
  To: Jan Kończak; +Cc: netfilter-devel

Jan Kończak <jan.konczak@cs.put.poznan.pl> wrote:
> Now, on syntax erros, e.g., 'nft create fable filter', the user sees:
>    Error: syntax error, unexpected string
>    create fable filter
>           ^^^^^
> The patch builds an error message that lists what the parser expects
> to see, in that case it would print:
>    Error: syntax error, unexpected string
>    expected any of: synproxy, table, chain, set, element, map,
>    flowtable, ct, counter, limit, quota, secmark
>    create fable filter
>           ^^^^^

Thanks, this looks great.
Just a note that there might be slight delay with this getting applied
because we'd like to make a new release soon.

> Heed that the list of possible items on the parser's side is not
> always consistent with expectations.
> For instance, lexer/parser recognizes 'l4proto' in this command:
>    nft add rule ip F I meta l4proto tcp
> as a generic '%token <string> STRING', while 'iifname' in 
>    nft add rule ip F I meta iifname eth0
> is recognized as a '%token IIFNAME'
> In such case the parser is only able to say that right after 'meta'
> it expects 'iifname' or 'string', rather than 'iifname' and 'l4proto'.

Yes, thats fine.

We should move this to flex keywords, this STRING hack dates back to
the pre-flex-start-conditions era where adding new meta keywords might
have broken existing rulesets.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH nft] parser_bison: on syntax errors, output expected tokens
  2025-12-04 22:37 ` Florian Westphal
@ 2025-12-13 12:32   ` Jan Kończak
  0 siblings, 0 replies; 4+ messages in thread
From: Jan Kończak @ 2025-12-13 12:32 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

> Just a note that there might be slight delay with this getting
> applied because we'd like to make a new release soon.

Regarding the patch to dump all expected tokens on syntax errors in
'nft' command: should I modify it somehow so that it gets applied?
Or just wait, or you decided it's not a good idea after all?

In the meantime, I toyed with what parser expects as the next token,
and I see surprises here and there.
For example, is it intended that 'add' and 'rule' keywords are
optional? Right now, to add a rule, it suffices:
   nft F I tcp dport ssh accept

Plus, bogon hints from parser occur on parsing expressions such as
   nft add rule ip F I ct state ?
   nft add rule ip F I tcp dport ?
Because all these are parsed by the same subset of grammar rules
(I guess starting at 'relational_expr'), the possible tokens are
the same for any such expr.
By "bogon hints" I mean that commands get parsed successfully but
raise an error later on. E.g., 'nft F I ct state missing' yields
datatype mismatch claiming that missing is a boolean, not a state.

Interestingly, by "abusing" the junk token which always triggers
a syntax error it is possible to create a bash autocompletion script
that tries to execute nft command typed in so far but appended with
junk, and parses expected tokens into completions. I feel ambivalent
if it makes sense to build autocompletion this way, but it speeds up
checking what the parser expects.

Simple autocompletion (obviously requiring the patch) follows.
----------------------------
_nft(){
    expectedTokens=$(
        "${COMP_WORDS[@]:0:$COMP_CWORD}" $'\025' 2>&1 \
        | grep -A1 "unexpected junk"                   \
        | grep '^expected any of:'                      \
        | cut -d: -f2-                                   \
        | sed 's/, /\n/g'
    )
    [ "$expectedTokens" ] || return 1;
    EXPECTED=()   NONKEYWORD=()
    while read token; do
        [[ $token == '<'*'>'       ]] && { NONKEYWORD+=("$token"); continue; }
        [[ $token == "end of file" ]] && continue
        [[ $token == "newline"     ]] && continue 
        [[ $token == "colon"       ]] && { EXPECTED+=(':');   continue; }
        [[ $token == "semicolon"   ]] && { EXPECTED+=('\\;'); continue; }
        [[ $token == "comma"       ]] && { EXPECTED+=(',');   continue; }
        [[ $token == *[[:space:]]* ]] && { printf "\e[s\nautocompletion problem: space in token \"$token\"\n\e[u" 1>&2; continue; }
        EXPECTED+=("$token")
    done <<< "$expectedTokens"
    COMPREPLY=( $(compgen -W "${EXPECTED[*]}" -- ${COMP_WORDS[$COMP_CWORD]}) )
    if [[ $NONKEYWORD ]]; then
        # TODO: logic for values (non-keyword tokens); either call some
        # 'nft list …' to detect what shall be proposed here and add it to
        # COMPREPLY, or adjust COMPREPLY if the value is a new name / address
        # / port etc. that cannot be completed / guessed.

        if [[ ${COMP_LINE:$COMP_POINT-1:1} == [[:space:]] ]] ; then
            # append non-keyword placeholders after a whitespace character to
            # make them appear among completions, but be non-autocompletable
            for VAL in "${NONKEYWORD[@]}"; do
                COMPREPLY+=(" $VAL")
            done;
            # if there is only one possibility, then add an empty alternative
            # to prevent autocompleting the name of the non-keyword itself
            # (e.g. 'nft delete rule ip F T handle' expects only <number>)
            [[ ${#NONKEYWORD[@]} == 1 && ${#COMPREPLY[@]} == 1 ]] && COMPREPLY+=("")
        fi
    fi
    return 0;
}
complete -F _nft nft
----------------------------

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH nft] parser_bison: on syntax errors, output expected tokens
  2025-12-04 21:54 [PATCH nft] parser_bison: on syntax errors, output expected tokens Jan Kończak
  2025-12-04 22:37 ` Florian Westphal
@ 2026-01-16 13:07 ` Florian Westphal
  1 sibling, 0 replies; 4+ messages in thread
From: Florian Westphal @ 2026-01-16 13:07 UTC (permalink / raw)
  To: Jan Kończak; +Cc: netfilter-devel

Jan Kończak <jan.konczak@cs.put.poznan.pl> wrote:
> +static int
> +yyreport_syntax_error(const yypcontext_t *yyctx, struct nft_ctx *nft,
> +                      void *scanner, struct parser_state *state)
> +{
> +	struct location *loc = yypcontext_location(yyctx);
> +	const char *badTok = yysymbol_name(yypcontext_token(yyctx));
> +
> +	char * msg;
> +	int len = 1024;
> +	int pos;
> +	const char * const sep = ", ";
> +
> +	// get expected tokens
> +	int expTokCnt = yypcontext_expected_tokens(yyctx, NULL, 0);

No need for these comments, the function name below is verbose enough.

> +	yysymbol_kind_t *expTokArr = malloc(sizeof(yysymbol_kind_t) * expTokCnt);

You could use xmalloc_array() instead.

> +	if (!expTokArr) return YYENOMEM;
> +	yypcontext_expected_tokens(yyctx, expTokArr, expTokCnt);
> +
> +	// reserve space for the error message
> +	msg = malloc(len);

You can use xmalloc() here, like other parts in the parser already do.

> +	if (!msg) { free(expTokArr); return YYENOMEM; }

... and then this can be removed.

> +	// start building up the error message
> +	pos = snprintf(msg, len, "syntax error, unexpected %s\n"
> +	                         "expected any of: ", badTok);

I think it would be easier to switch this to fprintf, with
FILE *err = open_memstream(&msg,  ...).

> +	// append expected tokens to the error message
> +	for (int i = 0; i < expTokCnt; ++i) {
> +		yysymbol_kind_t expTokKind = expTokArr[i];
> +		const char * expTokName = yysymbol_name(expTokKind);
> +
> +		// tokens that name generic things shall be printed as <foo>; detect them
> +		int isNotAKeyword = 0;
> +		switch( expTokKind ){
> +			case YYSYMBOL_NUM:      case YYSYMBOL_QUOTED_STRING:
> +			case YYSYMBOL_STRING:   case YYSYMBOL_ASTERISK_STRING:
> +				isNotAKeyword = 1;
> +			default:
> +		}
> +
> +		if ((size_t)len-pos-1 < strlen(expTokName)+strlen(sep)+isNotAKeyword*2) {
> +			// need more space for the error message to fit all expected tokens
> +			char * newMsg;
> +			len += 1024;
> +			newMsg = realloc(msg, len);
> +			if (!newMsg) { free(msg); free(expTokArr); return YYENOMEM; }
> +			msg = newMsg;

... it would allow to remove these checks; libc would take care of
reallocting the memory buffer.

> +		pos += snprintf(msg+pos, len-pos, "%s%s%s%s",

and no more need to advance the target buffer and manual fiddling with
allowed length.

Other than these nits I think this is ready to get merged.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-01-16 13:07 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-04 21:54 [PATCH nft] parser_bison: on syntax errors, output expected tokens Jan Kończak
2025-12-04 22:37 ` Florian Westphal
2025-12-13 12:32   ` Jan Kończak
2026-01-16 13:07 ` Florian Westphal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox