All of lore.kernel.org
 help / color / mirror / Atom feed
From: Markus Armbruster <armbru@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [PATCH 4/6] json-streamer: remove token queue
Date: Mon, 29 Jun 2026 15:02:52 +0200	[thread overview]
Message-ID: <878q7xts77.fsf@pond.sub.org> (raw)
In-Reply-To: <20260626101727.1727389-5-pbonzini@redhat.com> (Paolo Bonzini's message of "Fri, 26 Jun 2026 12:17:24 +0200")

Paolo Bonzini <pbonzini@redhat.com> writes:

> Now fully exploit the push parser, feeding it one token at a time
> without having to wait until braces and brackets are balanced.
>
> While the nesting counts are retained for error recovery purposes,
> the system can now report the first parsing error without waiting
> for parentheses to be balanced.  This also means that JSON_ERROR
> can be handled in json-parser.c, not json-streamer.c.
>
> After reporting the error, json-streamer.c then enters an error recovery
> mode where subsequent errors are suppressed.  This mimics the previous
> error reporting behavior, but it provides prompt feedback on parsing
> errors.  As an example, here is an example interaction with qemu-ga.
>
> BEFORE (error reported only once braces are balanced):
>
>    >> {"execute":foo
>    >> }
>    << {"error": {"class": "GenericError", "desc": "JSON parse error, invalid keyword 'foo'"}}
>    >> {"execute":"somecommand"}
>    << {"error": {"class": "CommandNotFound", "desc": "The command somecommand has not been found"}}
>
> AFTER (error reported immediately, but similar error recovery as before):
>
>    >> {"execute":foo
>    << {"error": {"class": "GenericError", "desc": "JSON parse error, invalid keyword 'foo'"}}
>    >> }
>    >> {"execute":"somecommand"}
>    << {"error": {"class": "CommandNotFound", "desc": "The command somecommand has not been found"}}
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  include/qobject/json-parser.h |   3 +-
>  qobject/json-parser.c         |   4 ++
>  qobject/json-streamer.c       | 106 +++++++++++++---------------------
>  3 files changed, 47 insertions(+), 66 deletions(-)
>
> diff --git a/include/qobject/json-parser.h b/include/qobject/json-parser.h
> index 0cf6932ecdc..3479e637588 100644
> --- a/include/qobject/json-parser.h
> +++ b/include/qobject/json-parser.h
> @@ -33,7 +33,8 @@ typedef struct JSONMessageParser {
>      JSONParserContext parser;
>      unsigned int brace_count;
>      unsigned int bracket_count;
> -    GQueue tokens;
> +    unsigned int token_count;
> +    bool error;
>      uint64_t token_size;
>  } JSONMessageParser;
>  
> diff --git a/qobject/json-parser.c b/qobject/json-parser.c
> index 845da3699aa..484956deae4 100644
> --- a/qobject/json-parser.c
> +++ b/qobject/json-parser.c
> @@ -673,6 +673,10 @@ QObject *json_parser_feed(JSONParserContext *ctxt, const JSONToken *token,
>  
>      assert(!ctxt->err);
>      switch (token->type) {
> +    case JSON_ERROR:
> +        parse_error(ctxt, token, "stray '%s'", token->str);
> +        break;
> +
>      case JSON_END_OF_INPUT:
>          /* Check for premature end of input */
>          if (!g_queue_is_empty(ctxt->stack)) {
> diff --git a/qobject/json-streamer.c b/qobject/json-streamer.c
> index 9e1f650bad8..9526f815f00 100644
> --- a/qobject/json-streamer.c
> +++ b/qobject/json-streamer.c
> @@ -1,5 +1,5 @@
>  /*
> - * JSON streaming support
> + * JSON parser - callback interface and error recovery
>   *
>   * Copyright IBM, Corp. 2009
>   *
> @@ -19,23 +19,16 @@
>  #define MAX_TOKEN_COUNT (2ULL << 20)
>  #define MAX_NESTING (1 << 10)
>  
> -static void json_message_free_tokens(JSONMessageParser *parser)
> -{
> -    JSONToken *token;
> -
> -    while ((token = g_queue_pop_head(&parser->tokens))) {
> -        g_free(token);
> -    }
> -}
> -
>  void json_message_process_token(JSONLexer *lexer, GString *input,
>                                  JSONTokenType type, int x, int y)
>  {
>      JSONMessageParser *parser = container_of(lexer, JSONMessageParser, lexer);
> -    QObject *json = NULL;
>      Error *err = NULL;
> -    JSONToken *token;
>  
> +    parser->token_size += input->len;
> +    parser->token_count++;
> +
> +    /* Detect message boundaries for error recovery purposes.  */
>      switch (type) {
>      case JSON_LCURLY:
>          parser->brace_count++;
> @@ -56,19 +49,9 @@ void json_message_process_token(JSONLexer *lexer, GString *input,
>          parser->bracket_count--;
>          break;
>      case JSON_ERROR:
> -        error_setg(&err, "JSON parse error, stray '%s'", input->str);
> -        goto out_emit;
> -    case JSON_END_OF_INPUT:
> -        /*
> -         * Force the parentheses to appear balanced and the queue
> -         * to be emptied, causing a parse error if it wasn't.
> -         */
> -        if (g_queue_is_empty(&parser->tokens)) {
> -            return;
> -        }
>      end_error_recovery:
>          /*
> -         * We goto here due to receiving either JSON_ERROR or a
> +         * We come here due to receiving either JSON_ERROR or a

Line was added in the previous commit.  Squash the change into it?

>           * JSON_R{CURLY,SQUARE}) that is known to be unbalanced.
>           * If in error recovery, end it immediately.  If not in
>           * error recovery, json_parser_feed() will raise an error
> @@ -81,49 +64,43 @@ void json_message_process_token(JSONLexer *lexer, GString *input,
>          break;
>      }
>  
> -    /*
> -     * Security consideration, we limit total memory allocated per object
> -     * and the maximum recursion depth that a message can force.
> -     */
> -    if (parser->token_size + input->len + 1 > MAX_TOKEN_SIZE) {

Left operand of > is unincremented token_size plus increment plus 1.

> -        error_setg(&err, "JSON token size limit exceeded");
> -        goto out_emit;
> -    }
> -    if (g_queue_get_length(&parser->tokens) + 1 > MAX_TOKEN_COUNT) {
> -        error_setg(&err, "JSON token count limit exceeded");
> -        goto out_emit;
> -    }
> -    if (parser->bracket_count + parser->brace_count > MAX_NESTING) {
> -        error_setg(&err, "JSON nesting depth limit exceeded");
> -        goto out_emit;
> -    }
> +    if (parser->error) {
> +        /* error recovery, eat tokens until parentheses balance */
> +    } else {
> +        /*
> +         * Safety consideration, we limit total memory allocated per object
> +         * and the maximum nesting depth that a message can force.
> +         */
> +        if (parser->token_size > MAX_TOKEN_SIZE) {

Left operand of > is incremented token size.

I believe this is one less than before the patch.  Testing...  yes:

    -blockdev '{"a":"01234567890123456789012345678901234567890123456789012345"}'

with MAX_TOKEN_SIZE hacked to 64: is rejected before the series, and
accepted afterwards.

Obvious fix: change > to >=.

If you'd prefer not to change the code, mention the change in the commit
message.

> +            error_setg(&err, "JSON token size limit exceeded");
> +        } else if (parser->token_count > MAX_TOKEN_COUNT) {
> +            error_setg(&err, "JSON token count limit exceeded");
> +        } else if (parser->bracket_count + parser->brace_count > MAX_NESTING) {
> +            error_setg(&err, "JSON nesting depth limit exceeded");
> +        } else {
> +            g_autofree JSONToken *token = json_token(type, x, y, input);
> +            QObject *json = json_parser_feed(&parser->parser, token, &err);
> +            if (json) {
> +                parser->emit(parser->opaque, json, NULL);
> +            }
> +        }
>  
> -    token = json_token(type, x, y, input);
> -    parser->token_size += input->len;
> -
> -    g_queue_push_tail(&parser->tokens, token);
> -
> -    if (parser->brace_count > 0 || parser->bracket_count > 0) {
> -        return;
> -    }
> -
> -    /* Process all tokens in the queue */
> -    while (!g_queue_is_empty(&parser->tokens)) {
> -        token = g_queue_pop_head(&parser->tokens);
> -        json = json_parser_feed(&parser->parser, token, &err);
> -        g_free(token);
> -        if (json || err) {
> -            break;
> +        if (err) {
> +            parser->emit(parser->opaque, NULL, err);
> +            /* start recovery */
> +            parser->error = true;
>          }
>      }
>  
> -out_emit:
> -    json_parser_reset(&parser->parser);
> -    parser->brace_count = 0;
> -    parser->bracket_count = 0;
> -    json_message_free_tokens(parser);
> -    parser->token_size = 0;
> -    parser->emit(parser->opaque, json, err);
> +    if ((parser->brace_count == 0 && parser->bracket_count == 0)
> +        || type == JSON_END_OF_INPUT) {
> +        json_parser_reset(&parser->parser);
> +        parser->error = false;
> +        parser->brace_count = 0;
> +        parser->bracket_count = 0;
> +        parser->token_count = 0;
> +        parser->token_size = 0;
> +    }
>  }
>  
>  void json_message_parser_init(JSONMessageParser *parser,
> @@ -133,9 +110,10 @@ void json_message_parser_init(JSONMessageParser *parser,
>  {
>      parser->emit = emit;
>      parser->opaque = opaque;
> +    parser->error = false;
>      parser->brace_count = 0;
>      parser->bracket_count = 0;
> -    g_queue_init(&parser->tokens);
> +    parser->token_count = 0;
>      parser->token_size = 0;
>  
>      json_parser_init(&parser->parser, ap);
> @@ -151,12 +129,10 @@ void json_message_parser_feed(JSONMessageParser *parser,
>  void json_message_parser_flush(JSONMessageParser *parser)
>  {
>      json_lexer_flush(&parser->lexer);
> -    assert(g_queue_is_empty(&parser->tokens));
>  }
>  
>  void json_message_parser_destroy(JSONMessageParser *parser)
>  {
>      json_lexer_destroy(&parser->lexer);
> -    json_message_free_tokens(parser);
>      json_parser_destroy(&parser->parser);
>  }



  reply	other threads:[~2026-06-29 13:03 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-26 10:17 [PATCH v4 0/6] qobject: switch JSON parser to push Paolo Bonzini
2026-06-26 10:17 ` [PATCH 1/6] json-parser: replace with a push parser Paolo Bonzini
2026-06-29 13:02   ` Markus Armbruster
2026-06-26 10:17 ` [PATCH 2/6] json-streamer: reuse parser Paolo Bonzini
2026-06-26 13:02   ` Philippe Mathieu-Daudé
2026-06-26 10:17 ` [PATCH 3/6] json-streamer: make brace/bracket count unsigned Paolo Bonzini
2026-06-26 10:17 ` [PATCH 4/6] json-streamer: remove token queue Paolo Bonzini
2026-06-29 13:02   ` Markus Armbruster [this message]
2026-06-26 10:17 ` [PATCH 5/6] json-streamer: do not heap-allocate JSONToken Paolo Bonzini
2026-06-26 10:17 ` [PATCH 6/6] json-parser: add location to JSON parsing errors Paolo Bonzini
2026-06-29 13:03 ` [PATCH v4 0/6] qobject: switch JSON parser to push Markus Armbruster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878q7xts77.fsf@pond.sub.org \
    --to=armbru@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.