QEMU-Devel Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Markus Armbruster <armbru@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [PATCH 4/6] json-streamer: remove token queue
Date: Mon, 29 Jun 2026 15:02:52 +0200	[thread overview]
Message-ID: <878q7xts77.fsf@pond.sub.org> (raw)
In-Reply-To: <20260626101727.1727389-5-pbonzini@redhat.com> (Paolo Bonzini's message of "Fri, 26 Jun 2026 12:17:24 +0200")

Paolo Bonzini <pbonzini@redhat.com> writes:

> Now fully exploit the push parser, feeding it one token at a time
> without having to wait until braces and brackets are balanced.
>
> While the nesting counts are retained for error recovery purposes,
> the system can now report the first parsing error without waiting
> for parentheses to be balanced.  This also means that JSON_ERROR
> can be handled in json-parser.c, not json-streamer.c.
>
> After reporting the error, json-streamer.c then enters an error recovery
> mode where subsequent errors are suppressed.  This mimics the previous
> error reporting behavior, but it provides prompt feedback on parsing
> errors.  As an example, here is an example interaction with qemu-ga.
>
> BEFORE (error reported only once braces are balanced):
>
>    >> {"execute":foo
>    >> }
>    << {"error": {"class": "GenericError", "desc": "JSON parse error, invalid keyword 'foo'"}}
>    >> {"execute":"somecommand"}
>    << {"error": {"class": "CommandNotFound", "desc": "The command somecommand has not been found"}}
>
> AFTER (error reported immediately, but similar error recovery as before):
>
>    >> {"execute":foo
>    << {"error": {"class": "GenericError", "desc": "JSON parse error, invalid keyword 'foo'"}}
>    >> }
>    >> {"execute":"somecommand"}
>    << {"error": {"class": "CommandNotFound", "desc": "The command somecommand has not been found"}}
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  include/qobject/json-parser.h |   3 +-
>  qobject/json-parser.c         |   4 ++
>  qobject/json-streamer.c       | 106 +++++++++++++---------------------
>  3 files changed, 47 insertions(+), 66 deletions(-)
>
> diff --git a/include/qobject/json-parser.h b/include/qobject/json-parser.h
> index 0cf6932ecdc..3479e637588 100644
> --- a/include/qobject/json-parser.h
> +++ b/include/qobject/json-parser.h
> @@ -33,7 +33,8 @@ typedef struct JSONMessageParser {
>      JSONParserContext parser;
>      unsigned int brace_count;
>      unsigned int bracket_count;
> -    GQueue tokens;
> +    unsigned int token_count;
> +    bool error;
>      uint64_t token_size;
>  } JSONMessageParser;
>  
> diff --git a/qobject/json-parser.c b/qobject/json-parser.c
> index 845da3699aa..484956deae4 100644
> --- a/qobject/json-parser.c
> +++ b/qobject/json-parser.c
> @@ -673,6 +673,10 @@ QObject *json_parser_feed(JSONParserContext *ctxt, const JSONToken *token,
>  
>      assert(!ctxt->err);
>      switch (token->type) {
> +    case JSON_ERROR:
> +        parse_error(ctxt, token, "stray '%s'", token->str);
> +        break;
> +
>      case JSON_END_OF_INPUT:
>          /* Check for premature end of input */
>          if (!g_queue_is_empty(ctxt->stack)) {
> diff --git a/qobject/json-streamer.c b/qobject/json-streamer.c
> index 9e1f650bad8..9526f815f00 100644
> --- a/qobject/json-streamer.c
> +++ b/qobject/json-streamer.c
> @@ -1,5 +1,5 @@
>  /*
> - * JSON streaming support
> + * JSON parser - callback interface and error recovery
>   *
>   * Copyright IBM, Corp. 2009
>   *
> @@ -19,23 +19,16 @@
>  #define MAX_TOKEN_COUNT (2ULL << 20)
>  #define MAX_NESTING (1 << 10)
>  
> -static void json_message_free_tokens(JSONMessageParser *parser)
> -{
> -    JSONToken *token;
> -
> -    while ((token = g_queue_pop_head(&parser->tokens))) {
> -        g_free(token);
> -    }
> -}
> -
>  void json_message_process_token(JSONLexer *lexer, GString *input,
>                                  JSONTokenType type, int x, int y)
>  {
>      JSONMessageParser *parser = container_of(lexer, JSONMessageParser, lexer);
> -    QObject *json = NULL;
>      Error *err = NULL;
> -    JSONToken *token;
>  
> +    parser->token_size += input->len;
> +    parser->token_count++;
> +
> +    /* Detect message boundaries for error recovery purposes.  */
>      switch (type) {
>      case JSON_LCURLY:
>          parser->brace_count++;
> @@ -56,19 +49,9 @@ void json_message_process_token(JSONLexer *lexer, GString *input,
>          parser->bracket_count--;
>          break;
>      case JSON_ERROR:
> -        error_setg(&err, "JSON parse error, stray '%s'", input->str);
> -        goto out_emit;
> -    case JSON_END_OF_INPUT:
> -        /*
> -         * Force the parentheses to appear balanced and the queue
> -         * to be emptied, causing a parse error if it wasn't.
> -         */
> -        if (g_queue_is_empty(&parser->tokens)) {
> -            return;
> -        }
>      end_error_recovery:
>          /*
> -         * We goto here due to receiving either JSON_ERROR or a
> +         * We come here due to receiving either JSON_ERROR or a

Line was added in the previous commit.  Squash the change into it?

>           * JSON_R{CURLY,SQUARE}) that is known to be unbalanced.
>           * If in error recovery, end it immediately.  If not in
>           * error recovery, json_parser_feed() will raise an error
> @@ -81,49 +64,43 @@ void json_message_process_token(JSONLexer *lexer, GString *input,
>          break;
>      }
>  
> -    /*
> -     * Security consideration, we limit total memory allocated per object
> -     * and the maximum recursion depth that a message can force.
> -     */
> -    if (parser->token_size + input->len + 1 > MAX_TOKEN_SIZE) {

Left operand of > is unincremented token_size plus increment plus 1.

> -        error_setg(&err, "JSON token size limit exceeded");
> -        goto out_emit;
> -    }
> -    if (g_queue_get_length(&parser->tokens) + 1 > MAX_TOKEN_COUNT) {
> -        error_setg(&err, "JSON token count limit exceeded");
> -        goto out_emit;
> -    }
> -    if (parser->bracket_count + parser->brace_count > MAX_NESTING) {
> -        error_setg(&err, "JSON nesting depth limit exceeded");
> -        goto out_emit;
> -    }
> +    if (parser->error) {
> +        /* error recovery, eat tokens until parentheses balance */
> +    } else {
> +        /*
> +         * Safety consideration, we limit total memory allocated per object
> +         * and the maximum nesting depth that a message can force.
> +         */
> +        if (parser->token_size > MAX_TOKEN_SIZE) {

Left operand of > is incremented token size.

I believe this is one less than before the patch.  Testing...  yes:

    -blockdev '{"a":"01234567890123456789012345678901234567890123456789012345"}'

with MAX_TOKEN_SIZE hacked to 64: is rejected before the series, and
accepted afterwards.

Obvious fix: change > to >=.

If you'd prefer not to change the code, mention the change in the commit
message.

> +            error_setg(&err, "JSON token size limit exceeded");
> +        } else if (parser->token_count > MAX_TOKEN_COUNT) {
> +            error_setg(&err, "JSON token count limit exceeded");
> +        } else if (parser->bracket_count + parser->brace_count > MAX_NESTING) {
> +            error_setg(&err, "JSON nesting depth limit exceeded");
> +        } else {
> +            g_autofree JSONToken *token = json_token(type, x, y, input);
> +            QObject *json = json_parser_feed(&parser->parser, token, &err);
> +            if (json) {
> +                parser->emit(parser->opaque, json, NULL);
> +            }
> +        }
>  
> -    token = json_token(type, x, y, input);
> -    parser->token_size += input->len;
> -
> -    g_queue_push_tail(&parser->tokens, token);
> -
> -    if (parser->brace_count > 0 || parser->bracket_count > 0) {
> -        return;
> -    }
> -
> -    /* Process all tokens in the queue */
> -    while (!g_queue_is_empty(&parser->tokens)) {
> -        token = g_queue_pop_head(&parser->tokens);
> -        json = json_parser_feed(&parser->parser, token, &err);
> -        g_free(token);
> -        if (json || err) {
> -            break;
> +        if (err) {
> +            parser->emit(parser->opaque, NULL, err);
> +            /* start recovery */
> +            parser->error = true;
>          }
>      }
>  
> -out_emit:
> -    json_parser_reset(&parser->parser);
> -    parser->brace_count = 0;
> -    parser->bracket_count = 0;
> -    json_message_free_tokens(parser);
> -    parser->token_size = 0;
> -    parser->emit(parser->opaque, json, err);
> +    if ((parser->brace_count == 0 && parser->bracket_count == 0)
> +        || type == JSON_END_OF_INPUT) {
> +        json_parser_reset(&parser->parser);
> +        parser->error = false;
> +        parser->brace_count = 0;
> +        parser->bracket_count = 0;
> +        parser->token_count = 0;
> +        parser->token_size = 0;
> +    }
>  }
>  
>  void json_message_parser_init(JSONMessageParser *parser,
> @@ -133,9 +110,10 @@ void json_message_parser_init(JSONMessageParser *parser,
>  {
>      parser->emit = emit;
>      parser->opaque = opaque;
> +    parser->error = false;
>      parser->brace_count = 0;
>      parser->bracket_count = 0;
> -    g_queue_init(&parser->tokens);
> +    parser->token_count = 0;
>      parser->token_size = 0;
>  
>      json_parser_init(&parser->parser, ap);
> @@ -151,12 +129,10 @@ void json_message_parser_feed(JSONMessageParser *parser,
>  void json_message_parser_flush(JSONMessageParser *parser)
>  {
>      json_lexer_flush(&parser->lexer);
> -    assert(g_queue_is_empty(&parser->tokens));
>  }
>  
>  void json_message_parser_destroy(JSONMessageParser *parser)
>  {
>      json_lexer_destroy(&parser->lexer);
> -    json_message_free_tokens(parser);
>      json_parser_destroy(&parser->parser);
>  }



  reply	other threads:[~2026-06-29 13:03 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-26 10:17 [PATCH v4 0/6] qobject: switch JSON parser to push Paolo Bonzini
2026-06-26 10:17 ` [PATCH 1/6] json-parser: replace with a push parser Paolo Bonzini
2026-06-29 13:02   ` Markus Armbruster
2026-06-26 10:17 ` [PATCH 2/6] json-streamer: reuse parser Paolo Bonzini
2026-06-26 13:02   ` Philippe Mathieu-Daudé
2026-06-26 10:17 ` [PATCH 3/6] json-streamer: make brace/bracket count unsigned Paolo Bonzini
2026-06-26 10:17 ` [PATCH 4/6] json-streamer: remove token queue Paolo Bonzini
2026-06-29 13:02   ` Markus Armbruster [this message]
2026-06-26 10:17 ` [PATCH 5/6] json-streamer: do not heap-allocate JSONToken Paolo Bonzini
2026-06-26 10:17 ` [PATCH 6/6] json-parser: add location to JSON parsing errors Paolo Bonzini
2026-06-29 13:03 ` [PATCH v4 0/6] qobject: switch JSON parser to push Markus Armbruster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878q7xts77.fsf@pond.sub.org \
    --to=armbru@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox