From: Markus Armbruster <armbru@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [PATCH 4/6] json-streamer: remove token queue
Date: Mon, 29 Jun 2026 15:02:52 +0200 [thread overview]
Message-ID: <878q7xts77.fsf@pond.sub.org> (raw)
In-Reply-To: <20260626101727.1727389-5-pbonzini@redhat.com> (Paolo Bonzini's message of "Fri, 26 Jun 2026 12:17:24 +0200")
Paolo Bonzini <pbonzini@redhat.com> writes:
> Now fully exploit the push parser, feeding it one token at a time
> without having to wait until braces and brackets are balanced.
>
> While the nesting counts are retained for error recovery purposes,
> the system can now report the first parsing error without waiting
> for parentheses to be balanced. This also means that JSON_ERROR
> can be handled in json-parser.c, not json-streamer.c.
>
> After reporting the error, json-streamer.c then enters an error recovery
> mode where subsequent errors are suppressed. This mimics the previous
> error reporting behavior, but it provides prompt feedback on parsing
> errors. As an example, here is an example interaction with qemu-ga.
>
> BEFORE (error reported only once braces are balanced):
>
> >> {"execute":foo
> >> }
> << {"error": {"class": "GenericError", "desc": "JSON parse error, invalid keyword 'foo'"}}
> >> {"execute":"somecommand"}
> << {"error": {"class": "CommandNotFound", "desc": "The command somecommand has not been found"}}
>
> AFTER (error reported immediately, but similar error recovery as before):
>
> >> {"execute":foo
> << {"error": {"class": "GenericError", "desc": "JSON parse error, invalid keyword 'foo'"}}
> >> }
> >> {"execute":"somecommand"}
> << {"error": {"class": "CommandNotFound", "desc": "The command somecommand has not been found"}}
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> include/qobject/json-parser.h | 3 +-
> qobject/json-parser.c | 4 ++
> qobject/json-streamer.c | 106 +++++++++++++---------------------
> 3 files changed, 47 insertions(+), 66 deletions(-)
>
> diff --git a/include/qobject/json-parser.h b/include/qobject/json-parser.h
> index 0cf6932ecdc..3479e637588 100644
> --- a/include/qobject/json-parser.h
> +++ b/include/qobject/json-parser.h
> @@ -33,7 +33,8 @@ typedef struct JSONMessageParser {
> JSONParserContext parser;
> unsigned int brace_count;
> unsigned int bracket_count;
> - GQueue tokens;
> + unsigned int token_count;
> + bool error;
> uint64_t token_size;
> } JSONMessageParser;
>
> diff --git a/qobject/json-parser.c b/qobject/json-parser.c
> index 845da3699aa..484956deae4 100644
> --- a/qobject/json-parser.c
> +++ b/qobject/json-parser.c
> @@ -673,6 +673,10 @@ QObject *json_parser_feed(JSONParserContext *ctxt, const JSONToken *token,
>
> assert(!ctxt->err);
> switch (token->type) {
> + case JSON_ERROR:
> + parse_error(ctxt, token, "stray '%s'", token->str);
> + break;
> +
> case JSON_END_OF_INPUT:
> /* Check for premature end of input */
> if (!g_queue_is_empty(ctxt->stack)) {
> diff --git a/qobject/json-streamer.c b/qobject/json-streamer.c
> index 9e1f650bad8..9526f815f00 100644
> --- a/qobject/json-streamer.c
> +++ b/qobject/json-streamer.c
> @@ -1,5 +1,5 @@
> /*
> - * JSON streaming support
> + * JSON parser - callback interface and error recovery
> *
> * Copyright IBM, Corp. 2009
> *
> @@ -19,23 +19,16 @@
> #define MAX_TOKEN_COUNT (2ULL << 20)
> #define MAX_NESTING (1 << 10)
>
> -static void json_message_free_tokens(JSONMessageParser *parser)
> -{
> - JSONToken *token;
> -
> - while ((token = g_queue_pop_head(&parser->tokens))) {
> - g_free(token);
> - }
> -}
> -
> void json_message_process_token(JSONLexer *lexer, GString *input,
> JSONTokenType type, int x, int y)
> {
> JSONMessageParser *parser = container_of(lexer, JSONMessageParser, lexer);
> - QObject *json = NULL;
> Error *err = NULL;
> - JSONToken *token;
>
> + parser->token_size += input->len;
> + parser->token_count++;
> +
> + /* Detect message boundaries for error recovery purposes. */
> switch (type) {
> case JSON_LCURLY:
> parser->brace_count++;
> @@ -56,19 +49,9 @@ void json_message_process_token(JSONLexer *lexer, GString *input,
> parser->bracket_count--;
> break;
> case JSON_ERROR:
> - error_setg(&err, "JSON parse error, stray '%s'", input->str);
> - goto out_emit;
> - case JSON_END_OF_INPUT:
> - /*
> - * Force the parentheses to appear balanced and the queue
> - * to be emptied, causing a parse error if it wasn't.
> - */
> - if (g_queue_is_empty(&parser->tokens)) {
> - return;
> - }
> end_error_recovery:
> /*
> - * We goto here due to receiving either JSON_ERROR or a
> + * We come here due to receiving either JSON_ERROR or a
Line was added in the previous commit. Squash the change into it?
> * JSON_R{CURLY,SQUARE}) that is known to be unbalanced.
> * If in error recovery, end it immediately. If not in
> * error recovery, json_parser_feed() will raise an error
> @@ -81,49 +64,43 @@ void json_message_process_token(JSONLexer *lexer, GString *input,
> break;
> }
>
> - /*
> - * Security consideration, we limit total memory allocated per object
> - * and the maximum recursion depth that a message can force.
> - */
> - if (parser->token_size + input->len + 1 > MAX_TOKEN_SIZE) {
Left operand of > is unincremented token_size plus increment plus 1.
> - error_setg(&err, "JSON token size limit exceeded");
> - goto out_emit;
> - }
> - if (g_queue_get_length(&parser->tokens) + 1 > MAX_TOKEN_COUNT) {
> - error_setg(&err, "JSON token count limit exceeded");
> - goto out_emit;
> - }
> - if (parser->bracket_count + parser->brace_count > MAX_NESTING) {
> - error_setg(&err, "JSON nesting depth limit exceeded");
> - goto out_emit;
> - }
> + if (parser->error) {
> + /* error recovery, eat tokens until parentheses balance */
> + } else {
> + /*
> + * Safety consideration, we limit total memory allocated per object
> + * and the maximum nesting depth that a message can force.
> + */
> + if (parser->token_size > MAX_TOKEN_SIZE) {
Left operand of > is incremented token size.
I believe this is one less than before the patch. Testing... yes:
-blockdev '{"a":"01234567890123456789012345678901234567890123456789012345"}'
with MAX_TOKEN_SIZE hacked to 64: is rejected before the series, and
accepted afterwards.
Obvious fix: change > to >=.
If you'd prefer not to change the code, mention the change in the commit
message.
> + error_setg(&err, "JSON token size limit exceeded");
> + } else if (parser->token_count > MAX_TOKEN_COUNT) {
> + error_setg(&err, "JSON token count limit exceeded");
> + } else if (parser->bracket_count + parser->brace_count > MAX_NESTING) {
> + error_setg(&err, "JSON nesting depth limit exceeded");
> + } else {
> + g_autofree JSONToken *token = json_token(type, x, y, input);
> + QObject *json = json_parser_feed(&parser->parser, token, &err);
> + if (json) {
> + parser->emit(parser->opaque, json, NULL);
> + }
> + }
>
> - token = json_token(type, x, y, input);
> - parser->token_size += input->len;
> -
> - g_queue_push_tail(&parser->tokens, token);
> -
> - if (parser->brace_count > 0 || parser->bracket_count > 0) {
> - return;
> - }
> -
> - /* Process all tokens in the queue */
> - while (!g_queue_is_empty(&parser->tokens)) {
> - token = g_queue_pop_head(&parser->tokens);
> - json = json_parser_feed(&parser->parser, token, &err);
> - g_free(token);
> - if (json || err) {
> - break;
> + if (err) {
> + parser->emit(parser->opaque, NULL, err);
> + /* start recovery */
> + parser->error = true;
> }
> }
>
> -out_emit:
> - json_parser_reset(&parser->parser);
> - parser->brace_count = 0;
> - parser->bracket_count = 0;
> - json_message_free_tokens(parser);
> - parser->token_size = 0;
> - parser->emit(parser->opaque, json, err);
> + if ((parser->brace_count == 0 && parser->bracket_count == 0)
> + || type == JSON_END_OF_INPUT) {
> + json_parser_reset(&parser->parser);
> + parser->error = false;
> + parser->brace_count = 0;
> + parser->bracket_count = 0;
> + parser->token_count = 0;
> + parser->token_size = 0;
> + }
> }
>
> void json_message_parser_init(JSONMessageParser *parser,
> @@ -133,9 +110,10 @@ void json_message_parser_init(JSONMessageParser *parser,
> {
> parser->emit = emit;
> parser->opaque = opaque;
> + parser->error = false;
> parser->brace_count = 0;
> parser->bracket_count = 0;
> - g_queue_init(&parser->tokens);
> + parser->token_count = 0;
> parser->token_size = 0;
>
> json_parser_init(&parser->parser, ap);
> @@ -151,12 +129,10 @@ void json_message_parser_feed(JSONMessageParser *parser,
> void json_message_parser_flush(JSONMessageParser *parser)
> {
> json_lexer_flush(&parser->lexer);
> - assert(g_queue_is_empty(&parser->tokens));
> }
>
> void json_message_parser_destroy(JSONMessageParser *parser)
> {
> json_lexer_destroy(&parser->lexer);
> - json_message_free_tokens(parser);
> json_parser_destroy(&parser->parser);
> }
next prev parent reply other threads:[~2026-06-29 13:03 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-26 10:17 [PATCH v4 0/6] qobject: switch JSON parser to push Paolo Bonzini
2026-06-26 10:17 ` [PATCH 1/6] json-parser: replace with a push parser Paolo Bonzini
2026-06-29 13:02 ` Markus Armbruster
2026-06-26 10:17 ` [PATCH 2/6] json-streamer: reuse parser Paolo Bonzini
2026-06-26 13:02 ` Philippe Mathieu-Daudé
2026-06-26 10:17 ` [PATCH 3/6] json-streamer: make brace/bracket count unsigned Paolo Bonzini
2026-06-26 10:17 ` [PATCH 4/6] json-streamer: remove token queue Paolo Bonzini
2026-06-29 13:02 ` Markus Armbruster [this message]
2026-06-26 10:17 ` [PATCH 5/6] json-streamer: do not heap-allocate JSONToken Paolo Bonzini
2026-06-26 10:17 ` [PATCH 6/6] json-parser: add location to JSON parsing errors Paolo Bonzini
2026-06-29 13:03 ` [PATCH v4 0/6] qobject: switch JSON parser to push Markus Armbruster
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=878q7xts77.fsf@pond.sub.org \
--to=armbru@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox