All of lore.kernel.org
 help / color / mirror / Atom feed
From: Markus Armbruster <armbru@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: qemu-devel@nongnu.org, marcandre.lureau@redhat.com,
	mdroth@linux.vnet.ibm.com
Subject: Re: [Qemu-devel] [PATCH 1/6] json: Fix lexer for lookahead character beyond '\x7F'
Date: Tue, 28 Aug 2018 06:28:30 +0200	[thread overview]
Message-ID: <87pny3qfm9.fsf@dusky.pond.sub.org> (raw)
In-Reply-To: <0a033e63-a980-b8e6-2f60-cd743cc6c94c@redhat.com> (Eric Blake's message of "Mon, 27 Aug 2018 11:50:08 -0500")

Eric Blake <eblake@redhat.com> writes:

> On 08/27/2018 02:00 AM, Markus Armbruster wrote:
>> The lexer fails to end a valid token when the lookahead character is
>> beyond '\x7F'.  For instance, input
>>
>>      true\xC2\xA2
>>
>> produces the tokens
>>
>>      JSON_ERROR     true\xC2
>>      JSON_ERROR     \xA2
>>
>> The first token should be
>>
>>      JSON_KEYWORD   true
>>
>> instead.
>
> As long as we still get a JSON_ERROR in the end.

We do: one for \xC2, and one for \xA2.  PATCH 4 will lose the second one.

>> The culprit is
>>
>>      #define TERMINAL(state) [0 ... 0x7F] = (state)
>>
>> It leaves [0x80..0xFF] zero, i.e. IN_ERROR.  Has always been broken.
>
> I wonder if that was done because it was assuming that valid input is
> only ASCII, and that any byte larger than 0x7f is invalid except
> within the context of a string.

Plausible thinko.

>                                  But whatever the reason for the
> original bug, your fix makes sense.
>
>> Fix it to initialize the complete array.
>
> Worth testsuite coverage?

Since lookahead bytes > 0x7F are always a parse error, all the bug can
do is swallow a TERMINAL() token right before a parse error.  The
TERMINAL() tokens are JSON_INTEGER, JSON_FLOAT, JSON_KEYWORD, JSON_SKIP,
JSON_INTERP.  Fairly harmless.  In particular, JSON objects get through
even when followed by a byte > 0x7F.

Of course, test coverage wouldn't hurt regardless.

>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> ---
>>   qobject/json-lexer.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>
> Reviewed-by: Eric Blake <eblake@redhat.com>

Thanks!

  reply	other threads:[~2018-08-28  4:28 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-27  7:00 [Qemu-devel] [PATCH 0/6] json: More fixes, error reporting improvements, cleanups Markus Armbruster
2018-08-27  7:00 ` [Qemu-devel] [PATCH 1/6] json: Fix lexer for lookahead character beyond '\x7F' Markus Armbruster
2018-08-27 16:50   ` Eric Blake
2018-08-28  4:28     ` Markus Armbruster [this message]
2018-08-27  7:00 ` [Qemu-devel] [PATCH 2/6] json: Clean up how lexer consumes "end of input" Markus Armbruster
2018-08-27 16:58   ` Eric Blake
2018-08-28  4:28     ` Markus Armbruster
2018-08-27  7:00 ` [Qemu-devel] [PATCH 3/6] json: Make lexer's "character consumed" logic less confusing Markus Armbruster
2018-08-27 17:04   ` Eric Blake
2018-08-27  7:00 ` [Qemu-devel] [PATCH 4/6] json: Nicer recovery from lexical errors Markus Armbruster
2018-08-27 17:18   ` Eric Blake
2018-08-28  4:35     ` Markus Armbruster
2018-08-27  7:00 ` [Qemu-devel] [PATCH 5/6] json: Eliminate lexer state IN_ERROR Markus Armbruster
2018-08-27 17:20   ` Eric Blake
2018-08-27 17:29   ` Eric Blake
2018-08-28  4:40     ` Markus Armbruster
2018-08-28 15:01       ` Eric Blake
2018-08-28 15:04         ` Eric Blake
2018-08-31  7:08           ` Markus Armbruster
2018-08-31  7:06         ` Markus Armbruster
2018-08-27  7:00 ` [Qemu-devel] [PATCH 6/6] json: Eliminate lexer state IN_WHITESPACE, pseudo-token JSON_SKIP Markus Armbruster
2018-08-27 17:25   ` Eric Blake
2018-08-28  4:41     ` Markus Armbruster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87pny3qfm9.fsf@dusky.pond.sub.org \
    --to=armbru@redhat.com \
    --cc=eblake@redhat.com \
    --cc=marcandre.lureau@redhat.com \
    --cc=mdroth@linux.vnet.ibm.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.