From: Markus Armbruster <armbru@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>,
marcandre.lureau@redhat.com, qemu-devel@nongnu.org,
mdroth@linux.vnet.ibm.com
Subject: Re: [Qemu-devel] [PATCH 5/6] json: Eliminate lexer state IN_ERROR
Date: Fri, 31 Aug 2018 09:06:03 +0200
Message-ID: <87ftyvt3qc.fsf@dusky.pond.sub.org>
In-Reply-To: <4556c609-2b90-43be-7456-4239eb116a55@redhat.com> (Eric Blake's message of "Tue, 28 Aug 2018 10:01:56 -0500")

Eric Blake <eblake@redhat.com> writes:

> On 08/27/2018 11:40 PM, Markus Armbruster wrote:
>
>>>> typedef enum json_token_type {
>>>> - JSON_MIN = 100,
>>>> - JSON_LCURLY = JSON_MIN,
>>>> + JSON_ERROR = 0, /* must be zero, see json_lexer[] */
>>>> + /* Gap for lexer states */
>>>> + JSON_LCURLY = 100,
>>>> + JSON_MIN = JSON_LCURLY,
>>>
>>> In an earlier version of this type of cleanup, you swapped the IN_ and
>>> JSON_ values and eliminated the gap, to make the overall table more
>>> compact (no storage wasted on any of the states in the gap between the
>>> two).
>>>
>>> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg01178.html
>>>
>>> Is it still worth trying to minimize the gap between the two
>>> sequences, even if you now no longer swap them in order?
>>
>> You caught me :)
>>
>> Eliminating the gap actually enlarges the table.
>
> Rather, switching the order enlarges the table.
>
>> I first got confused,
>> then measured the size change backwards to confirm my confused ideas.
>> When I looked at the patch again, I realized my mistake, and silently
>> dropped this part of the change.
>
> The size of the table is determined by the fact that we must
> initialize entry 0 (whether we spell it IN_ERROR or JSON_ERROR), then
> pay attention to the largest value assigned. Re-reading json_lexer[],
> you are only initializing IN_* states, and not JSON_* states;

Correct.

The JSON_* states other than JSON_ERROR all go to the start state
regardless of lookahead and without consuming it. We implement that
state transition in code instead of putting it into the table:

    case JSON_STRING:
        json_message_process_token(lexer, lexer->token, new_state,
                                   lexer->x, lexer->y);
        /* fall through */
    case JSON_SKIP:
        g_string_truncate(lexer->token, 0);
        /* fall through */
    case IN_START:
-->     new_state = lexer->start_state;
        break;

JSON_ERROR goes to IN_RECOVERY instead:

    case JSON_ERROR:
        json_message_process_token(lexer, lexer->token, JSON_ERROR,
                                   lexer->x, lexer->y);
--->    new_state = IN_RECOVERY;
        /* fall through */
    case IN_RECOVERY:

> swapping
> JSON_* to come first enlarged the table because you now have a bunch
> of additional rows in the table that are all 0-initialized to
> JSON_ERROR transitions.

Yes. These rows are never used.
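
To make the size argument concrete, here is a minimal sketch. The enum
values, table names and helpers below are made up and far smaller than
the real json_lexer[]; the only thing meant to carry over is that an
array with designated initializers extends to its largest initialized
index, so putting the token values first adds zero-filled rows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout A: lexer states first, tokens after a gap. */
enum {
    A_ERROR = 0,            /* must be zero: unset table cells mean "error" */
    A_IN_STRING,            /* 1 */
    A_IN_START,             /* 2 */
    A_TOKEN_LCURLY = 100,   /* tokens never index the table */
    A_TOKEN_STRING,         /* 101 */
};

/* Hypothetical layout B: tokens first, lexer states after them. */
enum {
    B_ERROR = 0,
    B_TOKEN_LCURLY,         /* 1 */
    B_TOKEN_STRING,         /* 2 */
    B_IN_STRING,            /* 3 */
    B_IN_START,             /* 4 */
};

/* Only lexer states get rows; each array ends at its largest index used. */
static const uint8_t table_a[][256] = {
    [A_IN_STRING] = { ['"'] = A_TOKEN_STRING },
    [A_IN_START]  = { ['"'] = A_IN_STRING, ['{'] = A_TOKEN_LCURLY },
};

static const uint8_t table_b[][256] = {
    [B_IN_STRING] = { ['"'] = B_TOKEN_STRING },
    [B_IN_START]  = { ['"'] = B_IN_STRING, ['{'] = B_TOKEN_LCURLY },
};

/* Number of rows the compiler actually allocated for each layout. */
static size_t rows_a(void) { return sizeof(table_a) / sizeof(table_a[0]); }
static size_t rows_b(void) { return sizeof(table_b) / sizeof(table_b[0]); }

/* Table lookup for layout A; cells that were never set read as A_ERROR. */
static uint8_t next_a(unsigned state, unsigned char ch)
{
    assert(state < rows_a());
    return table_a[state][ch];
}
```

With these made-up values, table_a gets three rows and table_b gets
five. Rows 1 and 2 of table_b exist only because the token values come
first; they are zero-filled and never indexed.
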
> So at the end of the day, leaving IN_* to be first, and putting JSON_*
> second, makes sense.
>
> The question remains, then, if a fixed-size gap (by making JSON_MIN be
> exactly 100) is any smarter than a contiguous layout (by making
> JSON_MIN be IN_START_INTERP + 1). I can't see any strong reason for
> preferring one form over the other, so keeping the gap doesn't hurt.

The gap lets us hide the IN_* in json-lexer.c. Not sure it's worth the
trouble. We could move the IN_* to json-parser-int.h for simplicity.
Not sure that's worth the trouble, either :)
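
For illustration, a rough sketch of what "hiding the IN_* behind the
gap" amounts to. All SKETCH_* names are hypothetical stand-ins, not the
actual declarations; the compile-time check is an assumption about how
one might guard the gap, not a claim about what the tree does.

```c
#include <assert.h>

/* Hypothetical public part: token types, as a shared header might declare them. */
typedef enum sketch_token_type {
    SKETCH_ERROR = 0,            /* must be zero */
    /* Gap for lexer states */
    SKETCH_LCURLY = 100,
    SKETCH_MIN = SKETCH_LCURLY,
    SKETCH_RCURLY,
    SKETCH_STRING,
} sketch_token_type;

/* Hypothetical private part: lexer states, visible only in the lexer's .c file. */
enum sketch_lexer_state {
    SKETCH_IN_RECOVERY = 1,
    SKETCH_IN_STRING,
    SKETCH_IN_START,
    SKETCH_IN_LAST = SKETCH_IN_START,
};

/* The gap is only safe while the private states stay below the first token. */
_Static_assert((int)SKETCH_IN_LAST < (int)SKETCH_MIN,
               "lexer states must fit in the gap below SKETCH_MIN");

/* True for values that are tokens rather than internal lexer states. */
static int sketch_is_token(int state)
{
    return state == SKETCH_ERROR || state >= SKETCH_MIN;
}
```

With a contiguous layout instead, the first token value would have to
be defined in terms of the last lexer state, so both enums would need
to live in the same header.
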