From: Markus Armbruster <armbru@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: qemu-devel@nongnu.org, lcapitulino@redhat.com
Subject: Re: [Qemu-devel] [PATCH 4/4] json-streamer: Limit number of tokens in addition to total size
Date: Thu, 29 Oct 2015 19:27:09 +0100
Message-ID: <87io5psf7m.fsf@blackfin.pond.sub.org>
In-Reply-To: <56324CB8.5060303@redhat.com> (Eric Blake's message of "Thu, 29 Oct 2015 10:43:36 -0600")
Eric Blake <eblake@redhat.com> writes:
> On 10/29/2015 06:44 AM, Markus Armbruster wrote:
>> Commit 29c75dd "json-streamer: limit the maximum recursion depth and
>> maximum token count" attempts to guard against excessive heap usage by
>> limiting total token size (it says "token count", but that's a lie).
>>
>> Total token size is a rather imprecise predictor of heap usage: many
>> small tokens use more space than few large tokens with the same input
>> size, because there's a constant per-token overhead.
>>
>> Tighten this up: limit the token count to 128Ki.
>>
>> If you think 128Ki is too stingy: check-qjson's large_dict test eats a
>> sweet 500MiB and pegs a core for four minutes on my machine to parse
>> ~100K tokens. Absurdly wasteful.
>
> Sounds like we have some quadratic (or worse) scaling in the parser.
> Worth fixing some day, but I also agree that we don't have to tackle it
> in this series.
I believe it's linear with a criminally negligent constant (several KiB
per token). The first hog is actually fairly obvious: we use one QDict
per token.
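
To illustrate why a token count limit predicts heap usage better than a
total size limit alone, here is a minimal sketch. It is not the actual
json-streamer.c code: every name in it is hypothetical, the byte limit is
an assumed value, and only the 128Ki token count comes from the patch
description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limits; only the 128Ki token count is from the patch text. */
#define SKETCH_MAX_TOTAL_SIZE  ((size_t)64 << 20)   /* assumed byte limit */
#define SKETCH_MAX_TOKEN_COUNT ((size_t)128 << 10)  /* 128Ki tokens */

/* Hypothetical running totals a streaming parser might keep. */
typedef struct {
    size_t total_size;   /* sum of token lengths in bytes */
    size_t token_count;  /* number of tokens queued so far */
} SketchTokenStats;

/* Account for one more token; return false once either limit is exceeded. */
static bool sketch_account_token(SketchTokenStats *s, size_t token_len)
{
    assert(s != NULL);
    s->total_size += token_len;
    s->token_count++;
    return s->total_size <= SKETCH_MAX_TOTAL_SIZE
        && s->token_count <= SKETCH_MAX_TOKEN_COUNT;
}

/* Linear heap model: payload bytes plus a constant overhead per token. */
static size_t sketch_heap_estimate(const SketchTokenStats *s,
                                   size_t per_token_overhead)
{
    return s->total_size + s->token_count * per_token_overhead;
}
```

Under this model, 100K tokens at an assumed 5 KiB of overhead each come to
roughly 490 MiB, the same order of magnitude as the large_dict figure
quoted above, while the input itself stays far below any plausible byte
limit.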
> I'm assuming you temporarily patched check-qjson to use larger constants
> when you hit your ~100K token testing? Because I am definitely seeing a
> lot of execution time spent on large_dict when running tests/check-qjson
> by hand, in relation to all the other tests of that file, but not
> minutes worth. Care to post the diff you played with?
I tested on a slow machine.
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> ---
>> qobject/json-streamer.c | 2 ++
>> 1 file changed, 2 insertions(+)
>
> Reviewed-by: Eric Blake <eblake@redhat.com>
Thanks!