From mboxrd@z Thu Jan 1 00:00:00 1970
From: Markus Armbruster
References: <1446122683-2355-1-git-send-email-armbru@redhat.com>
 <1446122683-2355-5-git-send-email-armbru@redhat.com>
 <56324CB8.5060303@redhat.com>
Date: Thu, 29 Oct 2015 19:27:09 +0100
In-Reply-To: <56324CB8.5060303@redhat.com> (Eric Blake's message of
 "Thu, 29 Oct 2015 10:43:36 -0600")
Message-ID: <87io5psf7m.fsf@blackfin.pond.sub.org>
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: [Qemu-devel] [PATCH 4/4] json-streamer: Limit number of tokens
 in addition to total size
To: Eric Blake
Cc: qemu-devel@nongnu.org, lcapitulino@redhat.com

Eric Blake writes:

> On 10/29/2015 06:44 AM, Markus Armbruster wrote:
>> Commit 29c75dd "json-streamer: limit the maximum recursion depth and
>> maximum token count" attempts to guard against excessive heap usage by
>> limiting total token size (it says "token count", but that's a lie).
>>
>> Total token size is a rather imprecise predictor of heap usage: many
>> small tokens use more space than few large tokens with the same input
>> size, because there's a constant per-token overhead.
>>
>> Tighten this up: limit the token count to 128Ki.
>>
>> If you think 128Ki is too stingy: check-qjson's large_dict test eats a
>> sweet 500MiB and pegs a core for four minutes on my machine to parse
>> ~100K tokens.  Absurdly wasteful.
>
> Sounds like we have some quadratic (or worse) scaling in the parser.
> Worth fixing some day, but I also agree that we don't have to tackle it
> in this series.

I believe it's linear with a criminally negligent constant (several KiB
per token).  The first hog is actually fairly obvious: we use one QDict
per token.

> I'm assuming you temporarily patched check-qjson to use larger constants
> when you hit your ~100K token testing?  Because I am definitely seeing a
> lot of execution time spent on large_dict when running tests/check-qjson
> by hand, in relation to all the other tests of that file, but not
> minutes worth.  Care to post the diff you played with?

I tested on a slow machine.

>> Signed-off-by: Markus Armbruster
>> ---
>>  qobject/json-streamer.c | 2 ++
>>  1 file changed, 2 insertions(+)
>
> Reviewed-by: Eric Blake

Thanks!
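
For illustration only, here is a minimal sketch of the idea discussed
above: cap the token count in addition to the total token size, and
estimate heap use as payload plus a constant per-token overhead.  This is
not the code in qobject/json-streamer.c.  All names (TokenBudget,
token_budget_add, estimate_heap, the SKETCH_* macros) are hypothetical,
and the byte budget is an assumed value; only the 128Ki token count comes
from the patch description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limits.  Only the 128Ki token count is from the patch
 * description; the byte budget is an assumption for this sketch. */
#define SKETCH_MAX_TOKEN_COUNT (128UL << 10)   /* 128Ki tokens */
#define SKETCH_MAX_TOKEN_SIZE  (64UL << 20)    /* assumed: 64MiB of token text */

typedef struct TokenBudget {
    size_t token_count;   /* tokens accepted so far */
    size_t token_size;    /* total bytes of token text accepted so far */
} TokenBudget;

/*
 * Account for one more token of @len bytes.
 * Returns false, leaving the budget unchanged, if accepting the token
 * would exceed either the count limit or the size limit.
 */
static bool token_budget_add(TokenBudget *b, size_t len)
{
    assert(b != NULL);
    if (b->token_count >= SKETCH_MAX_TOKEN_COUNT) {
        return false;
    }
    /* Written as a subtraction so the check itself cannot overflow. */
    if (len > SKETCH_MAX_TOKEN_SIZE - b->token_size) {
        return false;
    }
    b->token_count++;
    b->token_size += len;
    return true;
}

/*
 * Rough linear heap model: token text plus a constant overhead per token.
 * The overhead is a parameter because its real value is not known here.
 */
static size_t estimate_heap(size_t token_count, size_t token_size,
                            size_t per_token_overhead)
{
    return token_size + token_count * per_token_overhead;
}
```

With an assumed overhead of 4KiB per token, estimate_heap(100000, 100000,
4096) comes to roughly 400MiB, the same order of magnitude as the 500MiB
reported for large_dict.  A size-only limit would happily accept such
input, because 100K one-byte tokens are only 100KB of token text; the
count limit is what catches it.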