From: Luiz Capitulino <lcapitulino@redhat.com>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Anthony Liguori <aliguori@us.ibm.com>,
qemu-devel@nongnu.org, Michael Roth <mdroth@us.ibm.com>,
Markus Armbruster <armbru@redhat.com>
Subject: Re: [Qemu-devel] Re: [PATCH 09/11] json-lexer: limit the maximum size of a given token
Date: Mon, 14 Mar 2011 17:33:11 -0300 [thread overview]
Message-ID: <20110314173311.23dfa5f7@doriath> (raw)
In-Reply-To: <4D7E7831.1060306@codemonkey.ws>
On Mon, 14 Mar 2011 15:18:57 -0500
Anthony Liguori <anthony@codemonkey.ws> wrote:
> On 03/14/2011 02:25 PM, Luiz Capitulino wrote:
> > On Fri, 11 Mar 2011 15:00:47 -0600
> > Anthony Liguori<aliguori@us.ibm.com> wrote:
> >
> >> This is a security consideration. We don't want a client to cause an arbitrary
> >> amount of memory to be allocated in QEMU. For now, we use a limit of 64MB
> >> which should be large enough for any reasonably sized token.
> >>
> >> This is important for parsing JSON from untrusted sources.
> >>
> >> Signed-off-by: Anthony Liguori<aliguori@us.ibm.com>
> >>
> >> diff --git a/json-lexer.c b/json-lexer.c
> >> index 834d7af..3462c89 100644
> >> --- a/json-lexer.c
> >> +++ b/json-lexer.c
> >> @@ -18,6 +18,8 @@
> >> #include "qemu-common.h"
> >> #include "json-lexer.h"
> >>
> >> +#define MAX_TOKEN_SIZE (64ULL<< 20)
> >> +
> >> /*
> >> * \"([^\\\"]|(\\\"\\'\\\\\\/\\b\\f\\n\\r\\t\\u[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]))*\"
> >> * '([^\\']|(\\\"\\'\\\\\\/\\b\\f\\n\\r\\t\\u[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]))*'
> >> @@ -312,6 +314,17 @@ static int json_lexer_feed_char(JSONLexer *lexer, char ch)
> >> }
> >> lexer->state = new_state;
> >> } while (!char_consumed);
> >> +
> >> + /* Do not let a single token grow to an arbitrarily large size,
> >> + * this is a security consideration.
> >> + */
> >> + if (lexer->token->length> MAX_TOKEN_SIZE) {
> >> + lexer->emit(lexer, lexer->token, lexer->state, lexer->x, lexer->y);
> >> + QDECREF(lexer->token);
> >> + lexer->token = qstring_new();
> >> + lexer->state = IN_START;
> >> + }
> > Entering an invalid token is an error, we should fail here.
>
> It's not so clear to me.
>
> I think of it like GCC. GCC doesn't bail out on the first invalid
> character the lexer encounters. Instead, it records that an error has
> occurred and tries its best to recover.
>
> The result is that instead of getting an error message about the first
> error in your code, you'll get a long listing of all the mistakes you
> made (usually).
I'm not a compiler expert, but I think compilers do this because it would be
very problematic from a usability perspective to report one error at time:
by reporting most errors per run, the compiler gives the user to fix as much
as possible programming mistakes w/o having to re-run.
Our target are not humans, though.
> One thing that makes this more difficult in our case is that when you're
> testing, we don't have a clear EOI to flush things out. So a bad
> sequence of inputs might make the message parser wait for an a character
> that you're not necessarily inputting which makes the session appear
> hung. Usually, if you throw a couple extra brackets around, you'll get
> back to valid input.
Not sure if this is easy to do, but shouldn't we fail if we don't get
the expected character?
> > Which brings
> > two features:
> >
> > 1. A test code could trigger this condition and check for the specific
> > error code
> >
> > 2. Developers will know when they hit the limit. Although I don't expect
> > expect this to happen, there was talking about adding base64 support
> > to transfer something (I can't remember what, but we never know how the
> > protocol will evolve).
> >
> > Also, by testing this I found that the parser seems to get confused when
> > the limit is reached: it stops responding.
>
> Actually, it does respond. The lexer just takes an incredibly long time
> to process a large token because qstring_append_ch is incredibly slow
> :-) If you drop the token size down to 64k instead of 64mb, it'll seem
> a lot more reasonable.
Actually, I dropped it down to 20 :) This is supposed to work like your 64k
suggestion, right?
>
> Regards,
>
> Anthony Liguori
>
> >> +
> >> return 0;
> >> }
> >>
> >
>
next prev parent reply other threads:[~2011-03-14 20:33 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-03-11 21:00 [Qemu-devel] [00/11] QAPI Round 0 (JSON improvements) Anthony Liguori
2011-03-11 21:00 ` [Qemu-devel] [PATCH 01/11] Add hard build dependency on glib Anthony Liguori
2011-03-12 8:09 ` [Qemu-devel] " Paolo Bonzini
2011-03-12 14:52 ` Anthony Liguori
2011-03-11 21:00 ` [Qemu-devel] [PATCH 02/11] qerror: expose a function to format an error Anthony Liguori
2011-03-11 21:08 ` [Qemu-devel] " Anthony Liguori
2011-03-14 19:17 ` Luiz Capitulino
2011-03-14 19:27 ` Anthony Liguori
2011-03-14 19:37 ` Luiz Capitulino
2011-03-14 19:45 ` Anthony Liguori
2011-03-14 20:22 ` Luiz Capitulino
2011-03-14 20:41 ` Anthony Liguori
2011-03-14 20:48 ` Luiz Capitulino
2011-03-14 21:03 ` Anthony Liguori
2011-03-11 21:00 ` [Qemu-devel] [PATCH 03/11] add a generic Error object Anthony Liguori
2011-03-12 11:05 ` Blue Swirl
2011-03-12 14:51 ` Anthony Liguori
2011-03-14 19:18 ` [Qemu-devel] " Luiz Capitulino
2011-03-14 19:34 ` Anthony Liguori
2011-03-14 19:57 ` Luiz Capitulino
2011-03-11 21:00 ` [Qemu-devel] [PATCH 04/11] qerror: split out the reporting bits of QError Anthony Liguori
2011-03-14 19:18 ` [Qemu-devel] " Luiz Capitulino
2011-03-14 19:24 ` Anthony Liguori
2011-03-14 19:30 ` Luiz Capitulino
2011-03-14 20:30 ` Anthony Liguori
2011-03-11 21:00 ` [Qemu-devel] [PATCH 05/11] qerror: add new error message for invalid enum values Anthony Liguori
2011-03-14 19:19 ` [Qemu-devel] " Luiz Capitulino
2011-03-11 21:00 ` [Qemu-devel] [PATCH 06/11] qerror: add JSON parsing error message Anthony Liguori
2011-03-11 21:00 ` [Qemu-devel] [PATCH 07/11] json: propagate error from parser Anthony Liguori
2011-03-11 21:00 ` [Qemu-devel] [PATCH 08/11] json-lexer: reset the lexer state on an invalid token Anthony Liguori
2011-03-14 19:22 ` [Qemu-devel] " Luiz Capitulino
2011-03-14 19:43 ` Anthony Liguori
2011-03-14 20:12 ` Luiz Capitulino
2011-03-14 20:30 ` Anthony Liguori
2011-03-14 20:43 ` Luiz Capitulino
2011-03-11 21:00 ` [Qemu-devel] [PATCH 09/11] json-lexer: limit the maximum size of a given token Anthony Liguori
2011-03-14 19:25 ` [Qemu-devel] " Luiz Capitulino
2011-03-14 20:18 ` Anthony Liguori
2011-03-14 20:33 ` Luiz Capitulino [this message]
2011-03-11 21:00 ` [Qemu-devel] [PATCH 10/11] json-streamer: limit the maximum recursion depth and maximum token count Anthony Liguori
2011-03-11 23:16 ` Michael Roth
2011-03-12 15:03 ` Anthony Liguori
2011-03-11 21:00 ` [Qemu-devel] [PATCH 11/11] json-parser: detect premature EOI Anthony Liguori
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110314173311.23dfa5f7@doriath \
--to=lcapitulino@redhat.com \
--cc=aliguori@us.ibm.com \
--cc=anthony@codemonkey.ws \
--cc=armbru@redhat.com \
--cc=mdroth@us.ibm.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).