Date: Thu, 20 May 2010 13:27:10 -0300
From: Luiz Capitulino
To: Anthony Liguori
Cc: Paolo Bonzini, aliguori@us.ibm.com, qemu-devel@nongnu.org
Subject: [Qemu-devel] Re: [PATCH 2/6] json-lexer: Handle missing escapes
Message-ID: <20100520132710.1e906771@redhat.com>
In-Reply-To: <4BF55A51.1080506@codemonkey.ws>

On Thu, 20 May 2010 10:50:41 -0500
Anthony Liguori wrote:

> On 05/20/2010 10:16 AM, Paolo Bonzini wrote:
> > On 05/20/2010 03:44 PM, Luiz Capitulino wrote:
> >> I think there's another issue in the handling of strings.
> >>
> >> The spec says that valid unescaped chars are in the following range:
> >>
> >> unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
>
> That's a spec bug IMHO. Tab is %x09. Surely you can include tabs in
> strings. Any parser that didn't accept that would be broken.

Honestly, I had the impression this should be encoded as: %x5C %x74, but
if you're right, wouldn't this be true for other sequences as well?
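
To make the question concrete, here is a minimal sketch of the quoted grammar rule as a C predicate. It is not code from json-lexer.c: the names json_is_unescaped() and json_is_unescaped_self_check() are hypothetical, and it assumes the input has already been decoded to a code point. Under a strict reading of the rule, tab (%x09) is not in the unescaped set and would have to be written as the two bytes %x5C %x74.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of json-lexer.c): true if the code point
 * may appear unescaped inside a JSON string, following the quoted rule
 *   unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
 */
bool json_is_unescaped(uint32_t cp)
{
    return (cp >= 0x20 && cp <= 0x21) ||
           (cp >= 0x23 && cp <= 0x5B) ||
           (cp >= 0x5D && cp <= 0x10FFFF);
}

/* Checks the cases raised in the thread against the strict reading. */
void json_is_unescaped_self_check(void)
{
    assert(!json_is_unescaped(0x09)); /* tab: must be escaped as \t */
    assert(!json_is_unescaped(0x0A)); /* newline: same situation as tab */
    assert(!json_is_unescaped(0x22)); /* double quote ends the string */
    assert(!json_is_unescaped(0x5C)); /* backslash starts an escape */
    assert(json_is_unescaped(0x20));  /* space is the lowest allowed value */
}
```

If the rule is read this way, every control character below 0x20 is in the same position as tab, which is the "other sequences" point above.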

> >>
> >> But we do:
> >>
> >>      [IN_DQ_STRING] = {
> >>          [1 ... 0xFF] = IN_DQ_STRING,
> >>          ['\\'] = IN_DQ_STRING_ESCAPE,
> >>          ['"'] = IN_DONE_STRING,
> >>      },
> >>
> >> Shouldn't we cover 0x20 .. 0xFF instead?
> >
> > If it's the lexer, isn't just it being liberal in what it accepts?
>
> I believe the parser correctly rejects invalid UTF-8 sequences.

Will check.
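
For reference, the stricter 0x20 .. 0xFF table suggested in the quoted text could look roughly like the sketch below. This is an illustration under stated assumptions, not the actual json-lexer.c code: the state names are borrowed from the quoted snippet, but IN_ERROR, STATE_COUNT, strict_table and lex_dq_string_body() are hypothetical, the escape state is simplified (no \u handling), and the range designators are a GNU C extension, as in the quoted snippet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum {
    IN_ERROR = 0,           /* zero-initialized entries reject the byte */
    IN_DQ_STRING,
    IN_DQ_STRING_ESCAPE,
    IN_DONE_STRING,
    STATE_COUNT             /* hypothetical: number of states */
};

/* Hypothetical strict table: bytes below 0x20 are left as IN_ERROR. */
static const uint8_t strict_table[STATE_COUNT][256] = {
    [IN_DQ_STRING] = {
        [0x20 ... 0xFF] = IN_DQ_STRING,   /* GNU C range designator */
        ['\\'] = IN_DQ_STRING_ESCAPE,     /* overrides the range entry */
        ['"'] = IN_DONE_STRING,           /* overrides the range entry */
    },
    [IN_DQ_STRING_ESCAPE] = {
        /* Simplified: single-character escapes only, no \uXXXX. */
        ['"'] = IN_DQ_STRING, ['\\'] = IN_DQ_STRING, ['/'] = IN_DQ_STRING,
        ['b'] = IN_DQ_STRING, ['f'] = IN_DQ_STRING, ['n'] = IN_DQ_STRING,
        ['r'] = IN_DQ_STRING, ['t'] = IN_DQ_STRING,
    },
};

/*
 * Feed the bytes that follow an opening double quote.
 * Returns true once the closing quote is reached, false on a rejected
 * byte or if the input ends before the string is closed.
 */
bool lex_dq_string_body(const char *s)
{
    uint8_t state = IN_DQ_STRING;

    for (; *s; s++) {
        state = strict_table[state][(unsigned char)*s];
        if (state == IN_ERROR) {
            return false;
        }
        if (state == IN_DONE_STRING) {
            return true;
        }
    }
    return false;
}
```

With this table a raw tab inside a string is rejected at the lexer level, while the escaped form (backslash followed by t) is accepted. Whether that strictness is desirable is exactly the open question in this thread.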