From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=38387 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1OF81P-00022Q-C5
	for qemu-devel@nongnu.org; Thu, 20 May 2010 11:51:00 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
	(envelope-from ) id 1OF81H-0000GL-48
	for qemu-devel@nongnu.org; Thu, 20 May 2010 11:50:55 -0400
Received: from mail-gy0-f173.google.com ([209.85.160.173]:36466)
	by eggs.gnu.org with esmtp (Exim 4.69)
	(envelope-from ) id 1OF81G-0000G8-P9
	for qemu-devel@nongnu.org; Thu, 20 May 2010 11:50:47 -0400
Received: by gyd5 with SMTP id 5so3587736gyd.4
	for ; Thu, 20 May 2010 08:50:46 -0700 (PDT)
Message-ID: <4BF55A51.1080506@codemonkey.ws>
Date: Thu, 20 May 2010 10:50:41 -0500
From: Anthony Liguori
MIME-Version: 1.0
References: <1274303733-3700-1-git-send-email-lcapitulino@redhat.com>
	<1274303733-3700-3-git-send-email-lcapitulino@redhat.com>
	<4BF45BCF.5090300@codemonkey.ws>
	<20100520104433.1be3167c@redhat.com>
	<4BF55231.8020208@redhat.com>
In-Reply-To: <4BF55231.8020208@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: [Qemu-devel] Re: [PATCH 2/6] json-lexer: Handle missing escapes
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Paolo Bonzini
Cc: aliguori@us.ibm.com, qemu-devel@nongnu.org, Luiz Capitulino

On 05/20/2010 10:16 AM, Paolo Bonzini wrote:
> On 05/20/2010 03:44 PM, Luiz Capitulino wrote:
>> I think there's another issue in the handling of strings.
>>
>> The spec says that valid unescaped chars are in the following range:
>>
>> unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

That's a spec bug IMHO. Tab is %x09. Surely you can include tabs in
strings. Any parser that didn't accept that would be broken.

>>
>> But we do:
>>
>>     [IN_DQ_STRING] = {
>>         [1 ... 0xFF] = IN_DQ_STRING,
>>         ['\\'] = IN_DQ_STRING_ESCAPE,
>>         ['"'] = IN_DONE_STRING,
>>     },
>>
>> Shouldn't we cover 0x20 .. 0xFF instead?
>
> If it's the lexer, isn't just it being liberal in what it accepts?

I believe the parser correctly rejects invalid UTF-8 sequences.

Regards,

Anthony Liguori

> paolo
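
For reference, the "unescaped" range quoted in this thread can be written as a small predicate. This is only a sketch under stated assumptions: json_cp_is_unescaped() and json_unescaped_selfcheck() are hypothetical names that do not exist in the QEMU json-lexer, and the function takes a decoded code point, whereas the lexer table discussed above is indexed by single bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): returns true if the code point
 * may appear unescaped inside a JSON string according to the grammar
 * quoted in the thread:
 *
 *     unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
 *
 * The two gaps are 0x22 ('"') and 0x5C ('\\'); everything below 0x20,
 * including tab (0x09), falls outside the range.
 */
bool json_cp_is_unescaped(uint32_t cp)
{
    return (cp >= 0x20 && cp <= 0x21) ||
           (cp >= 0x23 && cp <= 0x5B) ||
           (cp >= 0x5D && cp <= 0x10FFFF);
}

/* Small self-check covering the characters mentioned in the discussion. */
void json_unescaped_selfcheck(void)
{
    assert(json_cp_is_unescaped(' '));     /* 0x20, first allowed value */
    assert(!json_cp_is_unescaped('\t'));   /* 0x09, must be escaped     */
    assert(!json_cp_is_unescaped('"'));    /* terminates the string     */
    assert(!json_cp_is_unescaped('\\'));   /* starts an escape          */
}
```

Under the strict reading, a tab fails this check, which is the case objected to above. A lexer that accepts [1 ... 0xFF] is more liberal than this predicate and leaves any stricter validation to a later stage.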