From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from [140.186.70.92] (port=38387 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1OF81P-00022Q-C5
	for qemu-devel@nongnu.org; Thu, 20 May 2010 11:51:00 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
	(envelope-from <anthony@codemonkey.ws>) id 1OF81H-0000GL-48
	for qemu-devel@nongnu.org; Thu, 20 May 2010 11:50:55 -0400
Received: from mail-gy0-f173.google.com ([209.85.160.173]:36466)
	by eggs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <anthony@codemonkey.ws>) id 1OF81G-0000G8-P9
	for qemu-devel@nongnu.org; Thu, 20 May 2010 11:50:47 -0400
Received: by gyd5 with SMTP id 5so3587736gyd.4
	for <qemu-devel@nongnu.org>; Thu, 20 May 2010 08:50:46 -0700 (PDT)
Message-ID: <4BF55A51.1080506@codemonkey.ws>
Date: Thu, 20 May 2010 10:50:41 -0500
From: Anthony Liguori <anthony@codemonkey.ws>
MIME-Version: 1.0
References: <1274303733-3700-1-git-send-email-lcapitulino@redhat.com>	<1274303733-3700-3-git-send-email-lcapitulino@redhat.com>	<4BF45BCF.5090300@codemonkey.ws>
	<20100520104433.1be3167c@redhat.com> <4BF55231.8020208@redhat.com>
In-Reply-To: <4BF55231.8020208@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: [Qemu-devel] Re: [PATCH 2/6] json-lexer: Handle missing escapes
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: aliguori@us.ibm.com, qemu-devel@nongnu.org, Luiz Capitulino <lcapitulino@redhat.com>

On 05/20/2010 10:16 AM, Paolo Bonzini wrote:
> On 05/20/2010 03:44 PM, Luiz Capitulino wrote:
>>   I think there's another issue in the handling of strings.
>>
>>   The spec says that valid unescaped chars are in the following range:
>>
>>      unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

That's a spec bug IMHO.  Tab is %x09.  Surely you can include tabs in 
strings.  Any parser that didn't accept that would be broken.
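
To make the quoted rule concrete, here is a minimal sketch of the range 
check it implies.  The helper name json_is_unescaped is hypothetical and 
not part of QEMU.  Read literally, the rule excludes tab (%x09) along 
with every other code point below %x20:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of QEMU: returns true if the code point
 * may appear unescaped in a JSON string according to the quoted rule
 *   unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
 */
static bool json_is_unescaped(uint32_t cp)
{
    return (cp >= 0x20 && cp <= 0x21) ||   /* space and '!' */
           (cp >= 0x23 && cp <= 0x5B) ||   /* skips '"' (0x22) */
           (cp >= 0x5D && cp <= 0x10FFFF); /* skips '\\' (0x5C) */
}
```

With this check, json_is_unescaped(0x09) is false, which is exactly the 
behaviour I'm objecting to.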

>>
>>   But we do:
>>
>>      [IN_DQ_STRING] = {
>>          [1 ... 0xFF] = IN_DQ_STRING,
>>          ['\\'] = IN_DQ_STRING_ESCAPE,
>>          ['"'] = IN_DONE_STRING,
>>      },
>>
>>   Shouldn't we cover 0x20 .. 0xFF instead?
>
> If it's the lexer, isn't it just being liberal in what it accepts?

I believe the parser correctly rejects invalid UTF-8 sequences.
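
For comparison, a rough sketch of the two variants of the in-string 
transition row under discussion.  The state names and the 
fill_dq_string_row helper are made up for illustration and are not the 
actual json-lexer code; the row is filled at runtime here rather than 
with the range designators used in the quoted table:

```c
#include <stdint.h>

/* Hypothetical state names modeled on the quoted table; not the QEMU source. */
enum lex_state {
    LEX_ERROR = 0,
    LEX_IN_DQ_STRING,
    LEX_IN_DQ_STRING_ESCAPE,
    LEX_IN_DONE_STRING,
};

/* Fill the transition row for "inside a double-quoted string".
 * first_ok is the lowest byte accepted unescaped: 1 mirrors the quoted
 * table, 0x20 mirrors the stricter range that was proposed. */
static void fill_dq_string_row(uint8_t row[256], unsigned first_ok)
{
    for (unsigned c = 0; c < 256; c++) {
        row[c] = (c >= first_ok) ? LEX_IN_DQ_STRING : LEX_ERROR;
    }
    /* The two special bytes override the default, as in the quoted table. */
    row['\\'] = LEX_IN_DQ_STRING_ESCAPE;
    row['"'] = LEX_IN_DONE_STRING;
}
```

With first_ok = 1 a raw tab stays in the string state; with 
first_ok = 0x20 it lands in the error state.  Either way the row only 
looks at single bytes, so it says nothing about whether a multi-byte 
sequence is well formed.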

Regards,

Anthony Liguori

> paolo