From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=39742 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1OFAmj-0006If-ME
	for qemu-devel@nongnu.org; Thu, 20 May 2010 14:47:58 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
	(envelope-from ) id 1OFAmg-0000rw-33
	for qemu-devel@nongnu.org; Thu, 20 May 2010 14:47:57 -0400
Received: from mx1.redhat.com ([209.132.183.28]:26540)
	by eggs.gnu.org with esmtp (Exim 4.69)
	(envelope-from ) id 1OFAmf-0000rP-PL
	for qemu-devel@nongnu.org; Thu, 20 May 2010 14:47:54 -0400
Date: Thu, 20 May 2010 15:47:45 -0300
From: Luiz Capitulino
Message-ID: <20100520154745.5b297d49@redhat.com>
In-Reply-To: <4BF56964.8030603@codemonkey.ws>
References: <1274303733-3700-1-git-send-email-lcapitulino@redhat.com>
	<1274303733-3700-3-git-send-email-lcapitulino@redhat.com>
	<4BF45BCF.5090300@codemonkey.ws>
	<20100520104433.1be3167c@redhat.com>
	<4BF55231.8020208@redhat.com>
	<4BF55A51.1080506@codemonkey.ws>
	<20100520132710.1e906771@redhat.com>
	<4BF56964.8030603@codemonkey.ws>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: [Qemu-devel] Re: [PATCH 2/6] json-lexer: Handle missing escapes
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Anthony Liguori
Cc: Paolo Bonzini , aliguori@us.ibm.com, qemu-devel@nongnu.org

On Thu, 20 May 2010 11:55:00 -0500
Anthony Liguori wrote:

> On 05/20/2010 11:27 AM, Luiz Capitulino wrote:
> > On Thu, 20 May 2010 10:50:41 -0500
> > Anthony Liguori wrote:
> >
> >
> >> On 05/20/2010 10:16 AM, Paolo Bonzini wrote:
> >>
> >>> On 05/20/2010 03:44 PM, Luiz Capitulino wrote:
> >>>
> >>>> I think there's another issue in the handling of strings.
> >>>>
> >>>> The spec says that valid unescaped chars are in the following range:
> >>>>
> >>>> unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
> >>>>
> >> That's a spec bug IMHO. Tab is %x09.
> >> Surely you can include tabs in strings. Any parser that didn't accept
> >> that would be broken.
> >>
> > Honestly, I had the impression this should be encoded as: %x5C %x74, but
> > if you're right, wouldn't this be true for other sequences as well?
> >
>
> I don't think most reasonable clients are going to quote tabs as '\t'.

 That would be a bug, wouldn't it?

 Python example:

 >>> import json
 >>> json.dumps('\t')
 '"\\t"'
 >>>

 YAJL example (inlined below):

 /tmp/ ./teste
 0x22 0x5c 0x74 0x22
 /tmp/

 I think we should strictly conform to the spec; quirks should only be added
when really needed.

#include <stdio.h>
#include <yajl/yajl_gen.h>

int main(void)
{
    yajl_gen g;
    unsigned int i, len = 0;
    const unsigned char *str = NULL;
    yajl_gen_config conf = { 0, " " };

    g = yajl_gen_alloc(&conf, NULL);

    /* Generate a JSON string whose only character is a raw tab (0x09) */
    if (yajl_gen_string(g, (unsigned char *) "\t", 1) != yajl_gen_status_ok)
        return 1;

    /* Fetch the generated output; the buffer is owned by the generator */
    if (yajl_gen_get_buf(g, &str, &len) != yajl_gen_status_ok)
        return 1;

    /* Dump each output byte in hex to show how the tab was encoded */
    for (i = 0; i < len; i++)
        printf("0x%x ", str[i]);
    printf("\n");

    yajl_gen_free(g);
    return 0;
}
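
 The "unescaped" rule quoted above can also be checked mechanically. What
follows is only a minimal sketch in Python, assuming the grammar exactly as
quoted in this thread; is_unescaped() and accepts_raw_tab() are hypothetical
helper names made up for illustration and have nothing to do with the
json-lexer code in the patch. It also shows how Python's own json module
treats a raw tab inside a string, in its default strict mode and with
strict=False.

```python
import json


def is_unescaped(ch: str) -> bool:
    """True if ch falls in: unescaped = %x20-21 / %x23-5B / %x5D-10FFFF."""
    cp = ord(ch)
    return 0x20 <= cp <= 0x21 or 0x23 <= cp <= 0x5B or 0x5D <= cp <= 0x10FFFF


def accepts_raw_tab(strict: bool) -> bool:
    """True if json.loads() accepts a literal tab byte inside a string."""
    try:
        json.loads('"\t"', strict=strict)  # the tab here is a raw 0x09
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


if __name__ == "__main__":
    print(is_unescaped("\t"))              # False: %x09 is below %x20
    print(is_unescaped("a"))               # True
    # The generator side escapes the tab, same bytes as the YAJL output
    print(" ".join(hex(b) for b in json.dumps("\t").encode("ascii")))
    print(accepts_raw_tab(strict=True))    # False: spec-conforming mode
    print(accepts_raw_tab(strict=False))   # True: the lenient behaviour
```

 The strict=False case is the kind of quirk discussed in this thread: the
parser tolerates control characters that the grammar does not list as
unescaped, while the generator still emits %x5C %x74.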