From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1MyyQd-0002z4-4D for qemu-devel@nongnu.org; Fri, 16 Oct 2009 21:49:55 -0400
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1MyyQY-0002yC-FR for qemu-devel@nongnu.org; Fri, 16 Oct 2009 21:49:54 -0400
Received: from [199.232.76.173] (port=47336 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1MyyQY-0002y9-8X for qemu-devel@nongnu.org; Fri, 16 Oct 2009 21:49:50 -0400
Received: from mail-qy0-f199.google.com ([209.85.221.199]:35769) by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from ) id 1MyyQX-0001rA-6F for qemu-devel@nongnu.org; Fri, 16 Oct 2009 21:49:49 -0400
Received: by qyk37 with SMTP id 37so354242qyk.18 for ; Fri, 16 Oct 2009 18:49:48 -0700 (PDT)
Message-ID: <4AD922B9.50902@codemonkey.ws>
Date: Fri, 16 Oct 2009 20:49:45 -0500
From: Anthony Liguori
MIME-Version: 1.0
References: <1255037747-3340-1-git-send-email-lcapitulino@redhat.com> <1255037747-3340-2-git-send-email-lcapitulino@redhat.com> <4AD72B88.2040107@codemonkey.ws> <20091015122622.1f93ea2d@doriath> <20091015163936.GB532@redhat.com> <20091015142837.6c90580a@doriath> <4AD76B3C.3050001@codemonkey.ws> <4AD87424.3010000@redhat.com> <4AD87901.5030705@codemonkey.ws> <4AD8AECE.9000507@redhat.com> <4AD8AFA4.4070203@codemonkey.ws> <4AD8CB31.9080809@redhat.com> <4AD8E7B5.8000509@codemonkey.ws> <4AD910BA.4090607@gnu.org>
In-Reply-To: <4AD910BA.4090607@gnu.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: [Qemu-devel] Re: [PATCH 01/10] Introduce qmisc module
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Paolo Bonzini
Cc: qemu-devel@nongnu.org, Luiz Capitulino

Paolo Bonzini wrote:
> On 10/16/2009 11:37 PM, Anthony Liguori wrote:
>>
>> I already am :-) Stay tuned, I should have a patch later this
>> afternoon.
>
> Was it a race?
> (Seriously, sorry I didn't notice a couple of hours ago).
>
> This one is ~5% slower than the "Evil" one, but half the size. Tested
> against the comments.json file from the "Evil" parser and with
> valgrind too. Does all the funky Unicode stuff too.
>
> Paolo
> /*
>  * An event-based, asynchronous JSON parser.
>  *
>  * Copyright (C) 2009 Red Hat Inc.
>  *
>  * Authors:
>  *  Paolo Bonzini
>  *
>  * Permission is hereby granted, free of charge, to any person obtaining a copy
>  * of this software and associated documentation files (the "Software"), to deal
>  * in the Software without restriction, including without limitation the rights
>  * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
>  * copies of the Software, and to permit persons to whom the Software is
>  * furnished to do so, subject to the following conditions:
>  *
>  * The above copyright notice and this permission notice shall be included in
>  * all copies or substantial portions of the Software.
>  *
>  * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>  * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>  * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
>  * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>  * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
>  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
>  * SOFTWARE.
>  */
>
>
> #include "json.h"
> #include
> #include
>
> /* Common character classes. */
>
> #define CASE_XDIGIT \
>     case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': \
>     case 'A': case 'B': case 'C': case 'D': case 'E': case 'F'
>
> #define CASE_DIGIT \
>     case '0': case '1': case '2': case '3': case '4': \
>     case '5': case '6': case '7': case '8': case '9'
>
> /* Helper function to go from \uXXXX-encoded UTF-16 to UTF-8.
>  */
>
> static bool hex_to_utf8 (char *buf, char **dest, char *src)
> {
>     int i, n;
>     uint8_t *p;
>
>     for (i = n = 0; i < 4; i++) {
>         n <<= 4;
>         switch (src[i])
>         {
>         CASE_DIGIT: n |= src[i] - '0'; break;
>         CASE_XDIGIT: n |= (src[i] & ~32) - 'A' + 10; break;
>         default: return false;
>         }
>     }
>
>     p = (uint8_t *)*dest;
>     if (n < 128) {
>         *p++ = n;
>     } else if (n < 2048) {
>         *p++ = 0xC0 | (n >> 6);
>         *p++ = 0x80 | (n & 63);
>     } else if (n < 0xDC00 || n > 0xDFFF) {
>         *p++ = 0xE0 | (n >> 12);
>         *p++ = 0x80 | ((n >> 6) & 63);
>         *p++ = 0x80 | (n & 63);
>     } else {
>         /* Merge with preceding high surrogate. */
>         if (p - (uint8_t *)buf < 3
>             || p[-3] != 0xED
>             || p[-2] < 0xA0 || p[-2] > 0xAF) /* 0xD800..0xDBFF */
>             return false;
>
>         n += 0x10000 - 0xDC00;
>         n |= ((p[-2] & 15) << 16) | ((p[-1] & 63) << 10);
>
>         /* Overwrite high surrogate. */
>         p[-3] = 0xF0 | (n >> 18);
>         p[-2] = 0x80 | ((n >> 12) & 63);
>         p[-1] = 0x80 | ((n >> 6) & 63);
>         *p++ = 0x80 | (n & 63);
>     }
>     *dest = (char *)p;
>     return true;
> }
>
> struct json_parser {
>     struct json_parser_config c;
>     size_t n, alloc;
>     char *buf;
>     size_t sp;
>     uint32_t state, stack[128];
>     char start_buffer[4];
> };
>

Having an explicit stack is unnecessary I think. You can use a very simple
scheme to detect the end of messages by simply counting {}, [], and being
aware of the lexical rules.

Regards,

Anthony Liguori
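
For illustration, the counting scheme mentioned above could look roughly like the sketch below. It is not taken from any posted patch: the names json_framer, json_framer_feed and json_message_end are hypothetical, and it assumes every message is a top-level object or array. It keeps only a nesting depth and two flags for string state, so braces and brackets inside string literals (including after an escaped quote) are not counted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical framing state; zero-initialize before the first byte. */
struct json_framer {
    int depth;       /* current nesting of {} and [] */
    bool in_string;  /* inside a "..." literal */
    bool escaped;    /* previous byte was a backslash inside a string */
};

/* Feed one byte. Returns true when a top-level object or array just closed. */
static bool json_framer_feed(struct json_framer *f, char ch)
{
    if (f->in_string) {
        if (f->escaped) {
            f->escaped = false;      /* escaped byte, e.g. \" or \\ */
        } else if (ch == '\\') {
            f->escaped = true;
        } else if (ch == '"') {
            f->in_string = false;
        }
        return false;
    }

    switch (ch) {
    case '"':
        f->in_string = true;
        break;
    case '{': case '[':
        f->depth++;
        break;
    case '}': case ']':
        /* Ignore stray closers at depth 0; a real parser reports them later. */
        if (f->depth > 0 && --f->depth == 0) {
            return true;
        }
        break;
    default:
        break;
    }
    return false;
}

/*
 * Scan buf and return the number of bytes up to and including the end of
 * the first complete message, or 0 if the message is still incomplete.
 * State is kept in *f, so the next chunk can continue where this one ended.
 */
static size_t json_message_end(struct json_framer *f,
                               const char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (json_framer_feed(f, buf[i])) {
            return i + 1;
        }
    }
    return 0;
}
```

This only finds message boundaries. It does not check that '}' matches '{' rather than '[', and it does not handle top-level scalars, so the framed message would still need to go through a real parser for validation.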