[Qemu-devel] [PATCH 0/4] Fix JSON string formatter

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH 0/4] Fix JSON string formatter
@ 2013-03-14 17:49 Markus Armbruster
  2013-03-17 19:55 ` Blue Swirl
  2013-03-23 14:44 ` Blue Swirl
  0 siblings, 2 replies; 13+ messages in thread
From: Markus Armbruster @ 2013-03-14 17:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel, aliguori

This should unbreak "make check" on machines where char is unsigned.
Blue, please give it a whirl.

The JSON parser is still as broken as ever.  Left for another day.

Markus Armbruster (4):
  unicode: New mod_utf8_codepoint()
  check-qjson: Fix up a few bogus comments
  check-qjson: Test noncharacters other than U+FFFE, U+FFFF in strings
  qjson: to_json() case QTYPE_QSTRING is buggy, rewrite

 include/qemu-common.h |   3 +
 qobject/qjson.c       | 102 ++++++++----------
 tests/check-qjson.c   | 280 +++++++++++++++++++++++++++++---------------------
 util/Makefile.objs    |   1 +
 util/unicode.c        |  96 +++++++++++++++++
 5 files changed, 306 insertions(+), 176 deletions(-)
 create mode 100644 util/unicode.c

-- 
1.7.11.7

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Fix JSON string formatter
  2013-03-14 17:49 Markus Armbruster
@ 2013-03-17 19:55 ` Blue Swirl
  2013-03-18  9:58   ` Markus Armbruster
  2013-03-23 14:44 ` Blue Swirl
  1 sibling, 1 reply; 13+ messages in thread
From: Blue Swirl @ 2013-03-17 19:55 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: aliguori, qemu-devel

On Thu, Mar 14, 2013 at 5:49 PM, Markus Armbruster <armbru@redhat.com> wrote:
> This should unbreak "make check" on machines where char is unsigned.
> Blue, please give it a whirl.

With the patches applied there are no errors, thanks.
Tested-by: Blue Swirl <blauwirbel@gmail.com>

Though test-coroutine seems to hang, maybe fallout from recent
coroutine changes.

>
> The JSON parser is still as broken as ever.  Left for another day.
>
> Markus Armbruster (4):
>   unicode: New mod_utf8_codepoint()
>   check-qjson: Fix up a few bogus comments
>   check-qjson: Test noncharacters other than U+FFFE, U+FFFF in strings
>   qjson: to_json() case QTYPE_QSTRING is buggy, rewrite
>
>  include/qemu-common.h |   3 +
>  qobject/qjson.c       | 102 ++++++++----------
>  tests/check-qjson.c   | 280 +++++++++++++++++++++++++++++---------------------
>  util/Makefile.objs    |   1 +
>  util/unicode.c        |  96 +++++++++++++++++
>  5 files changed, 306 insertions(+), 176 deletions(-)
>  create mode 100644 util/unicode.c
>
> --
> 1.7.11.7
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Fix JSON string formatter
  2013-03-17 19:55 ` Blue Swirl
@ 2013-03-18  9:58   ` Markus Armbruster
  0 siblings, 0 replies; 13+ messages in thread
From: Markus Armbruster @ 2013-03-18  9:58 UTC (permalink / raw)
  To: Blue Swirl; +Cc: aliguori, qemu-devel

Blue Swirl <blauwirbel@gmail.com> writes:

> On Thu, Mar 14, 2013 at 5:49 PM, Markus Armbruster <armbru@redhat.com> wrote:
>> This should unbreak "make check" on machines where char is unsigned.
>> Blue, please give it a whirl.
>
> With the patches applied there are no errors, thanks.
> Tested-by: Blue Swirl <blauwirbel@gmail.com>

Thanks!

> Though test-coroutine seems to hang, maybe fallout from recent
> coroutine changes.

I've seen rtc-test hang intermittently; haven't gotten around to digging
for roots.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Fix JSON string formatter
  2013-03-14 17:49 Markus Armbruster
  2013-03-17 19:55 ` Blue Swirl
@ 2013-03-23 14:44 ` Blue Swirl
  2013-04-11 16:12   ` Markus Armbruster
  1 sibling, 1 reply; 13+ messages in thread
From: Blue Swirl @ 2013-03-23 14:44 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: aliguori, qemu-devel

On Thu, Mar 14, 2013 at 5:49 PM, Markus Armbruster <armbru@redhat.com> wrote:
> This should unbreak "make check" on machines where char is unsigned.
> Blue, please give it a whirl.

Patches no longer apply, please rebase.

>
> The JSON parser is still as broken as ever.  Left for another day.
>
> Markus Armbruster (4):
>   unicode: New mod_utf8_codepoint()
>   check-qjson: Fix up a few bogus comments
>   check-qjson: Test noncharacters other than U+FFFE, U+FFFF in strings
>   qjson: to_json() case QTYPE_QSTRING is buggy, rewrite
>
>  include/qemu-common.h |   3 +
>  qobject/qjson.c       | 102 ++++++++----------
>  tests/check-qjson.c   | 280 +++++++++++++++++++++++++++++---------------------
>  util/Makefile.objs    |   1 +
>  util/unicode.c        |  96 +++++++++++++++++
>  5 files changed, 306 insertions(+), 176 deletions(-)
>  create mode 100644 util/unicode.c
>
> --
> 1.7.11.7
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 0/4] Fix JSON string formatter
@ 2013-04-11 16:07 Markus Armbruster
  2013-04-11 16:07 ` [Qemu-devel] [PATCH 1/4] unicode: New mod_utf8_codepoint() Markus Armbruster
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Markus Armbruster @ 2013-04-11 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel, aliguori, lersek

This should unbreak "make check" on machines where char is unsigned.
Blue, please give it a whirl.

The JSON parser is still as broken as ever.  Left for another day.

v2:
- Rebased, trivial conflicts in PATCH 1/4.
- Make mod_utf8_codepoint() treat empty input as invalid sequence of
  length zero (both when n==0 and when n>0 && *s==0).  No code in this
  series passes empty input.
- Some commit messages and comments improved.

Markus Armbruster (4):
  unicode: New mod_utf8_codepoint()
  check-qjson: Improve a few comments, delete bogus ones
  check-qjson: Test noncharacters other than U+FFFE, U+FFFF in strings
  qjson: to_json() case QTYPE_QSTRING is buggy, rewrite

 include/qemu-common.h |   3 +
 qobject/qjson.c       | 102 ++++++++---------
 tests/check-qjson.c   | 308 ++++++++++++++++++++++++++++++--------------------
 util/Makefile.objs    |   2 +-
 util/unicode.c        | 100 ++++++++++++++++
 5 files changed, 333 insertions(+), 182 deletions(-)
 create mode 100644 util/unicode.c

-- 
1.7.11.7

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 1/4] unicode: New mod_utf8_codepoint()
  2013-04-11 16:07 [Qemu-devel] [PATCH 0/4] Fix JSON string formatter Markus Armbruster
@ 2013-04-11 16:07 ` Markus Armbruster
  2013-04-11 16:07 ` [Qemu-devel] [PATCH 2/4] check-qjson: Improve a few comments, delete bogus ones Markus Armbruster
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Markus Armbruster @ 2013-04-11 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel, aliguori, lersek


Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 include/qemu-common.h |   3 ++
 util/Makefile.objs    |   2 +-
 util/unicode.c        | 100 ++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 104 insertions(+), 1 deletion(-)
 create mode 100644 util/unicode.c

diff --git a/include/qemu-common.h b/include/qemu-common.h
index 31fff22..3b1873e 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -442,6 +442,9 @@ int64_t pow2floor(int64_t value);
 int uleb128_encode_small(uint8_t *out, uint32_t n);
 int uleb128_decode_small(const uint8_t *in, uint32_t *n);
 
+/* unicode.c */
+int mod_utf8_codepoint(const char *s, size_t n, char **end);
+
 /*
  * Hexdump a buffer to a file. An optional string prefix is added to every line
  */
diff --git a/util/Makefile.objs b/util/Makefile.objs
index 557bda7..c5652f5 100644
--- a/util/Makefile.objs
+++ b/util/Makefile.objs
@@ -1,4 +1,4 @@
-util-obj-y = osdep.o cutils.o qemu-timer-common.o
+util-obj-y = osdep.o cutils.o unicode.o qemu-timer-common.o
 util-obj-$(CONFIG_WIN32) += oslib-win32.o qemu-thread-win32.o event_notifier-win32.o
 util-obj-$(CONFIG_POSIX) += oslib-posix.o qemu-thread-posix.o event_notifier-posix.o
 util-obj-y += envlist.o path.o host-utils.o cache-utils.o module.o
diff --git a/util/unicode.c b/util/unicode.c
new file mode 100644
index 0000000..d1c8658
--- /dev/null
+++ b/util/unicode.c
@@ -0,0 +1,100 @@
+/*
+ * Dealing with Unicode
+ *
+ * Copyright (C) 2013 Red Hat, Inc.
+ *
+ * Authors:
+ *  Markus Armbruster <armbru@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "qemu-common.h"
+
+/**
+ * mod_utf8_codepoint:
+ * @s: string encoded in modified UTF-8
+ * @n: maximum number of bytes to read from @s, if less than 6
+ * @end: set to end of sequence on return
+ *
+ * Convert the modified UTF-8 sequence at the start of @s.  Modified
+ * UTF-8 is exactly like UTF-8, except U+0000 is encoded as
+ * "\xC0\x80".
+ *
+ * If @n is zero or @s points to a zero byte, the sequence is invalid,
+ * and @end is set to @s.
+ *
+ * If @s points to an impossible byte (0xFE or 0xFF) or a continuation
+ * byte, the sequence is invalid, and @end is set to @s + 1
+ *
+ * Else, the first byte determines how many continuation bytes are
+ * expected.  If there are fewer, the sequence is invalid, and @end is
+ * set to @s + 1 + actual number of continuation bytes.  Else, the
+ * sequence is well-formed, and @end is set to @s + 1 + expected
+ * number of continuation bytes.
+ *
+ * A well-formed sequence is valid unless it encodes a codepoint
+ * outside the Unicode range U+0000..U+10FFFF, one of Unicode's 66
+ * noncharacters, a surrogate codepoint, or is overlong.  Except the
+ * overlong sequence "\xC0\x80" is valid.
+ *
+ * Conversion succeeds if and only if the sequence is valid.
+ *
+ * Returns: the Unicode codepoint on success, -1 on failure.
+ */
+int mod_utf8_codepoint(const char *s, size_t n, char **end)
+{
+    static int min_cp[5] = { 0x80, 0x800, 0x10000, 0x200000, 0x4000000 };
+    const unsigned char *p;
+    unsigned byte, mask, len, i;
+    int cp;
+
+    if (n == 0 || *s == 0) {
+        /* empty sequence */
+        *end = (char *)s;
+        return -1;
+    }
+
+    p = (const unsigned char *)s;
+    byte = *p++;
+    if (byte < 0x80) {
+        cp = byte;              /* one byte sequence */
+    } else if (byte >= 0xFE) {
+        cp = -1;                /* impossible bytes 0xFE, 0xFF */
+    } else if ((byte & 0x40) == 0) {
+        cp = -1;                /* unexpected continuation byte */
+    } else {
+        /* multi-byte sequence */
+        len = 0;
+        for (mask = 0x80; byte & mask; mask >>= 1) {
+            len++;
+        }
+        assert(len > 1 && len < 7);
+        cp = byte & (mask - 1);
+        for (i = 1; i < len; i++) {
+            byte = i < n ? *p : 0;
+            if ((byte & 0xC0) != 0x80) {
+                cp = -1;        /* continuation byte missing */
+                goto out;
+            }
+            p++;
+            cp <<= 6;
+            cp |= byte & 0x3F;
+        }
+        if (cp > 0x10FFFF) {
+            cp = -1;            /* beyond Unicode range */
+        } else if ((cp >= 0xFDD0 && cp <= 0xFDEF)
+                   || (cp & 0xFFFE) == 0xFFFE) {
+            cp = -1;            /* noncharacter */
+        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
+            cp = -1;            /* surrogate code point */
+        } else if (cp < min_cp[len - 2] && !(cp == 0 && len == 2)) {
+            cp = -1;            /* overlong, not \xC0\x80 */
+        }
+    }
+
+out:
+    *end = (char *)p;
+    return cp;
+}
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 2/4] check-qjson: Improve a few comments, delete bogus ones
  2013-04-11 16:07 [Qemu-devel] [PATCH 0/4] Fix JSON string formatter Markus Armbruster
  2013-04-11 16:07 ` [Qemu-devel] [PATCH 1/4] unicode: New mod_utf8_codepoint() Markus Armbruster
@ 2013-04-11 16:07 ` Markus Armbruster
  2013-04-11 16:07 ` [Qemu-devel] [PATCH 3/4] check-qjson: Test noncharacters other than U+FFFE, U+FFFF in strings Markus Armbruster
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Markus Armbruster @ 2013-04-11 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel, aliguori, lersek


Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 tests/check-qjson.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/tests/check-qjson.c b/tests/check-qjson.c
index ec85a0c..91b4e5d 100644
--- a/tests/check-qjson.c
+++ b/tests/check-qjson.c
@@ -4,7 +4,7 @@
  *
  * Authors:
  *  Anthony Liguori   <aliguori@us.ibm.com>
- *  Markus Armbruster <armbru@redhat.com>,
+ *  Markus Armbruster <armbru@redhat.com>
  *
  * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
  * See the COPYING.LIB file in the top-level directory.
@@ -285,31 +285,31 @@ static void utf8_string(void)
         },
         /* 2.3  Other boundary conditions */
         {
-            /* U+D7FF */
+            /* last one before surrogate range: U+D7FF */
             "\"\xED\x9F\xBF\"",
             "\xED\x9F\xBF",
             "\"\\uD7FF\"",
         },
         {
-            /* U+E000 */
+            /* first one after surrogate range: U+E000 */
             "\"\xEE\x80\x80\"",
             "\xEE\x80\x80",
             "\"\\uE000\"",
         },
         {
-            /* U+FFFD */
+            /* last one in BMP: U+FFFD */
             "\"\xEF\xBF\xBD\"",
             "\xEF\xBF\xBD",
             "\"\\uFFFD\"",
         },
         {
-            /* U+10FFFF */
+            /* last one in last plane: U+10FFFF */
             "\"\xF4\x8F\xBF\xBF\"",
             "\xF4\x8F\xBF\xBF",
             "\"\\u43FF\\uFFFF\"", /* bug: want "\"\\uDBFF\\uDFFF\"" */
         },
         {
-            /* U+110000 */
+            /* first one beyond Unicode range: U+110000 */
             "\"\xF4\x90\x80\x80\"",
             "\xF4\x90\x80\x80",
             "\"\\u4400\\uFFFF\"", /* bug: want "\"\\uFFFF\"" */
@@ -462,8 +462,7 @@ static void utf8_string(void)
         },
         /* 3.3.4  5-byte sequence with last byte missing (U+0000) */
         {
-            /* invalid */
-            "\"\xF8\x80\x80\x80\"", /* bug: not corrected */
+            "\"\xF8\x80\x80\x80\"",
             NULL,                   /* bug: rejected */
             "\"\\u8000\\uFFFF\"",   /* bug: want "\"\\uFFFF\"" */
             "\xF8\x80\x80\x80",
@@ -570,7 +569,12 @@ static void utf8_string(void)
             "\"\\uC000\\uFFFF\\uFFFF\\uFFFF\"", /* bug: want "\"/\"" */
             "\xFC\x80\x80\x80\x80\xAF",
         },
-        /* 4.2  Maximum overlong sequences */
+        /*
+         * 4.2  Maximum overlong sequences
+         * Highest Unicode value that is still resulting in an
+         * overlong sequence if represented with the given number of
+         * bytes.  This is a boundary test for safe UTF-8 decoders.
+         */
         {
             /* \U+007F */
             "\"\xC1\xBF\"",
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 3/4] check-qjson: Test noncharacters other than U+FFFE, U+FFFF in strings
  2013-04-11 16:07 [Qemu-devel] [PATCH 0/4] Fix JSON string formatter Markus Armbruster
  2013-04-11 16:07 ` [Qemu-devel] [PATCH 1/4] unicode: New mod_utf8_codepoint() Markus Armbruster
  2013-04-11 16:07 ` [Qemu-devel] [PATCH 2/4] check-qjson: Improve a few comments, delete bogus ones Markus Armbruster
@ 2013-04-11 16:07 ` Markus Armbruster
  2013-04-11 16:07 ` [Qemu-devel] [PATCH 4/4] qjson: to_json() case QTYPE_QSTRING is buggy, rewrite Markus Armbruster
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Markus Armbruster @ 2013-04-11 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel, aliguori, lersek

Test cases cover the two noncharacters in the BMP.  Add tests for the
other 64 noncharacters.

Three existing test cases involve noncharacters U+FFFF and U+10FFFF.
Instead of deleting them as now duplicates, adjust them to use U+FFFC
and U+10FFFFD.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 tests/check-qjson.c | 96 ++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 84 insertions(+), 12 deletions(-)

diff --git a/tests/check-qjson.c b/tests/check-qjson.c
index 91b4e5d..54074a9 100644
--- a/tests/check-qjson.c
+++ b/tests/check-qjson.c
@@ -158,7 +158,7 @@ static void utf8_string(void)
      * consider using overlong encoding \xC0\x80 for U+0000 ("modified
      * UTF-8").
      *
-     * Test cases are scraped from Markus Kuhn's UTF-8 decoder
+     * Most test cases are scraped from Markus Kuhn's UTF-8 decoder
      * capability and stress test at
      * http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt
      */
@@ -256,11 +256,19 @@ static void utf8_string(void)
             "\xDF\xBF",
             "\"\\u07FF\"",
         },
-        /* 2.2.3  3 bytes U+FFFF */
+        /*
+         * 2.2.3  3 bytes U+FFFC
+         * The last possible sequence is actually U+FFFF.  But that's
+         * a noncharacter, and already covered by its own test case
+         * under 5.3.  Same for U+FFFE.  U+FFFD is the last character
+         * in the BMP, and covered under 2.3.  Because of U+FFFD's
+         * special role as replacement character, it's worth testing
+         * U+FFFC here.
+         */
         {
-            "\"\xEF\xBF\xBF\"",
-            "\xEF\xBF\xBF",
-            "\"\\uFFFF\"",
+            "\"\xEF\xBF\xBC\"",
+            "\xEF\xBF\xBC",
+            "\"\\uFFFC\"",
         },
         /* 2.2.4  4 bytes U+1FFFFF */
         {
@@ -303,10 +311,10 @@ static void utf8_string(void)
             "\"\\uFFFD\"",
         },
         {
-            /* last one in last plane: U+10FFFF */
-            "\"\xF4\x8F\xBF\xBF\"",
-            "\xF4\x8F\xBF\xBF",
-            "\"\\u43FF\\uFFFF\"", /* bug: want "\"\\uDBFF\\uDFFF\"" */
+            /* last one in last plane: U+10FFFD */
+            "\"\xF4\x8F\xBF\xBD\"",
+            "\xF4\x8F\xBF\xBD",
+            "\"\\u43FF\\uFFFF\"", /* bug: want "\"\\uDBFF\\uDFFD\"" */
         },
         {
             /* first one beyond Unicode range: U+110000 */
@@ -589,9 +597,14 @@ static void utf8_string(void)
             "\"\\u07FF\"",
         },
         {
-            /* \U+FFFF */
-            "\"\xF0\x8F\xBF\xBF\"",
-            "\xF0\x8F\xBF\xBF",   /* bug: not corrected */
+            /*
+             * \U+FFFC
+             * The actual maximum would be U+FFFF, but that's a
+             * noncharacter.  Testing U+FFFC seems more useful.  See
+             * also 2.2.3
+             */
+            "\"\xF0\x8F\xBF\xBC\"",
+            "\xF0\x8F\xBF\xBC",   /* bug: not corrected */
             "\"\\u03FF\\uFFFF\"", /* bug: want "\"\\uFFFF\"" */
         },
         {
@@ -736,6 +749,7 @@ static void utf8_string(void)
             "\"\\uDBFF\\uDFFF\"", /* bug: want "\"\\uFFFF\\uFFFF\"" */
         },
         /* 5.3  Other illegal code positions */
+        /* BMP noncharacters */
         {
             /* \U+FFFE */
             "\"\xEF\xBF\xBE\"",
@@ -748,6 +762,64 @@ static void utf8_string(void)
             "\xEF\xBF\xBF",     /* bug: not corrected */
             "\"\\uFFFF\"",      /* bug: not corrected */
         },
+        {
+            /* U+FDD0 */
+            "\"\xEF\xB7\x90\"",
+            "\xEF\xB7\x90",     /* bug: not corrected */
+            "\"\\uFDD0\"",      /* bug: not corrected */
+        },
+        {
+            /* U+FDEF */
+            "\"\xEF\xB7\xAF\"",
+            "\xEF\xB7\xAF",     /* bug: not corrected */
+            "\"\\uFDEF\"",      /* bug: not corrected */
+        },
+        /* Plane 1 .. 16 noncharacters */
+        {
+            /* U+1FFFE U+1FFFF U+2FFFE U+2FFFF ... U+10FFFE U+10FFFF */
+            "\"\xF0\x9F\xBF\xBE\xF0\x9F\xBF\xBF"
+            "\xF0\xAF\xBF\xBE\xF0\xAF\xBF\xBF"
+            "\xF0\xBF\xBF\xBE\xF0\xBF\xBF\xBF"
+            "\xF1\x8F\xBF\xBE\xF1\x8F\xBF\xBF"
+            "\xF1\x9F\xBF\xBE\xF1\x9F\xBF\xBF"
+            "\xF1\xAF\xBF\xBE\xF1\xAF\xBF\xBF"
+            "\xF1\xBF\xBF\xBE\xF1\xBF\xBF\xBF"
+            "\xF2\x8F\xBF\xBE\xF2\x8F\xBF\xBF"
+            "\xF2\x9F\xBF\xBE\xF2\x9F\xBF\xBF"
+            "\xF2\xAF\xBF\xBE\xF2\xAF\xBF\xBF"
+            "\xF2\xBF\xBF\xBE\xF2\xBF\xBF\xBF"
+            "\xF3\x8F\xBF\xBE\xF3\x8F\xBF\xBF"
+            "\xF3\x9F\xBF\xBE\xF3\x9F\xBF\xBF"
+            "\xF3\xAF\xBF\xBE\xF3\xAF\xBF\xBF"
+            "\xF3\xBF\xBF\xBE\xF3\xBF\xBF\xBF"
+            "\xF4\x8F\xBF\xBE\xF4\x8F\xBF\xBF\"",
+            /* bug: not corrected */
+            "\xF0\x9F\xBF\xBE\xF0\x9F\xBF\xBF"
+            "\xF0\xAF\xBF\xBE\xF0\xAF\xBF\xBF"
+            "\xF0\xBF\xBF\xBE\xF0\xBF\xBF\xBF"
+            "\xF1\x8F\xBF\xBE\xF1\x8F\xBF\xBF"
+            "\xF1\x9F\xBF\xBE\xF1\x9F\xBF\xBF"
+            "\xF1\xAF\xBF\xBE\xF1\xAF\xBF\xBF"
+            "\xF1\xBF\xBF\xBE\xF1\xBF\xBF\xBF"
+            "\xF2\x8F\xBF\xBE\xF2\x8F\xBF\xBF"
+            "\xF2\x9F\xBF\xBE\xF2\x9F\xBF\xBF"
+            "\xF2\xAF\xBF\xBE\xF2\xAF\xBF\xBF"
+            "\xF2\xBF\xBF\xBE\xF2\xBF\xBF\xBF"
+            "\xF3\x8F\xBF\xBE\xF3\x8F\xBF\xBF"
+            "\xF3\x9F\xBF\xBE\xF3\x9F\xBF\xBF"
+            "\xF3\xAF\xBF\xBE\xF3\xAF\xBF\xBF"
+            "\xF3\xBF\xBF\xBE\xF3\xBF\xBF\xBF"
+            "\xF4\x8F\xBF\xBE\xF4\x8F\xBF\xBF",
+            /* bug: not corrected */
+            "\"\\u07FF\\uFFFF\\u07FF\\uFFFF\\u0BFF\\uFFFF\\u0BFF\\uFFFF"
+            "\\u0FFF\\uFFFF\\u0FFF\\uFFFF\\u13FF\\uFFFF\\u13FF\\uFFFF"
+            "\\u17FF\\uFFFF\\u17FF\\uFFFF\\u1BFF\\uFFFF\\u1BFF\\uFFFF"
+            "\\u1FFF\\uFFFF\\u1FFF\\uFFFF\\u23FF\\uFFFF\\u23FF\\uFFFF"
+            "\\u27FF\\uFFFF\\u27FF\\uFFFF\\u2BFF\\uFFFF\\u2BFF\\uFFFF"
+            "\\u2FFF\\uFFFF\\u2FFF\\uFFFF\\u33FF\\uFFFF\\u33FF\\uFFFF"
+            "\\u37FF\\uFFFF\\u37FF\\uFFFF\\u3BFF\\uFFFF\\u3BFF\\uFFFF"
+            "\\u3FFF\\uFFFF\\u3FFF\\uFFFF\\u43FF\\uFFFF\\u43FF\\uFFFF\"",
+        },
         {}
     };
     int i;
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 4/4] qjson: to_json() case QTYPE_QSTRING is buggy, rewrite
  2013-04-11 16:07 [Qemu-devel] [PATCH 0/4] Fix JSON string formatter Markus Armbruster
                   ` (2 preceding siblings ...)
  2013-04-11 16:07 ` [Qemu-devel] [PATCH 3/4] check-qjson: Test noncharacters other than U+FFFE, U+FFFF in strings Markus Armbruster
@ 2013-04-11 16:07 ` Markus Armbruster
  2013-04-11 16:11 ` [Qemu-devel] [PATCH 0/4] Fix JSON string formatter Markus Armbruster
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Markus Armbruster @ 2013-04-11 16:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel, aliguori, lersek

Known bugs in to_json():

* A start byte for a three-byte sequence followed by less than two
  continuation bytes is split into one-byte sequences.

* Start bytes for sequences longer than three bytes get misinterpreted
  as start bytes for three-byte sequences.  Continuation bytes beyond
  byte three become one-byte sequences.

  This means all characters outside the BMP are decoded incorrectly.

* One-byte sequences with the MSB are put into the JSON string
  verbatim when char is unsigned, producing invalid UTF-8.  When char
  is signed, they're replaced by "\\uFFFF" instead.

  This includes \xFE, \xFF, and stray continuation bytes.

* Overlong sequences are happily accepted, unless screwed up by the
  bugs above.

* Likewise, sequences encoding surrogate code points or noncharacters.

* Unlike other control characters, ASCII DEL is not escaped.  Except
  in overlong encodings.

My rewrite fixes them as follows:

* Malformed UTF-8 sequences are replaced.

  Except the overlong encoding \xC0\x80 of U+0000 is still accepted.
  Permits embedding NUL characters in C strings.  This trick is known
  as "Modified UTF-8".

* Sequences encoding code points beyond Unicode range are replaced.

* Sequences encoding code points beyond the BMP produce a surrogate
  pair.

* Sequences encoding surrogate code points are replaced.

* Sequences encoding noncharacters are replaced.

* ASCII DEL is now always escaped.

The replacement character is U+FFFD.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 qobject/qjson.c     | 102 +++++++++++--------------
 tests/check-qjson.c | 216 ++++++++++++++++++++++++----------------------------
 2 files changed, 145 insertions(+), 173 deletions(-)

diff --git a/qobject/qjson.c b/qobject/qjson.c
index 83a6b4f..19085a1 100644
--- a/qobject/qjson.c
+++ b/qobject/qjson.c
@@ -136,68 +136,56 @@ static void to_json(const QObject *obj, QString *str, int pretty, int indent)
     case QTYPE_QSTRING: {
         QString *val = qobject_to_qstring(obj);
         const char *ptr;
+        int cp;
+        char buf[16];
+        char *end;
 
         ptr = qstring_get_str(val);
         qstring_append(str, "\"");
-        while (*ptr) {
-            if ((ptr[0] & 0xE0) == 0xE0 &&
-                (ptr[1] & 0x80) && (ptr[2] & 0x80)) {
-                uint16_t wchar;
-                char escape[7];
-
-                wchar  = (ptr[0] & 0x0F) << 12;
-                wchar |= (ptr[1] & 0x3F) << 6;
-                wchar |= (ptr[2] & 0x3F);
-                ptr += 2;
-
-                snprintf(escape, sizeof(escape), "\\u%04X", wchar);
-                qstring_append(str, escape);
-            } else if ((ptr[0] & 0xE0) == 0xC0 && (ptr[1] & 0x80)) {
-                uint16_t wchar;
-                char escape[7];
-
-                wchar  = (ptr[0] & 0x1F) << 6;
-                wchar |= (ptr[1] & 0x3F);
-                ptr++;
-
-                snprintf(escape, sizeof(escape), "\\u%04X", wchar);
-                qstring_append(str, escape);
-            } else switch (ptr[0]) {
-                case '\"':
-                    qstring_append(str, "\\\"");
-                    break;
-                case '\\':
-                    qstring_append(str, "\\\\");
-                    break;
-                case '\b':
-                    qstring_append(str, "\\b");
-                    break;
-                case '\f':
-                    qstring_append(str, "\\f");
-                    break;
-                case '\n':
-                    qstring_append(str, "\\n");
-                    break;
-                case '\r':
-                    qstring_append(str, "\\r");
-                    break;
-                case '\t':
-                    qstring_append(str, "\\t");
-                    break;
-                default: {
-                    if (ptr[0] <= 0x1F) {
-                        char escape[7];
-                        snprintf(escape, sizeof(escape), "\\u%04X", ptr[0]);
-                        qstring_append(str, escape);
-                    } else {
-                        char buf[2] = { ptr[0], 0 };
-                        qstring_append(str, buf);
-                    }
-                    break;
+
+        for (; *ptr; ptr = end) {
+            cp = mod_utf8_codepoint(ptr, 6, &end);
+            switch (cp) {
+            case '\"':
+                qstring_append(str, "\\\"");
+                break;
+            case '\\':
+                qstring_append(str, "\\\\");
+                break;
+            case '\b':
+                qstring_append(str, "\\b");
+                break;
+            case '\f':
+                qstring_append(str, "\\f");
+                break;
+            case '\n':
+                qstring_append(str, "\\n");
+                break;
+            case '\r':
+                qstring_append(str, "\\r");
+                break;
+            case '\t':
+                qstring_append(str, "\\t");
+                break;
+            default:
+                if (cp < 0) {
+                    cp = 0xFFFD; /* replacement character */
                 }
+                if (cp > 0xFFFF) {
+                    /* beyond BMP; need a surrogate pair */
+                    snprintf(buf, sizeof(buf), "\\u%04X\\u%04X",
+                             0xD800 + ((cp - 0x10000) >> 10),
+                             0xDC00 + ((cp - 0x10000) & 0x3FF));
+                } else if (cp < 0x20 || cp >= 0x7F) {
+                    snprintf(buf, sizeof(buf), "\\u%04X", cp);
+                } else {
+                    buf[0] = cp;
+                    buf[1] = 0;
                 }
-            ptr++;
-        }
+                qstring_append(str, buf);
+            }
+        };
+
         qstring_append(str, "\"");
         break;
     }
diff --git a/tests/check-qjson.c b/tests/check-qjson.c
index 54074a9..4e74548 100644
--- a/tests/check-qjson.c
+++ b/tests/check-qjson.c
@@ -144,13 +144,10 @@ static void utf8_string(void)
      * The JSON parser rejects some invalid sequences, but accepts
      * others without correcting the problem.
      *
-     * The JSON formatter replaces some invalid sequences by U+FFFF (a
-     * noncharacter), and goes wonky for others.
-     *
-     * For both directions, we should either reject all invalid
-     * sequences, or minimize overlong sequences and replace all other
-     * invalid sequences by a suitable replacement character.  A
-     * common choice for replacement is U+FFFD.
+     * We should either reject all invalid sequences, or minimize
+     * overlong sequences and replace all other invalid sequences by a
+     * suitable replacement character.  A common choice for
+     * replacement is U+FFFD.
      *
      * Problem: we can't easily deal with embedded U+0000.  Parsing
      * the JSON string "this \\u0000" is fun" yields "this \0 is fun",
@@ -175,16 +172,10 @@ static void utf8_string(void)
          * - bug: rejected
          *   JSON parser rejects invalid sequence(s)
          *   We may choose to define this as feature
-         * - bug: want "\"...\""
-         *   JSON formatter produces incorrect result, this is the
-         *   correct one, assuming replacement character U+FFFF
-         * - bug: want "..." (no \")
+         * - bug: want "..."
          *   JSON parser produces incorrect result, this is the
          *   correct one, assuming replacement character U+FFFF
          *   We may choose to reject instead of replace
-         * Not marked explicitly, but trivial to find:
-         * - JSON formatter replacing invalid sequence by \\uFFFF is a
-         *   bug if we want it to fail for invalid sequences.
          */
 
         /* 1  Some correct UTF-8 text */
@@ -209,7 +200,8 @@ static void utf8_string(void)
         {
             "\"\\u0000\"",
             "",                 /* bug: want overlong "\xC0\x80" */
-            "\"\"",             /* bug: want "\"\\u0000\"" */
+            "\"\\u0000\"",
+            "\xC0\x80",
         },
         /* 2.1.2  2 bytes U+0080 */
         {
@@ -227,20 +219,20 @@ static void utf8_string(void)
         {
             "\"\xF0\x90\x80\x80\"",
             "\xF0\x90\x80\x80",
-            "\"\\u0400\\uFFFF\"", /* bug: want "\"\\uD800\\uDC00\"" */
+            "\"\\uD800\\uDC00\"",
         },
         /* 2.1.5  5 bytes U+200000 */
         {
             "\"\xF8\x88\x80\x80\x80\"",
-            NULL,                        /* bug: rejected */
-            "\"\\u8200\\uFFFF\\uFFFF\"", /* bug: want "\"\\uFFFF\"" */
+            NULL,               /* bug: rejected */
+            "\"\\uFFFD\"",
             "\xF8\x88\x80\x80\x80",
         },
         /* 2.1.6  6 bytes U+4000000 */
         {
             "\"\xFC\x84\x80\x80\x80\x80\"",
-            NULL,                               /* bug: rejected */
-            "\"\\uC100\\uFFFF\\uFFFF\\uFFFF\"", /* bug: want "\"\\uFFFF\"" */
+            NULL,               /* bug: rejected */
+            "\"\\uFFFD\"",
             "\xFC\x84\x80\x80\x80\x80",
         },
         /* 2.2  Last possible sequence of a certain length */
@@ -248,7 +240,7 @@ static void utf8_string(void)
         {
             "\"\x7F\"",
             "\x7F",
-            "\"\177\"",
+            "\"\\u007F\"",
         },
         /* 2.2.2  2 bytes U+07FF */
         {
@@ -273,22 +265,22 @@ static void utf8_string(void)
         /* 2.2.4  4 bytes U+1FFFFF */
         {
             "\"\xF7\xBF\xBF\xBF\"",
-            NULL,                 /* bug: rejected */
-            "\"\\u7FFF\\uFFFF\"", /* bug: want "\"\\uFFFF\"" */
+            NULL,               /* bug: rejected */
+            "\"\\uFFFD\"",
             "\xF7\xBF\xBF\xBF",
         },
         /* 2.2.5  5 bytes U+3FFFFFF */
         {
             "\"\xFB\xBF\xBF\xBF\xBF\"",
-            NULL,                        /* bug: rejected */
-            "\"\\uBFFF\\uFFFF\\uFFFF\"", /* bug: want "\"\\uFFFF\"" */
+            NULL,               /* bug: rejected */
+            "\"\\uFFFD\"",
             "\xFB\xBF\xBF\xBF\xBF",
         },
         /* 2.2.6  6 bytes U+7FFFFFFF */
         {
             "\"\xFD\xBF\xBF\xBF\xBF\xBF\"",
-            NULL,                               /* bug: rejected */
-            "\"\\uDFFF\\uFFFF\\uFFFF\\uFFFF\"", /* bug: want "\"\\uFFFF\"" */
+            NULL,               /* bug: rejected */
+            "\"\\uFFFD\"",
             "\xFD\xBF\xBF\xBF\xBF\xBF",
         },
         /* 2.3  Other boundary conditions */
@@ -314,13 +306,13 @@ static void utf8_string(void)
             /* last one in last plane: U+10FFFD */
             "\"\xF4\x8F\xBF\xBD\"",
             "\xF4\x8F\xBF\xBD",
-            "\"\\u43FF\\uFFFF\"", /* bug: want "\"\\uDBFF\\uDFFD\"" */
+            "\"\\uDBFF\\uDFFD\""
         },
         {
             /* first one beyond Unicode range: U+110000 */
             "\"\xF4\x90\x80\x80\"",
             "\xF4\x90\x80\x80",
-            "\"\\u4400\\uFFFF\"", /* bug: want "\"\\uFFFF\"" */
+            "\"\\uFFFD\"",
         },
         /* 3  Malformed sequences */
         /* 3.1  Unexpected continuation bytes */
@@ -328,49 +320,49 @@ static void utf8_string(void)
         {
             "\"\x80\"",
             "\x80",             /* bug: not corrected */
-            "\"\\uFFFF\"",
+            "\"\\uFFFD\"",
         },
         /* 3.1.2  Last continuation byte */
         {
             "\"\xBF\"",
             "\xBF",             /* bug: not corrected */
-            "\"\\uFFFF\"",
+            "\"\\uFFFD\"",
         },
         /* 3.1.3  2 continuation bytes */
         {
             "\"\x80\xBF\"",
             "\x80\xBF",         /* bug: not corrected */
-            "\"\\uFFFF\\uFFFF\"",
+            "\"\\uFFFD\\uFFFD\"",
         },
         /* 3.1.4  3 continuation bytes */
         {
             "\"\x80\xBF\x80\"",
             "\x80\xBF\x80",     /* bug: not corrected */
-            "\"\\uFFFF\\uFFFF\\uFFFF\"",
+            "\"\\uFFFD\\uFFFD\\uFFFD\"",
         },
         /* 3.1.5  4 continuation bytes */
         {
             "\"\x80\xBF\x80\xBF\"",
             "\x80\xBF\x80\xBF", /* bug: not corrected */
-            "\"\\uFFFF\\uFFFF\\uFFFF\\uFFFF\"",
+            "\"\\uFFFD\\uFFFD\\uFFFD\\uFFFD\"",
         },
         /* 3.1.6  5 continuation bytes */
         {
             "\"\x80\xBF\x80\xBF\x80\"",
             "\x80\xBF\x80\xBF\x80", /* bug: not corrected */
-            "\"\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\"",
+            "\"\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\"",
         },
         /* 3.1.7  6 continuation bytes */
         {
             "\"\x80\xBF\x80\xBF\x80\xBF\"",
             "\x80\xBF\x80\xBF\x80\xBF", /* bug: not corrected */
-            "\"\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\"",
+            "\"\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\"",
         },
         /* 3.1.8  7 continuation bytes */
         {
             "\"\x80\xBF\x80\xBF\x80\xBF\x80\"",
             "\x80\xBF\x80\xBF\x80\xBF\x80", /* bug: not corrected */
-            "\"\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\"",
+            "\"\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\"",
         },
         /* 3.1.9  Sequence of all 64 possible continuation bytes */
         {
@@ -391,14 +383,14 @@ static void utf8_string(void)
             "\xA8\xA9\xAA\xAB\xAC\xAD\xAE\xAF"
             "\xB0\xB1\xB2\xB3\xB4\xB5\xB6\xB7"
             "\xB8\xB9\xBA\xBB\xBC\xBD\xBE\xBF",
-            "\"\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF"
-            "\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF"
-            "\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF"
-            "\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF"
-            "\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF"
-            "\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF"
-            "\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF"
-            "\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\""
+            "\"\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD"
+            "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD"
+            "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD"
+            "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD"
+            "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD"
+            "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD"
+            "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD"
+            "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\""
         },
         /* 3.2  Lonely start characters */
         /* 3.2.1  All 32 first bytes of 2-byte sequences, followed by space */
@@ -408,10 +400,10 @@ static void utf8_string(void)
             "\xD0 \xD1 \xD2 \xD3 \xD4 \xD5 \xD6 \xD7 "
             "\xD8 \xD9 \xDA \xDB \xDC \xDD \xDE \xDF \"",
             NULL,               /* bug: rejected */
-            "\"\\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF "
-            "\\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF "
-            "\\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF "
-            "\\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \"",
+            "\"\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD "
+            "\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD "
+            "\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD "
+            "\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \"",
             "\xC0 \xC1 \xC2 \xC3 \xC4 \xC5 \xC6 \xC7 "
             "\xC8 \xC9 \xCA \xCB \xCC \xCD \xCE \xCF "
             "\xD0 \xD1 \xD2 \xD3 \xD4 \xD5 \xD6 \xD7 "
@@ -424,28 +416,28 @@ static void utf8_string(void)
             /* bug: not corrected */
             "\xE0 \xE1 \xE2 \xE3 \xE4 \xE5 \xE6 \xE7 "
             "\xE8 \xE9 \xEA \xEB \xEC \xED \xEE \xEF ",
-            "\"\\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF "
-            "\\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \"",
+            "\"\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD "
+            "\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \"",
         },
         /* 3.2.3  All 8 first bytes of 4-byte sequences, followed by space */
         {
             "\"\xF0 \xF1 \xF2 \xF3 \xF4 \xF5 \xF6 \xF7 \"",
             NULL,               /* bug: rejected */
-            "\"\\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \\uFFFF \"",
+            "\"\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \"",
             "\xF0 \xF1 \xF2 \xF3 \xF4 \xF5 \xF6 \xF7 ",
         },
         /* 3.2.4  All 4 first bytes of 5-byte sequences, followed by space */
         {
             "\"\xF8 \xF9 \xFA \xFB \"",
             NULL,               /* bug: rejected */
-            "\"\\uFFFF \\uFFFF \\uFFFF \\uFFFF \"",
+            "\"\\uFFFD \\uFFFD \\uFFFD \\uFFFD \"",
             "\xF8 \xF9 \xFA \xFB ",
         },
         /* 3.2.5  All 2 first bytes of 6-byte sequences, followed by space */
         {
             "\"\xFC \xFD \"",
             NULL,               /* bug: rejected */
-            "\"\\uFFFF \\uFFFF \"",
+            "\"\\uFFFD \\uFFFD \"",
             "\xFC \xFD ",
         },
         /* 3.3  Sequences with last continuation byte missing */
@@ -453,66 +445,66 @@ static void utf8_string(void)
         {
             "\"\xC0\"",
             NULL,               /* bug: rejected */
-            "\"\\uFFFF\"",
+            "\"\\uFFFD\"",
             "\xC0",
         },
         /* 3.3.2  3-byte sequence with last byte missing (U+0000) */
         {
             "\"\xE0\x80\"",
             "\xE0\x80",           /* bug: not corrected */
-            "\"\\uFFFF\\uFFFF\"", /* bug: want "\"\\uFFFF\"" */
+            "\"\\uFFFD\"",
         },
         /* 3.3.3  4-byte sequence with last byte missing (U+0000) */
         {
             "\"\xF0\x80\x80\"",
             "\xF0\x80\x80",     /* bug: not corrected */
-            "\"\\u0000\"",      /* bug: want "\"\\uFFFF\"" */
+            "\"\\uFFFD\"",
         },
         /* 3.3.4  5-byte sequence with last byte missing (U+0000) */
         {
             "\"\xF8\x80\x80\x80\"",
             NULL,                   /* bug: rejected */
-            "\"\\u8000\\uFFFF\"",   /* bug: want "\"\\uFFFF\"" */
+            "\"\\uFFFD\"",
             "\xF8\x80\x80\x80",
         },
         /* 3.3.5  6-byte sequence with last byte missing (U+0000) */
         {
             "\"\xFC\x80\x80\x80\x80\"",
             NULL,                        /* bug: rejected */
-            "\"\\uC000\\uFFFF\\uFFFF\"", /* bug: want "\"\\uFFFF\"" */
+            "\"\\uFFFD\"",
             "\xFC\x80\x80\x80\x80",
         },
         /* 3.3.6  2-byte sequence with last byte missing (U+07FF) */
         {
             "\"\xDF\"",
             "\xDF",             /* bug: not corrected */
-            "\"\\uFFFF\"",
+            "\"\\uFFFD\"",
         },
         /* 3.3.7  3-byte sequence with last byte missing (U+FFFF) */
         {
             "\"\xEF\xBF\"",
             "\xEF\xBF",           /* bug: not corrected */
-            "\"\\uFFFF\\uFFFF\"", /* bug: want "\"\\uFFFF\"" */
+            "\"\\uFFFD\"",
         },
         /* 3.3.8  4-byte sequence with last byte missing (U+1FFFFF) */
         {
             "\"\xF7\xBF\xBF\"",
             NULL,               /* bug: rejected */
-            "\"\\u7FFF\"",      /* bug: want "\"\\uFFFF\"" */
+            "\"\\uFFFD\"",
             "\xF7\xBF\xBF",
         },
         /* 3.3.9  5-byte sequence with last byte missing (U+3FFFFFF) */
         {
             "\"\xFB\xBF\xBF\xBF\"",
             NULL,                 /* bug: rejected */
-            "\"\\uBFFF\\uFFFF\"", /* bug: want "\"\\uFFFF\"" */
+            "\"\\uFFFD\"",
             "\xFB\xBF\xBF\xBF",
         },
         /* 3.3.10  6-byte sequence with last byte missing (U+7FFFFFFF) */
         {
             "\"\xFD\xBF\xBF\xBF\xBF\"",
             NULL,                        /* bug: rejected */
-            "\"\\uDFFF\\uFFFF\\uFFFF\"", /* bug: want "\"\\uFFFF\"", */
+            "\"\\uFFFD\"",
             "\xFD\xBF\xBF\xBF\xBF",
         },
         /* 3.4  Concatenation of incomplete sequences */
@@ -520,10 +512,8 @@ static void utf8_string(void)
             "\"\xC0\xE0\x80\xF0\x80\x80\xF8\x80\x80\x80\xFC\x80\x80\x80\x80"
             "\xDF\xEF\xBF\xF7\xBF\xBF\xFB\xBF\xBF\xBF\xFD\xBF\xBF\xBF\xBF\"",
             NULL,               /* bug: rejected */
-            /* bug: want "\"\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF"
-               "\\uFFFF\\uFFFF\\uFFFF\\uFFFF\\uFFFF\"" */
-            "\"\\u0020\\uFFFF\\u0000\\u8000\\uFFFF\\uC000\\uFFFF\\uFFFF"
-            "\\u07EF\\uFFFF\\u7FFF\\uBFFF\\uFFFF\\uDFFF\\uFFFF\\uFFFF\"",
+            "\"\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD"
+            "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\"",
             "\xC0\xE0\x80\xF0\x80\x80\xF8\x80\x80\x80\xFC\x80\x80\x80\x80"
             "\xDF\xEF\xBF\xF7\xBF\xBF\xFB\xBF\xBF\xBF\xFD\xBF\xBF\xBF\xBF",
         },
@@ -531,20 +521,19 @@ static void utf8_string(void)
         {
             "\"\xFE\"",
             NULL,               /* bug: rejected */
-            "\"\\uFFFF\"",
+            "\"\\uFFFD\"",
             "\xFE",
         },
         {
             "\"\xFF\"",
             NULL,               /* bug: rejected */
-            "\"\\uFFFF\"",
+            "\"\\uFFFD\"",
             "\xFF",
         },
         {
             "\"\xFE\xFE\xFF\xFF\"",
             NULL,                 /* bug: rejected */
-            /* bug: want "\"\\uFFFF\\uFFFF\\uFFFF\\uFFFF\"" */
-            "\"\\uEFBF\\uFFFF\"",
+            "\"\\uFFFD\\uFFFD\\uFFFD\\uFFFD\"",
             "\xFE\xFE\xFF\xFF",
         },
         /* 4  Overlong sequences */
@@ -552,29 +541,29 @@ static void utf8_string(void)
         {
             "\"\xC0\xAF\"",
             NULL,               /* bug: rejected */
-            "\"\\u002F\"",      /* bug: want "\"/\"" */
+            "\"\\uFFFD\"",
             "\xC0\xAF",
         },
         {
             "\"\xE0\x80\xAF\"",
             "\xE0\x80\xAF",     /* bug: not corrected */
-            "\"\\u002F\"",      /* bug: want "\"/\"" */
+            "\"\\uFFFD\"",
         },
         {
             "\"\xF0\x80\x80\xAF\"",
             "\xF0\x80\x80\xAF",  /* bug: not corrected */
-            "\"\\u0000\\uFFFF\"" /* bug: want "\"/\"" */
+            "\"\\uFFFD\"",
         },
         {
             "\"\xF8\x80\x80\x80\xAF\"",
             NULL,                        /* bug: rejected */
-            "\"\\u8000\\uFFFF\\uFFFF\"", /* bug: want "\"/\"" */
+            "\"\\uFFFD\"",
             "\xF8\x80\x80\x80\xAF",
         },
         {
             "\"\xFC\x80\x80\x80\x80\xAF\"",
             NULL,                               /* bug: rejected */
-            "\"\\uC000\\uFFFF\\uFFFF\\uFFFF\"", /* bug: want "\"/\"" */
+            "\"\\uFFFD\"",
             "\xFC\x80\x80\x80\x80\xAF",
         },
         /*
@@ -587,14 +576,14 @@ static void utf8_string(void)
             /* \U+007F */
             "\"\xC1\xBF\"",
             NULL,               /* bug: rejected */
-            "\"\\u007F\"",      /* bug: want "\"\177\"" */
+            "\"\\uFFFD\"",
             "\xC1\xBF",
         },
         {
             /* \U+07FF */
             "\"\xE0\x9F\xBF\"",
             "\xE0\x9F\xBF",     /* bug: not corrected */
-            "\"\\u07FF\"",
+            "\"\\uFFFD\"",
         },
         {
             /*
@@ -605,20 +594,20 @@ static void utf8_string(void)
              */
             "\"\xF0\x8F\xBF\xBC\"",
             "\xF0\x8F\xBF\xBC",   /* bug: not corrected */
-            "\"\\u03FF\\uFFFF\"", /* bug: want "\"\\uFFFF\"" */
+            "\"\\uFFFD\"",
         },
         {
             /* \U+1FFFFF */
             "\"\xF8\x87\xBF\xBF\xBF\"",
             NULL,                        /* bug: rejected */
-            "\"\\u81FF\\uFFFF\\uFFFF\"", /* bug: want "\"\\uFFFF\"" */
+            "\"\\uFFFD\"",
             "\xF8\x87\xBF\xBF\xBF",
         },
         {
             /* \U+3FFFFFF */
             "\"\xFC\x83\xBF\xBF\xBF\xBF\"",
             NULL,                               /* bug: rejected */
-            "\"\\uC0FF\\uFFFF\\uFFFF\\uFFFF\"", /* bug: want "\"\\uFFFF\"" */
+            "\"\\uFFFD\"",
             "\xFC\x83\xBF\xBF\xBF\xBF",
         },
         /* 4.3  Overlong representation of the NUL character */
@@ -633,26 +622,26 @@ static void utf8_string(void)
             /* \U+0000 */
             "\"\xE0\x80\x80\"",
             "\xE0\x80\x80",     /* bug: not corrected */
-            "\"\\u0000\"",
+            "\"\\uFFFD\"",
         },
         {
             /* \U+0000 */
             "\"\xF0\x80\x80\x80\"",
             "\xF0\x80\x80\x80",   /* bug: not corrected */
-            "\"\\u0000\\uFFFF\"", /* bug: want "\"\\u0000\"" */
+            "\"\\uFFFD\"",
         },
         {
             /* \U+0000 */
             "\"\xF8\x80\x80\x80\x80\"",
             NULL,                        /* bug: rejected */
-            "\"\\u8000\\uFFFF\\uFFFF\"", /* bug: want "\"\\u0000\"" */
+            "\"\\uFFFD\"",
             "\xF8\x80\x80\x80\x80",
         },
         {
             /* \U+0000 */
             "\"\xFC\x80\x80\x80\x80\x80\"",
             NULL,                               /* bug: rejected */
-            "\"\\uC000\\uFFFF\\uFFFF\\uFFFF\"", /* bug: want "\"\\u0000\"" */
+            "\"\\uFFFD\"",
             "\xFC\x80\x80\x80\x80\x80",
         },
         /* 5  Illegal code positions */
@@ -661,92 +650,92 @@ static void utf8_string(void)
             /* \U+D800 */
             "\"\xED\xA0\x80\"",
             "\xED\xA0\x80",     /* bug: not corrected */
-            "\"\\uD800\"",      /* bug: want "\"\\uFFFF\"" */
+            "\"\\uFFFD\"",
         },
         {
             /* \U+DB7F */
             "\"\xED\xAD\xBF\"",
             "\xED\xAD\xBF",     /* bug: not corrected */
-            "\"\\uDB7F\"",      /* bug: want "\"\\uFFFF\"" */
+            "\"\\uFFFD\"",
         },
         {
             /* \U+DB80 */
             "\"\xED\xAE\x80\"",
             "\xED\xAE\x80",     /* bug: not corrected */
-            "\"\\uDB80\"",      /* bug: want "\"\\uFFFF\"" */
+            "\"\\uFFFD\"",
         },
         {
             /* \U+DBFF */
             "\"\xED\xAF\xBF\"",
             "\xED\xAF\xBF",     /* bug: not corrected */
-            "\"\\uDBFF\"",      /* bug: want "\"\\uFFFF\"" */
+            "\"\\uFFFD\"",
         },
         {
             /* \U+DC00 */
             "\"\xED\xB0\x80\"",
             "\xED\xB0\x80",     /* bug: not corrected */
-            "\"\\uDC00\"",      /* bug: want "\"\\uFFFF\"" */
+            "\"\\uFFFD\"",
         },
         {
             /* \U+DF80 */
             "\"\xED\xBE\x80\"",
             "\xED\xBE\x80",     /* bug: not corrected */
-            "\"\\uDF80\"",      /* bug: want "\"\\uFFFF\"" */
+            "\"\\uFFFD\"",
         },
         {
             /* \U+DFFF */
             "\"\xED\xBF\xBF\"",
             "\xED\xBF\xBF",     /* bug: not corrected */
-            "\"\\uDFFF\"",      /* bug: want "\"\\uFFFF\"" */
+            "\"\\uFFFD\"",
         },
         /* 5.2  Paired UTF-16 surrogates */
         {
             /* \U+D800\U+DC00 */
             "\"\xED\xA0\x80\xED\xB0\x80\"",
             "\xED\xA0\x80\xED\xB0\x80", /* bug: not corrected */
-            "\"\\uD800\\uDC00\"", /* bug: want "\"\\uFFFF\\uFFFF\"" */
+            "\"\\uFFFD\\uFFFD\"",
         },
         {
             /* \U+D800\U+DFFF */
             "\"\xED\xA0\x80\xED\xBF\xBF\"",
             "\xED\xA0\x80\xED\xBF\xBF", /* bug: not corrected */
-            "\"\\uD800\\uDFFF\"", /* bug: want "\"\\uFFFF\\uFFFF\"" */
+            "\"\\uFFFD\\uFFFD\"",
         },
         {
             /* \U+DB7F\U+DC00 */
             "\"\xED\xAD\xBF\xED\xB0\x80\"",
             "\xED\xAD\xBF\xED\xB0\x80", /* bug: not corrected */
-            "\"\\uDB7F\\uDC00\"", /* bug: want "\"\\uFFFF\\uFFFF\"" */
+            "\"\\uFFFD\\uFFFD\"",
         },
         {
             /* \U+DB7F\U+DFFF */
             "\"\xED\xAD\xBF\xED\xBF\xBF\"",
             "\xED\xAD\xBF\xED\xBF\xBF", /* bug: not corrected */
-            "\"\\uDB7F\\uDFFF\"", /* bug: want "\"\\uFFFF\\uFFFF\"" */
+            "\"\\uFFFD\\uFFFD\"",
         },
         {
             /* \U+DB80\U+DC00 */
             "\"\xED\xAE\x80\xED\xB0\x80\"",
             "\xED\xAE\x80\xED\xB0\x80", /* bug: not corrected */
-            "\"\\uDB80\\uDC00\"", /* bug: want "\"\\uFFFF\\uFFFF\"" */
+            "\"\\uFFFD\\uFFFD\"",
         },
         {
             /* \U+DB80\U+DFFF */
             "\"\xED\xAE\x80\xED\xBF\xBF\"",
             "\xED\xAE\x80\xED\xBF\xBF", /* bug: not corrected */
-            "\"\\uDB80\\uDFFF\"", /* bug: want "\"\\uFFFF\\uFFFF\"" */
+            "\"\\uFFFD\\uFFFD\"",
         },
         {
             /* \U+DBFF\U+DC00 */
             "\"\xED\xAF\xBF\xED\xB0\x80\"",
             "\xED\xAF\xBF\xED\xB0\x80", /* bug: not corrected */
-            "\"\\uDBFF\\uDC00\"", /* bug: want "\"\\uFFFF\\uFFFF\"" */
+            "\"\\uFFFD\\uFFFD\"",
         },
         {
             /* \U+DBFF\U+DFFF */
             "\"\xED\xAF\xBF\xED\xBF\xBF\"",
             "\xED\xAF\xBF\xED\xBF\xBF", /* bug: not corrected */
-            "\"\\uDBFF\\uDFFF\"", /* bug: want "\"\\uFFFF\\uFFFF\"" */
+            "\"\\uFFFD\\uFFFD\"",
         },
         /* 5.3  Other illegal code positions */
         /* BMP noncharacters */
@@ -754,25 +743,25 @@ static void utf8_string(void)
             /* \U+FFFE */
             "\"\xEF\xBF\xBE\"",
             "\xEF\xBF\xBE",     /* bug: not corrected */
-            "\"\\uFFFE\"",      /* bug: not corrected */
+            "\"\\uFFFD\"",
         },
         {
             /* \U+FFFF */
             "\"\xEF\xBF\xBF\"",
             "\xEF\xBF\xBF",     /* bug: not corrected */
-            "\"\\uFFFF\"",      /* bug: not corrected */
+            "\"\\uFFFD\"",
         },
         {
             /* U+FDD0 */
             "\"\xEF\xB7\x90\"",
             "\xEF\xB7\x90",     /* bug: not corrected */
-            "\"\\uFDD0\"",      /* bug: not corrected */
+            "\"\\uFFFD\"",
         },
         {
             /* U+FDEF */
             "\"\xEF\xB7\xAF\"",
             "\xEF\xB7\xAF",     /* bug: not corrected */
-            "\"\\uFDEF\"",      /* bug: not corrected */
+            "\"\\uFFFD\"",
         },
         /* Plane 1 .. 16 noncharacters */
         {
@@ -810,15 +799,10 @@ static void utf8_string(void)
             "\xF3\xAF\xBF\xBE\xF3\xAF\xBF\xBF"
             "\xF3\xBF\xBF\xBE\xF3\xBF\xBF\xBF"
             "\xF4\x8F\xBF\xBE\xF4\x8F\xBF\xBF",
-            /* bug: not corrected */
-            "\"\\u07FF\\uFFFF\\u07FF\\uFFFF\\u0BFF\\uFFFF\\u0BFF\\uFFFF"
-            "\\u0FFF\\uFFFF\\u0FFF\\uFFFF\\u13FF\\uFFFF\\u13FF\\uFFFF"
-            "\\u17FF\\uFFFF\\u17FF\\uFFFF\\u1BFF\\uFFFF\\u1BFF\\uFFFF"
-            "\\u1FFF\\uFFFF\\u1FFF\\uFFFF\\u23FF\\uFFFF\\u23FF\\uFFFF"
-            "\\u27FF\\uFFFF\\u27FF\\uFFFF\\u2BFF\\uFFFF\\u2BFF\\uFFFF"
-            "\\u2FFF\\uFFFF\\u2FFF\\uFFFF\\u33FF\\uFFFF\\u33FF\\uFFFF"
-            "\\u37FF\\uFFFF\\u37FF\\uFFFF\\u3BFF\\uFFFF\\u3BFF\\uFFFF"
-            "\\u3FFF\\uFFFF\\u3FFF\\uFFFF\\u43FF\\uFFFF\\u43FF\\uFFFF\"",
+            "\"\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD"
+            "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD"
+            "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD"
+            "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\"",
         },
         {}
     };
@@ -856,8 +840,8 @@ static void utf8_string(void)
         qobject_decref(obj);
 
         /*
-         * Disabled, because json_out currently contains the crap
-         * qobject_to_json() produces.
+         * Disabled, because qobject_from_json() is buggy, and I can't
+         * be bothered to add the expected incorrect results.
          * FIXME Enable once these bugs have been fixed.
          */
         if (0 && json_out != json_in) {
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Fix JSON string formatter
  2013-04-11 16:07 [Qemu-devel] [PATCH 0/4] Fix JSON string formatter Markus Armbruster
                   ` (3 preceding siblings ...)
  2013-04-11 16:07 ` [Qemu-devel] [PATCH 4/4] qjson: to_json() case QTYPE_QSTRING is buggy, rewrite Markus Armbruster
@ 2013-04-11 16:11 ` Markus Armbruster
  2013-04-11 17:03 ` Laszlo Ersek
  2013-04-13 19:54 ` Blue Swirl
  6 siblings, 0 replies; 13+ messages in thread
From: Markus Armbruster @ 2013-04-11 16:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel, aliguori, lersek

Rats, forgot --subject-prefix="PATCH v2".  My apologies!

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Fix JSON string formatter
  2013-03-23 14:44 ` Blue Swirl
@ 2013-04-11 16:12   ` Markus Armbruster
  0 siblings, 0 replies; 13+ messages in thread
From: Markus Armbruster @ 2013-04-11 16:12 UTC (permalink / raw)
  To: Blue Swirl; +Cc: aliguori, qemu-devel

Blue Swirl <blauwirbel@gmail.com> writes:

> On Thu, Mar 14, 2013 at 5:49 PM, Markus Armbruster <armbru@redhat.com> wrote:
>> This should unbreak "make check" on machines where char is unsigned.
>> Blue, please give it a whirl.
>
> Patches no longer apply, please rebase.

Sent.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Fix JSON string formatter
  2013-04-11 16:07 [Qemu-devel] [PATCH 0/4] Fix JSON string formatter Markus Armbruster
                   ` (4 preceding siblings ...)
  2013-04-11 16:11 ` [Qemu-devel] [PATCH 0/4] Fix JSON string formatter Markus Armbruster
@ 2013-04-11 17:03 ` Laszlo Ersek
  2013-04-13 19:54 ` Blue Swirl
  6 siblings, 0 replies; 13+ messages in thread
From: Laszlo Ersek @ 2013-04-11 17:03 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: blauwirbel, aliguori, qemu-devel

On 04/11/13 18:07, Markus Armbruster wrote:
> This should unbreak "make check" on machines where char is unsigned.
> Blue, please give it a whirl.
> 
> The JSON parser is still as broken as ever.  Left for another day.
> 
> v2:
> - Rebased, trivial conflicts in PATCH 1/4.
> - Make mod_utf8_codepoint() treat empty input as invalid sequence of
>   length zero (both when n==0 and when n>0 && *s==0).  No code in this
>   series passes empty input.
> - Some commit messages and comments improved.
> 
> Markus Armbruster (4):
>   unicode: New mod_utf8_codepoint()
>   check-qjson: Improve a few comments, delete bogus ones
>   check-qjson: Test noncharacters other than U+FFFE, U+FFFF in strings
>   qjson: to_json() case QTYPE_QSTRING is buggy, rewrite
> 
>  include/qemu-common.h |   3 +
>  qobject/qjson.c       | 102 ++++++++---------
>  tests/check-qjson.c   | 308 ++++++++++++++++++++++++++++++--------------------
>  util/Makefile.objs    |   2 +-
>  util/unicode.c        | 100 ++++++++++++++++
>  5 files changed, 333 insertions(+), 182 deletions(-)
>  create mode 100644 util/unicode.c
> 

I compared this v2 series patch-wise to v1.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Fix JSON string formatter
  2013-04-11 16:07 [Qemu-devel] [PATCH 0/4] Fix JSON string formatter Markus Armbruster
                   ` (5 preceding siblings ...)
  2013-04-11 17:03 ` Laszlo Ersek
@ 2013-04-13 19:54 ` Blue Swirl
  6 siblings, 0 replies; 13+ messages in thread
From: Blue Swirl @ 2013-04-13 19:54 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Anthony Liguori, Laszlo Ersek, qemu-devel

Thanks, applied all.

On Thu, Apr 11, 2013 at 4:07 PM, Markus Armbruster <armbru@redhat.com> wrote:
> This should unbreak "make check" on machines where char is unsigned.
> Blue, please give it a whirl.
>
> The JSON parser is still as broken as ever.  Left for another day.
>
> v2:
> - Rebased, trivial conflicts in PATCH 1/4.
> - Make mod_utf8_codepoint() treat empty input as invalid sequence of
>   length zero (both when n==0 and when n>0 && *s==0).  No code in this
>   series passes empty input.
> - Some commit messages and comments improved.
>
> Markus Armbruster (4):
>   unicode: New mod_utf8_codepoint()
>   check-qjson: Improve a few comments, delete bogus ones
>   check-qjson: Test noncharacters other than U+FFFE, U+FFFF in strings
>   qjson: to_json() case QTYPE_QSTRING is buggy, rewrite
>
>  include/qemu-common.h |   3 +
>  qobject/qjson.c       | 102 ++++++++---------
>  tests/check-qjson.c   | 308 ++++++++++++++++++++++++++++++--------------------
>  util/Makefile.objs    |   2 +-
>  util/unicode.c        | 100 ++++++++++++++++
>  5 files changed, 333 insertions(+), 182 deletions(-)
>  create mode 100644 util/unicode.c
>
> --
> 1.7.11.7
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2013-04-13 19:54 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-04-11 16:07 [Qemu-devel] [PATCH 0/4] Fix JSON string formatter Markus Armbruster
2013-04-11 16:07 ` [Qemu-devel] [PATCH 1/4] unicode: New mod_utf8_codepoint() Markus Armbruster
2013-04-11 16:07 ` [Qemu-devel] [PATCH 2/4] check-qjson: Improve a few comments, delete bogus ones Markus Armbruster
2013-04-11 16:07 ` [Qemu-devel] [PATCH 3/4] check-qjson: Test noncharacters other than U+FFFE, U+FFFF in strings Markus Armbruster
2013-04-11 16:07 ` [Qemu-devel] [PATCH 4/4] qjson: to_json() case QTYPE_QSTRING is buggy, rewrite Markus Armbruster
2013-04-11 16:11 ` [Qemu-devel] [PATCH 0/4] Fix JSON string formatter Markus Armbruster
2013-04-11 17:03 ` Laszlo Ersek
2013-04-13 19:54 ` Blue Swirl
  -- strict thread matches above, loose matches on Subject: below --
2013-03-14 17:49 Markus Armbruster
2013-03-17 19:55 ` Blue Swirl
2013-03-18  9:58   ` Markus Armbruster
2013-03-23 14:44 ` Blue Swirl
2013-04-11 16:12   ` Markus Armbruster

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).