[Qemu-devel] [PATCH v2 for-2.5? 0/4] qjson: save a lot of memory

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH v2 for-2.5? 0/4] qjson: save a lot of memory
@ 2015-11-23 17:44 Paolo Bonzini
  2015-11-23 17:44 ` [Qemu-devel] [PATCH v2 1/4] qjson: replace QString in JSONLexer with GString Paolo Bonzini
                   ` (5 more replies)
  0 siblings, 6 replies; 25+ messages in thread
From: Paolo Bonzini @ 2015-11-23 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: armbru

This patch from 2011 (!) saves about 96% of the allocation cost (down
from 500 MiB to 20 MiB) for check-qjson.

Paolo

Paolo Bonzini (4):
  qjson: replace QString in JSONLexer with GString
  qjson: do not save/restore contexts
  qjson: store tokens in a GQueue
  qjson: surprise, allocating 6 QObjects per token is expensive

 include/qapi/qmp/json-lexer.h    |   7 +-
 include/qapi/qmp/json-parser.h   |   4 +-
 include/qapi/qmp/json-streamer.h |  16 +++-
 monitor.c                        |   2 +-
 qga/main.c                       |   2 +-
 qobject/json-lexer.c             |  22 ++---
 qobject/json-parser.c            | 189 +++++++++++++++------------------------
 qobject/json-streamer.c          |  48 +++++-----
 qobject/qjson.c                  |   2 +-
 tests/libqtest.c                 |   2 +-
 10 files changed, 125 insertions(+), 169 deletions(-)

-- 
2.5.0

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v2 1/4] qjson: replace QString in JSONLexer with GString
  2015-11-23 17:44 [Qemu-devel] [PATCH v2 for-2.5? 0/4] qjson: save a lot of memory Paolo Bonzini
@ 2015-11-23 17:44 ` Paolo Bonzini
  2015-11-25 12:48   ` Markus Armbruster
  2015-11-23 17:44 ` [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts Paolo Bonzini
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2015-11-23 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: armbru

JSONLexer only needs a simple resizable buffer.  json-streamer.c
can allocate memory for each token instead of relying on reference
counting of QStrings.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/qapi/qmp/json-lexer.h    |  7 +++----
 include/qapi/qmp/json-streamer.h |  1 +
 qobject/json-lexer.c             | 22 ++++++++--------------
 qobject/json-streamer.c          | 10 +++++-----
 4 files changed, 17 insertions(+), 23 deletions(-)

diff --git a/include/qapi/qmp/json-lexer.h b/include/qapi/qmp/json-lexer.h
index cdff046..584b27d 100644
--- a/include/qapi/qmp/json-lexer.h
+++ b/include/qapi/qmp/json-lexer.h
@@ -14,8 +14,7 @@
 #ifndef QEMU_JSON_LEXER_H
 #define QEMU_JSON_LEXER_H
 
-#include "qapi/qmp/qstring.h"
-#include "qapi/qmp/qlist.h"
+#include "glib-compat.h"
 
 typedef enum json_token_type {
     JSON_OPERATOR = 100,
@@ -30,13 +29,13 @@ typedef enum json_token_type {
 
 typedef struct JSONLexer JSONLexer;
 
-typedef void (JSONLexerEmitter)(JSONLexer *, QString *, JSONTokenType, int x, int y);
+typedef void (JSONLexerEmitter)(JSONLexer *, GString *, JSONTokenType, int x, int y);
 
 struct JSONLexer
 {
     JSONLexerEmitter *emit;
     int state;
-    QString *token;
+    GString *token;
     int x, y;
 };
 
diff --git a/include/qapi/qmp/json-streamer.h b/include/qapi/qmp/json-streamer.h
index 823f7d7..e901144 100644
--- a/include/qapi/qmp/json-streamer.h
+++ b/include/qapi/qmp/json-streamer.h
@@ -14,6 +14,7 @@
 #ifndef QEMU_JSON_STREAMER_H
 #define QEMU_JSON_STREAMER_H
 
+#include <stdint.h>
 #include "qapi/qmp/qlist.h"
 #include "qapi/qmp/json-lexer.h"
 
diff --git a/qobject/json-lexer.c b/qobject/json-lexer.c
index b19623e..c2060f8 100644
--- a/qobject/json-lexer.c
+++ b/qobject/json-lexer.c
@@ -11,12 +11,9 @@
  *
  */
 
-#include "qapi/qmp/qstring.h"
-#include "qapi/qmp/qlist.h"
-#include "qapi/qmp/qdict.h"
-#include "qapi/qmp/qint.h"
 #include "qemu-common.h"
 #include "qapi/qmp/json-lexer.h"
+#include <stdint.h>
 
 #define MAX_TOKEN_SIZE (64ULL << 20)
 
@@ -272,7 +269,7 @@ void json_lexer_init(JSONLexer *lexer, JSONLexerEmitter func)
 {
     lexer->emit = func;
     lexer->state = IN_START;
-    lexer->token = qstring_new();
+    lexer->token = g_string_sized_new(3);
     lexer->x = lexer->y = 0;
 }
 
@@ -290,7 +287,7 @@ static int json_lexer_feed_char(JSONLexer *lexer, char ch, bool flush)
         new_state = json_lexer[lexer->state][(uint8_t)ch];
         char_consumed = !TERMINAL_NEEDED_LOOKAHEAD(lexer->state, new_state);
         if (char_consumed) {
-            qstring_append_chr(lexer->token, ch);
+            g_string_append_c(lexer->token, ch);
         }
 
         switch (new_state) {
@@ -303,8 +300,7 @@ static int json_lexer_feed_char(JSONLexer *lexer, char ch, bool flush)
             lexer->emit(lexer, lexer->token, new_state, lexer->x, lexer->y);
             /* fall through */
         case JSON_SKIP:
-            QDECREF(lexer->token);
-            lexer->token = qstring_new();
+            g_string_truncate(lexer->token, 0);
             new_state = IN_START;
             break;
         case IN_ERROR:
@@ -322,8 +318,7 @@ static int json_lexer_feed_char(JSONLexer *lexer, char ch, bool flush)
              * induce an error/flush state.
              */
             lexer->emit(lexer, lexer->token, JSON_ERROR, lexer->x, lexer->y);
-            QDECREF(lexer->token);
-            lexer->token = qstring_new();
+            g_string_truncate(lexer->token, 0);
             new_state = IN_START;
             lexer->state = new_state;
             return 0;
@@ -336,10 +331,9 @@ static int json_lexer_feed_char(JSONLexer *lexer, char ch, bool flush)
     /* Do not let a single token grow to an arbitrarily large size,
      * this is a security consideration.
      */
-    if (lexer->token->length > MAX_TOKEN_SIZE) {
+    if (lexer->token->len > MAX_TOKEN_SIZE) {
         lexer->emit(lexer, lexer->token, lexer->state, lexer->x, lexer->y);
-        QDECREF(lexer->token);
-        lexer->token = qstring_new();
+        g_string_truncate(lexer->token, 0);
         lexer->state = IN_START;
     }
 
@@ -369,5 +363,5 @@ int json_lexer_flush(JSONLexer *lexer)
 
 void json_lexer_destroy(JSONLexer *lexer)
 {
-    QDECREF(lexer->token);
+    g_string_free(lexer->token, true);
 }
diff --git a/qobject/json-streamer.c b/qobject/json-streamer.c
index 1b2f9b1..f240501 100644
--- a/qobject/json-streamer.c
+++ b/qobject/json-streamer.c
@@ -12,6 +12,7 @@
  */
 
 #include "qapi/qmp/qlist.h"
+#include "qapi/qmp/qstring.h"
 #include "qapi/qmp/qint.h"
 #include "qapi/qmp/qdict.h"
 #include "qemu-common.h"
@@ -21,13 +22,13 @@
 #define MAX_TOKEN_SIZE (64ULL << 20)
 #define MAX_NESTING (1ULL << 10)
 
-static void json_message_process_token(JSONLexer *lexer, QString *token, JSONTokenType type, int x, int y)
+static void json_message_process_token(JSONLexer *lexer, GString *input, JSONTokenType type, int x, int y)
 {
     JSONMessageParser *parser = container_of(lexer, JSONMessageParser, lexer);
     QDict *dict;
 
     if (type == JSON_OPERATOR) {
-        switch (qstring_get_str(token)[0]) {
+        switch (input->str[0]) {
         case '{':
             parser->brace_count++;
             break;
@@ -47,12 +48,11 @@ static void json_message_process_token(JSONLexer *lexer, QString *token, JSONTok
 
     dict = qdict_new();
     qdict_put(dict, "type", qint_from_int(type));
-    QINCREF(token);
-    qdict_put(dict, "token", token);
+    qdict_put(dict, "token", qstring_from_str(input->str));
     qdict_put(dict, "x", qint_from_int(x));
     qdict_put(dict, "y", qint_from_int(y));
 
-    parser->token_size += token->length;
+    parser->token_size += input->len;
 
     qlist_append(parser->tokens, dict);
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
  2015-11-23 17:44 [Qemu-devel] [PATCH v2 for-2.5? 0/4] qjson: save a lot of memory Paolo Bonzini
  2015-11-23 17:44 ` [Qemu-devel] [PATCH v2 1/4] qjson: replace QString in JSONLexer with GString Paolo Bonzini
@ 2015-11-23 17:44 ` Paolo Bonzini
  2015-11-23 17:59   ` Laszlo Ersek
  2015-11-25 14:32   ` Markus Armbruster
  2015-11-23 17:44 ` [Qemu-devel] [PATCH v2 3/4] qjson: store tokens in a GQueue Paolo Bonzini
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 25+ messages in thread
From: Paolo Bonzini @ 2015-11-23 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: armbru

JSON is LL(1) and our parser indeed needs only 1 token lookahead.
Saving the parser context is mostly unnecessary; we can replace it
with peeking at the next token, or remove it altogether when the
restore only happens on errors.  The token list is destroyed anyway
on errors.

The only interesting thing is that parse_keyword always eats
a TOKEN_KEYWORD, even if it is invalid, so it must come last in
parse_value (otherwise, NULL is returned, parse_literal is invoked
and it tries to peek beyond end of input).  This is caught by
/errors/unterminated/literal, which actually checks for an unterminated
keyword. ಠ_ಠ

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 qobject/json-parser.c | 59 ++++++++++++++++++---------------------------------
 1 file changed, 21 insertions(+), 38 deletions(-)

diff --git a/qobject/json-parser.c b/qobject/json-parser.c
index ac991ba..7a287ea 100644
--- a/qobject/json-parser.c
+++ b/qobject/json-parser.c
@@ -296,23 +296,6 @@ static QObject *parser_context_peek_token(JSONParserContext *ctxt)
     return token;
 }
 
-static JSONParserContext parser_context_save(JSONParserContext *ctxt)
-{
-    JSONParserContext saved_ctxt = {0};
-    saved_ctxt.tokens.pos = ctxt->tokens.pos;
-    saved_ctxt.tokens.count = ctxt->tokens.count;
-    saved_ctxt.tokens.buf = ctxt->tokens.buf;
-    return saved_ctxt;
-}
-
-static void parser_context_restore(JSONParserContext *ctxt,
-                                   JSONParserContext saved_ctxt)
-{
-    ctxt->tokens.pos = saved_ctxt.tokens.pos;
-    ctxt->tokens.count = saved_ctxt.tokens.count;
-    ctxt->tokens.buf = saved_ctxt.tokens.buf;
-}
-
 static void tokens_append_from_iter(QObject *obj, void *opaque)
 {
     JSONParserContext *ctxt = opaque;
@@ -364,7 +347,6 @@ static void parser_context_free(JSONParserContext *ctxt)
 static int parse_pair(JSONParserContext *ctxt, QDict *dict, va_list *ap)
 {
     QObject *key = NULL, *token = NULL, *value, *peek;
-    JSONParserContext saved_ctxt = parser_context_save(ctxt);
 
     peek = parser_context_peek_token(ctxt);
     if (peek == NULL) {
@@ -402,7 +384,6 @@ static int parse_pair(JSONParserContext *ctxt, QDict *dict, va_list *ap)
     return 0;
 
 out:
-    parser_context_restore(ctxt, saved_ctxt);
     qobject_decref(key);
 
     return -1;
@@ -412,9 +393,8 @@ static QObject *parse_object(JSONParserContext *ctxt, va_list *ap)
 {
     QDict *dict = NULL;
     QObject *token, *peek;
-    JSONParserContext saved_ctxt = parser_context_save(ctxt);
 
-    token = parser_context_pop_token(ctxt);
+    token = parser_context_peek_token(ctxt);
     if (token == NULL) {
         goto out;
     }
@@ -425,6 +405,7 @@ static QObject *parse_object(JSONParserContext *ctxt, va_list *ap)
 
     dict = qdict_new();
 
+    parser_context_pop_token(ctxt);
     peek = parser_context_peek_token(ctxt);
     if (peek == NULL) {
         parse_error(ctxt, NULL, "premature EOI");
@@ -465,7 +446,6 @@ static QObject *parse_object(JSONParserContext *ctxt, va_list *ap)
     return QOBJECT(dict);
 
 out:
-    parser_context_restore(ctxt, saved_ctxt);
     QDECREF(dict);
     return NULL;
 }
@@ -474,9 +454,8 @@ static QObject *parse_array(JSONParserContext *ctxt, va_list *ap)
 {
     QList *list = NULL;
     QObject *token, *peek;
-    JSONParserContext saved_ctxt = parser_context_save(ctxt);
 
-    token = parser_context_pop_token(ctxt);
+    token = parser_context_peek_token(ctxt);
     if (token == NULL) {
         goto out;
     }
@@ -487,6 +466,7 @@ static QObject *parse_array(JSONParserContext *ctxt, va_list *ap)
 
     list = qlist_new();
 
+    parser_context_pop_token(ctxt);
     peek = parser_context_peek_token(ctxt);
     if (peek == NULL) {
         parse_error(ctxt, NULL, "premature EOI");
@@ -537,7 +517,6 @@ static QObject *parse_array(JSONParserContext *ctxt, va_list *ap)
     return QOBJECT(list);
 
 out:
-    parser_context_restore(ctxt, saved_ctxt);
     QDECREF(list);
     return NULL;
 }
@@ -545,9 +524,8 @@ out:
 static QObject *parse_keyword(JSONParserContext *ctxt)
 {
     QObject *token, *ret;
-    JSONParserContext saved_ctxt = parser_context_save(ctxt);
 
-    token = parser_context_pop_token(ctxt);
+    token = parser_context_peek_token(ctxt);
     if (token == NULL) {
         goto out;
     }
@@ -556,6 +534,7 @@ static QObject *parse_keyword(JSONParserContext *ctxt)
         goto out;
     }
 
+    parser_context_pop_token(ctxt);
     if (token_is_keyword(token, "true")) {
         ret = QOBJECT(qbool_from_bool(true));
     } else if (token_is_keyword(token, "false")) {
@@ -570,7 +549,6 @@ static QObject *parse_keyword(JSONParserContext *ctxt)
     return ret;
 
 out: 
-    parser_context_restore(ctxt, saved_ctxt);
 
     return NULL;
 }
@@ -578,17 +556,21 @@ out:
 static QObject *parse_escape(JSONParserContext *ctxt, va_list *ap)
 {
     QObject *token = NULL, *obj;
-    JSONParserContext saved_ctxt = parser_context_save(ctxt);
 
     if (ap == NULL) {
         goto out;
     }
 
-    token = parser_context_pop_token(ctxt);
+    token = parser_context_peek_token(ctxt);
     if (token == NULL) {
         goto out;
     }
 
+    if (token_get_type(token) != JSON_ESCAPE) {
+        goto out;
+    }
+
+    parser_context_pop_token(ctxt);
     if (token_is_escape(token, "%p")) {
         obj = va_arg(*ap, QObject *);
     } else if (token_is_escape(token, "%i")) {
@@ -611,7 +593,6 @@ static QObject *parse_escape(JSONParserContext *ctxt, va_list *ap)
     return obj;
 
 out:
-    parser_context_restore(ctxt, saved_ctxt);
 
     return NULL;
 }
@@ -619,15 +600,15 @@ out:
 static QObject *parse_literal(JSONParserContext *ctxt)
 {
     QObject *token, *obj;
-    JSONParserContext saved_ctxt = parser_context_save(ctxt);
 
-    token = parser_context_pop_token(ctxt);
+    token = parser_context_peek_token(ctxt);
     if (token == NULL) {
         goto out;
     }
 
     switch (token_get_type(token)) {
     case JSON_STRING:
+        parser_context_pop_token(ctxt);
         obj = QOBJECT(qstring_from_escaped_str(ctxt, token));
         break;
     case JSON_INTEGER: {
@@ -645,15 +626,18 @@ static QObject *parse_literal(JSONParserContext *ctxt)
          */
         int64_t value;
 
+        parser_context_pop_token(ctxt);
         errno = 0; /* strtoll doesn't set errno on success */
         value = strtoll(token_get_value(token), NULL, 10);
         if (errno != ERANGE) {
             obj = QOBJECT(qint_from_int(value));
             break;
         }
-        /* fall through to JSON_FLOAT */
+        goto parse_float;
     }
     case JSON_FLOAT:
+        parser_context_pop_token(ctxt);
+    parse_float:
         /* FIXME dependent on locale */
         obj = QOBJECT(qfloat_from_double(strtod(token_get_value(token), NULL)));
         break;
@@ -664,7 +648,6 @@ static QObject *parse_literal(JSONParserContext *ctxt)
     return obj;
 
 out:
-    parser_context_restore(ctxt, saved_ctxt);
 
     return NULL;
 }
@@ -681,11 +664,11 @@ static QObject *parse_value(JSONParserContext *ctxt, va_list *ap)
         obj = parse_escape(ctxt, ap);
     }
     if (obj == NULL) {
-        obj = parse_keyword(ctxt);
-    } 
-    if (obj == NULL) {
         obj = parse_literal(ctxt);
     }
+    if (obj == NULL) {
+        obj = parse_keyword(ctxt);
+    }
 
     return obj;
 }
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v2 3/4] qjson: store tokens in a GQueue
  2015-11-23 17:44 [Qemu-devel] [PATCH v2 for-2.5? 0/4] qjson: save a lot of memory Paolo Bonzini
  2015-11-23 17:44 ` [Qemu-devel] [PATCH v2 1/4] qjson: replace QString in JSONLexer with GString Paolo Bonzini
  2015-11-23 17:44 ` [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts Paolo Bonzini
@ 2015-11-23 17:44 ` Paolo Bonzini
  2015-11-23 17:44 ` [Qemu-devel] [PATCH v2 4/4] qjson: surprise, allocating 6 QObjects per token is expensive Paolo Bonzini
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 25+ messages in thread
From: Paolo Bonzini @ 2015-11-23 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: armbru

Even though we still have the "streamer" concept, the tokens can now
be deleted as they are read.  While doing so convert from QList to
GQueue, since the next step will make tokens not a QObject and we
will have to do the conversion anyway.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/qapi/qmp/json-parser.h   |  4 +--
 include/qapi/qmp/json-streamer.h |  8 ++---
 monitor.c                        |  2 +-
 qga/main.c                       |  2 +-
 qobject/json-parser.c            | 65 +++++++++++++---------------------------
 qobject/json-streamer.c          | 25 +++++++++-------
 qobject/qjson.c                  |  2 +-
 tests/libqtest.c                 |  2 +-
 8 files changed, 45 insertions(+), 65 deletions(-)

diff --git a/include/qapi/qmp/json-parser.h b/include/qapi/qmp/json-parser.h
index 44d88f3..fea89f8 100644
--- a/include/qapi/qmp/json-parser.h
+++ b/include/qapi/qmp/json-parser.h
@@ -18,7 +18,7 @@
 #include "qapi/qmp/qlist.h"
 #include "qapi/error.h"
 
-QObject *json_parser_parse(QList *tokens, va_list *ap);
-QObject *json_parser_parse_err(QList *tokens, va_list *ap, Error **errp);
+QObject *json_parser_parse(GQueue *tokens, va_list *ap);
+QObject *json_parser_parse_err(GQueue *tokens, va_list *ap, Error **errp);
 
 #endif
diff --git a/include/qapi/qmp/json-streamer.h b/include/qapi/qmp/json-streamer.h
index e901144..e9f2937 100644
--- a/include/qapi/qmp/json-streamer.h
+++ b/include/qapi/qmp/json-streamer.h
@@ -15,21 +15,21 @@
 #define QEMU_JSON_STREAMER_H
 
 #include <stdint.h>
-#include "qapi/qmp/qlist.h"
+#include "glib-compat.h"
 #include "qapi/qmp/json-lexer.h"
 
 typedef struct JSONMessageParser
 {
-    void (*emit)(struct JSONMessageParser *parser, QList *tokens);
+    void (*emit)(struct JSONMessageParser *parser, GQueue *tokens);
     JSONLexer lexer;
     int brace_count;
     int bracket_count;
-    QList *tokens;
+    GQueue *tokens;
     uint64_t token_size;
 } JSONMessageParser;
 
 void json_message_parser_init(JSONMessageParser *parser,
-                              void (*func)(JSONMessageParser *, QList *));
+                              void (*func)(JSONMessageParser *, GQueue *));
 
 int json_message_parser_feed(JSONMessageParser *parser,
                              const char *buffer, size_t size);
diff --git a/monitor.c b/monitor.c
index e4cf34e..781c6f3 100644
--- a/monitor.c
+++ b/monitor.c
@@ -3849,7 +3849,7 @@ static QDict *qmp_check_input_obj(QObject *input_obj, Error **errp)
     return input_dict;
 }
 
-static void handle_qmp_command(JSONMessageParser *parser, QList *tokens)
+static void handle_qmp_command(JSONMessageParser *parser, GQueue *tokens)
 {
     Error *local_err = NULL;
     QObject *obj, *data;
diff --git a/qga/main.c b/qga/main.c
index d2a0ffc..f83a97d 100644
--- a/qga/main.c
+++ b/qga/main.c
@@ -570,7 +570,7 @@ static void process_command(GAState *s, QDict *req)
 }
 
 /* handle requests/control events coming in over the channel */
-static void process_event(JSONMessageParser *parser, QList *tokens)
+static void process_event(JSONMessageParser *parser, GQueue *tokens)
 {
     GAState *s = container_of(parser, GAState, parser);
     QDict *qdict;
diff --git a/qobject/json-parser.c b/qobject/json-parser.c
index 7a287ea..07d9654 100644
--- a/qobject/json-parser.c
+++ b/qobject/json-parser.c
@@ -26,11 +26,8 @@
 typedef struct JSONParserContext
 {
     Error *err;
-    struct {
-        QObject **buf;
-        size_t pos;
-        size_t count;
-    } tokens;
+    QObject *current;
+    GQueue *buf;
 } JSONParserContext;
 
 #define BUG_ON(cond) assert(!(cond))
@@ -274,56 +271,34 @@ out:
     return NULL;
 }
 
+/* Note: unless the token object returned by parser_context_peek_token
+ * or parser_context_pop_token is explicitly incref'd, it will be
+ * deleted as soon as parser_context_pop_token is called again.
+ */
 static QObject *parser_context_pop_token(JSONParserContext *ctxt)
 {
-    QObject *token;
-    g_assert(ctxt->tokens.pos < ctxt->tokens.count);
-    token = ctxt->tokens.buf[ctxt->tokens.pos];
-    ctxt->tokens.pos++;
-    return token;
+    qobject_decref(ctxt->current);
+    assert(!g_queue_is_empty(ctxt->buf));
+    ctxt->current = g_queue_pop_head(ctxt->buf);
+    return ctxt->current;
 }
 
-/* Note: parser_context_{peek|pop}_token do not increment the
- * token object's refcount. In both cases the references will continue
- * to be tracked and cleaned up in parser_context_free(), so do not
- * attempt to free the token object.
- */
 static QObject *parser_context_peek_token(JSONParserContext *ctxt)
 {
-    QObject *token;
-    g_assert(ctxt->tokens.pos < ctxt->tokens.count);
-    token = ctxt->tokens.buf[ctxt->tokens.pos];
-    return token;
-}
-
-static void tokens_append_from_iter(QObject *obj, void *opaque)
-{
-    JSONParserContext *ctxt = opaque;
-    g_assert(ctxt->tokens.pos < ctxt->tokens.count);
-    ctxt->tokens.buf[ctxt->tokens.pos++] = obj;
-    qobject_incref(obj);
+    assert(!g_queue_is_empty(ctxt->buf));
+    return g_queue_peek_head(ctxt->buf);
 }
 
-static JSONParserContext *parser_context_new(QList *tokens)
+static JSONParserContext *parser_context_new(GQueue *tokens)
 {
     JSONParserContext *ctxt;
-    size_t count;
 
     if (!tokens) {
         return NULL;
     }
 
-    count = qlist_size(tokens);
-    if (count == 0) {
-        return NULL;
-    }
-
     ctxt = g_malloc0(sizeof(JSONParserContext));
-    ctxt->tokens.pos = 0;
-    ctxt->tokens.count = count;
-    ctxt->tokens.buf = g_malloc(count * sizeof(QObject *));
-    qlist_iter(tokens, tokens_append_from_iter, ctxt);
-    ctxt->tokens.pos = 0;
+    ctxt->buf = tokens;
 
     return ctxt;
 }
@@ -331,12 +306,12 @@ static JSONParserContext *parser_context_new(QList *tokens)
 /* to support error propagation, ctxt->err must be freed separately */
 static void parser_context_free(JSONParserContext *ctxt)
 {
-    int i;
     if (ctxt) {
-        for (i = 0; i < ctxt->tokens.count; i++) {
-            qobject_decref(ctxt->tokens.buf[i]);
+        while (!g_queue_is_empty(ctxt->buf)) {
+            parser_context_pop_token(ctxt);
         }
-        g_free(ctxt->tokens.buf);
+        qobject_decref(ctxt->current);
+        g_queue_free(ctxt->buf);
         g_free(ctxt);
     }
 }
@@ -673,12 +648,12 @@ static QObject *parse_value(JSONParserContext *ctxt, va_list *ap)
     return obj;
 }
 
-QObject *json_parser_parse(QList *tokens, va_list *ap)
+QObject *json_parser_parse(GQueue *tokens, va_list *ap)
 {
     return json_parser_parse_err(tokens, ap, NULL);
 }
 
-QObject *json_parser_parse_err(QList *tokens, va_list *ap, Error **errp)
+QObject *json_parser_parse_err(GQueue *tokens, va_list *ap, Error **errp)
 {
     JSONParserContext *ctxt = parser_context_new(tokens);
     QObject *result;
diff --git a/qobject/json-streamer.c b/qobject/json-streamer.c
index f240501..d2af38f 100644
--- a/qobject/json-streamer.c
+++ b/qobject/json-streamer.c
@@ -22,6 +22,14 @@
 #define MAX_TOKEN_SIZE (64ULL << 20)
 #define MAX_NESTING (1ULL << 10)
 
+static void json_message_free_tokens(JSONMessageParser *parser)
+{
+    if (parser->tokens) {
+        g_queue_free(parser->tokens);
+        parser->tokens = NULL;
+    }
+}
+
 static void json_message_process_token(JSONLexer *lexer, GString *input, JSONTokenType type, int x, int y)
 {
     JSONMessageParser *parser = container_of(lexer, JSONMessageParser, lexer);
@@ -54,7 +62,7 @@ static void json_message_process_token(JSONLexer *lexer, GString *input, JSONTok
 
     parser->token_size += input->len;
 
-    qlist_append(parser->tokens, dict);
+    g_queue_push_tail(parser->tokens, dict);
 
     if (type == JSON_ERROR) {
         goto out_emit_bad;
@@ -78,27 +86,24 @@ out_emit_bad:
     /* clear out token list and tell the parser to emit and error
      * indication by passing it a NULL list
      */
-    QDECREF(parser->tokens);
-    parser->tokens = NULL;
+    json_message_free_tokens(parser);
 out_emit:
     /* send current list of tokens to parser and reset tokenizer */
     parser->brace_count = 0;
     parser->bracket_count = 0;
+    /* parser->emit takes ownership of parser->tokens.  */
     parser->emit(parser, parser->tokens);
-    if (parser->tokens) {
-        QDECREF(parser->tokens);
-    }
-    parser->tokens = qlist_new();
+    parser->tokens = g_queue_new();
     parser->token_size = 0;
 }
 
 void json_message_parser_init(JSONMessageParser *parser,
-                              void (*func)(JSONMessageParser *, QList *))
+                              void (*func)(JSONMessageParser *, GQueue *))
 {
     parser->emit = func;
     parser->brace_count = 0;
     parser->bracket_count = 0;
-    parser->tokens = qlist_new();
+    parser->tokens = g_queue_new();
     parser->token_size = 0;
 
     json_lexer_init(&parser->lexer, json_message_process_token);
@@ -118,5 +123,5 @@ int json_message_parser_flush(JSONMessageParser *parser)
 void json_message_parser_destroy(JSONMessageParser *parser)
 {
     json_lexer_destroy(&parser->lexer);
-    QDECREF(parser->tokens);
+    json_message_free_tokens(parser);
 }
diff --git a/qobject/qjson.c b/qobject/qjson.c
index 33f8ef5..a3e6a7c 100644
--- a/qobject/qjson.c
+++ b/qobject/qjson.c
@@ -28,7 +28,7 @@ typedef struct JSONParsingState
     QObject *result;
 } JSONParsingState;
 
-static void parse_json(JSONMessageParser *parser, QList *tokens)
+static void parse_json(JSONMessageParser *parser, GQueue *tokens)
 {
     JSONParsingState *s = container_of(parser, JSONParsingState, parser);
     s->result = json_parser_parse(tokens, s->ap);
diff --git a/tests/libqtest.c b/tests/libqtest.c
index f6f3d7a..9753161 100644
--- a/tests/libqtest.c
+++ b/tests/libqtest.c
@@ -351,7 +351,7 @@ typedef struct {
     QDict *response;
 } QMPResponseParser;
 
-static void qmp_response(JSONMessageParser *parser, QList *tokens)
+static void qmp_response(JSONMessageParser *parser, GQueue *tokens)
 {
     QMPResponseParser *qmp = container_of(parser, QMPResponseParser, parser);
     QObject *obj;
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v2 4/4] qjson: surprise, allocating 6 QObjects per token is expensive
  2015-11-23 17:44 [Qemu-devel] [PATCH v2 for-2.5? 0/4] qjson: save a lot of memory Paolo Bonzini
                   ` (2 preceding siblings ...)
  2015-11-23 17:44 ` [Qemu-devel] [PATCH v2 3/4] qjson: store tokens in a GQueue Paolo Bonzini
@ 2015-11-23 17:44 ` Paolo Bonzini
  2015-11-23 21:00 ` [Qemu-devel] [PATCH v2 for-2.5? 0/4] qjson: save a lot of memory Eric Blake
  2015-11-25 14:47 ` Markus Armbruster
  5 siblings, 0 replies; 25+ messages in thread
From: Paolo Bonzini @ 2015-11-23 17:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: armbru

Replace the contents of the tokens GQueue with a simple struct.  This cuts
the amount of memory allocated by tests/check-qjson from ~500MB to ~20MB,
and the execution time from 600ms to 80ms on my laptop.  Still a lot (some
could be saved by using an intrusive list, such as QSIMPLEQ, instead of
the GQueue), but the savings are already massive and the right thing to
do would probably be to get rid of json-streamer completely.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
        v1->v2: fix memcpy off-by-one [Eric]

 include/qapi/qmp/json-streamer.h |  7 ++++
 qobject/json-parser.c            | 81 +++++++++++++++++++---------------------
 qobject/json-streamer.c          | 19 ++++------
 3 files changed, 53 insertions(+), 54 deletions(-)

diff --git a/include/qapi/qmp/json-streamer.h b/include/qapi/qmp/json-streamer.h
index e9f2937..09b3d3e 100644
--- a/include/qapi/qmp/json-streamer.h
+++ b/include/qapi/qmp/json-streamer.h
@@ -18,6 +18,13 @@
 #include "glib-compat.h"
 #include "qapi/qmp/json-lexer.h"
 
+typedef struct JSONToken {
+    int type;
+    int x;
+    int y;
+    char str[];
+} JSONToken;
+
 typedef struct JSONMessageParser
 {
     void (*emit)(struct JSONMessageParser *parser, GQueue *tokens);
diff --git a/qobject/json-parser.c b/qobject/json-parser.c
index 07d9654..0230bc9 100644
--- a/qobject/json-parser.c
+++ b/qobject/json-parser.c
@@ -22,11 +22,12 @@
 #include "qapi/qmp/qbool.h"
 #include "qapi/qmp/json-parser.h"
 #include "qapi/qmp/json-lexer.h"
+#include "qapi/qmp/json-streamer.h"
 
 typedef struct JSONParserContext
 {
     Error *err;
-    QObject *current;
+    JSONToken *current;
     GQueue *buf;
 } JSONParserContext;
 
@@ -46,56 +47,46 @@ static QObject *parse_value(JSONParserContext *ctxt, va_list *ap);
 /**
  * Token manipulators
  *
- * tokens are dictionaries that contain a type, a string value, and geometry information
+ * tokens contain a type, a string value, and geometry information
  * about a token identified by the lexer.  These are routines that make working with
  * these objects a bit easier.
  */
-static const char *token_get_value(QObject *obj)
-{
-    return qdict_get_str(qobject_to_qdict(obj), "token");
-}
-
-static JSONTokenType token_get_type(QObject *obj)
-{
-    return qdict_get_int(qobject_to_qdict(obj), "type");
-}
-
-static int token_is_operator(QObject *obj, char op)
+static int token_is_operator(JSONToken *obj, char op)
 {
     const char *val;
 
-    if (token_get_type(obj) != JSON_OPERATOR) {
+    if (obj->type != JSON_OPERATOR) {
         return 0;
     }
 
-    val = token_get_value(obj);
+    val = obj->str;
 
     return (val[0] == op) && (val[1] == 0);
 }
 
-static int token_is_keyword(QObject *obj, const char *value)
+static int token_is_keyword(JSONToken *obj, const char *value)
 {
-    if (token_get_type(obj) != JSON_KEYWORD) {
+    if (obj->type != JSON_KEYWORD) {
         return 0;
     }
 
-    return strcmp(token_get_value(obj), value) == 0;
+    return strcmp(obj->str, value) == 0;
 }
 
-static int token_is_escape(QObject *obj, const char *value)
+static int token_is_escape(JSONToken *obj, const char *value)
 {
-    if (token_get_type(obj) != JSON_ESCAPE) {
+    if (obj->type != JSON_ESCAPE) {
         return 0;
     }
 
-    return (strcmp(token_get_value(obj), value) == 0);
+    return (strcmp(obj->str, value) == 0);
 }
 
 /**
  * Error handler
  */
 static void GCC_FMT_ATTR(3, 4) parse_error(JSONParserContext *ctxt,
-                                           QObject *token, const char *msg, ...)
+                                           JSONToken *token, const char *msg, ...)
 {
     va_list ap;
     char message[1024];
@@ -173,9 +164,9 @@ static int hex2decimal(char ch)
  *      \t
  *      \u four-hex-digits 
  */
-static QString *qstring_from_escaped_str(JSONParserContext *ctxt, QObject *token)
+static QString *qstring_from_escaped_str(JSONParserContext *ctxt, JSONToken *token)
 {
-    const char *ptr = token_get_value(token);
+    const char *ptr = token->str;
     QString *str;
     int double_quote = 1;
 
@@ -271,19 +262,19 @@ out:
     return NULL;
 }
 
-/* Note: unless the token object returned by parser_context_peek_token
- * or parser_context_pop_token is explicitly incref'd, it will be
- * deleted as soon as parser_context_pop_token is called again.
+/* Note: the token object returned by parser_context_peek_token or
+ * parser_context_pop_token is deleted as soon as parser_context_pop_token
+ * is called again.
  */
-static QObject *parser_context_pop_token(JSONParserContext *ctxt)
+static JSONToken *parser_context_pop_token(JSONParserContext *ctxt)
 {
-    qobject_decref(ctxt->current);
+    g_free(ctxt->current);
     assert(!g_queue_is_empty(ctxt->buf));
     ctxt->current = g_queue_pop_head(ctxt->buf);
     return ctxt->current;
 }
 
-static QObject *parser_context_peek_token(JSONParserContext *ctxt)
+static JSONToken *parser_context_peek_token(JSONParserContext *ctxt)
 {
     assert(!g_queue_is_empty(ctxt->buf));
     return g_queue_peek_head(ctxt->buf);
@@ -310,7 +301,7 @@ static void parser_context_free(JSONParserContext *ctxt)
         while (!g_queue_is_empty(ctxt->buf)) {
             parser_context_pop_token(ctxt);
         }
-        qobject_decref(ctxt->current);
+        g_free(ctxt->current);
         g_queue_free(ctxt->buf);
         g_free(ctxt);
     }
@@ -321,7 +312,8 @@ static void parser_context_free(JSONParserContext *ctxt)
  */
 static int parse_pair(JSONParserContext *ctxt, QDict *dict, va_list *ap)
 {
-    QObject *key = NULL, *token = NULL, *value, *peek;
+    QObject *key = NULL, *value;
+    JSONToken *peek, *token;
 
     peek = parser_context_peek_token(ctxt);
     if (peek == NULL) {
@@ -367,7 +359,7 @@ out:
 static QObject *parse_object(JSONParserContext *ctxt, va_list *ap)
 {
     QDict *dict = NULL;
-    QObject *token, *peek;
+    JSONToken *token, *peek;
 
     token = parser_context_peek_token(ctxt);
     if (token == NULL) {
@@ -428,7 +420,7 @@ out:
 static QObject *parse_array(JSONParserContext *ctxt, va_list *ap)
 {
     QList *list = NULL;
-    QObject *token, *peek;
+    JSONToken *token, *peek;
 
     token = parser_context_peek_token(ctxt);
     if (token == NULL) {
@@ -498,14 +490,15 @@ out:
 
 static QObject *parse_keyword(JSONParserContext *ctxt)
 {
-    QObject *token, *ret;
+    JSONToken *token;
+    QObject *ret;
 
     token = parser_context_peek_token(ctxt);
     if (token == NULL) {
         goto out;
     }
 
-    if (token_get_type(token) != JSON_KEYWORD) {
+    if (token->type != JSON_KEYWORD) {
         goto out;
     }
 
@@ -517,7 +510,7 @@ static QObject *parse_keyword(JSONParserContext *ctxt)
     } else if (token_is_keyword(token, "null")) {
         ret = qnull();
     } else {
-        parse_error(ctxt, token, "invalid keyword `%s'", token_get_value(token));
+        parse_error(ctxt, token, "invalid keyword `%s'", token->str);
         goto out;
     }
 
@@ -530,7 +523,8 @@ out:
 
 static QObject *parse_escape(JSONParserContext *ctxt, va_list *ap)
 {
-    QObject *token = NULL, *obj;
+    QObject *obj;
+    JSONToken *token;
 
     if (ap == NULL) {
         goto out;
@@ -541,7 +535,7 @@ static QObject *parse_escape(JSONParserContext *ctxt, va_list *ap)
         goto out;
     }
 
-    if (token_get_type(token) != JSON_ESCAPE) {
+    if (token->type != JSON_ESCAPE) {
         goto out;
     }
 
@@ -574,14 +568,15 @@ out:
 
 static QObject *parse_literal(JSONParserContext *ctxt)
 {
-    QObject *token, *obj;
+    JSONToken *token;
+    QObject *obj;
 
     token = parser_context_peek_token(ctxt);
     if (token == NULL) {
         goto out;
     }
 
-    switch (token_get_type(token)) {
+    switch (token->type) {
     case JSON_STRING:
         parser_context_pop_token(ctxt);
         obj = QOBJECT(qstring_from_escaped_str(ctxt, token));
@@ -603,7 +598,7 @@ static QObject *parse_literal(JSONParserContext *ctxt)
 
         parser_context_pop_token(ctxt);
         errno = 0; /* strtoll doesn't set errno on success */
-        value = strtoll(token_get_value(token), NULL, 10);
+        value = strtoll(token->str, NULL, 10);
         if (errno != ERANGE) {
             obj = QOBJECT(qint_from_int(value));
             break;
@@ -614,7 +609,7 @@ static QObject *parse_literal(JSONParserContext *ctxt)
         parser_context_pop_token(ctxt);
     parse_float:
         /* FIXME dependent on locale */
-        obj = QOBJECT(qfloat_from_double(strtod(token_get_value(token), NULL)));
+        obj = QOBJECT(qfloat_from_double(strtod(token->str, NULL)));
         break;
     default:
         goto out;
diff --git a/qobject/json-streamer.c b/qobject/json-streamer.c
index d2af38f..88b1e41 100644
--- a/qobject/json-streamer.c
+++ b/qobject/json-streamer.c
@@ -11,10 +11,6 @@
  *
  */
 
-#include "qapi/qmp/qlist.h"
-#include "qapi/qmp/qstring.h"
-#include "qapi/qmp/qint.h"
-#include "qapi/qmp/qdict.h"
 #include "qemu-common.h"
 #include "qapi/qmp/json-lexer.h"
 #include "qapi/qmp/json-streamer.h"
@@ -33,7 +29,7 @@ static void json_message_free_tokens(JSONMessageParser *parser)
 static void json_message_process_token(JSONLexer *lexer, GString *input, JSONTokenType type, int x, int y)
 {
     JSONMessageParser *parser = container_of(lexer, JSONMessageParser, lexer);
-    QDict *dict;
+    JSONToken *token;
 
     if (type == JSON_OPERATOR) {
         switch (input->str[0]) {
@@ -54,15 +50,16 @@ static void json_message_process_token(JSONLexer *lexer, GString *input, JSONTok
         }
     }
 
-    dict = qdict_new();
-    qdict_put(dict, "type", qint_from_int(type));
-    qdict_put(dict, "token", qstring_from_str(input->str));
-    qdict_put(dict, "x", qint_from_int(x));
-    qdict_put(dict, "y", qint_from_int(y));
+    token = g_malloc(sizeof(JSONToken) + input->len + 1);
+    token->type = type;
+    memcpy(token->str, input->str, input->len);
+    token->str[input->len] = 0;
+    token->x = x;
+    token->y = y;
 
     parser->token_size += input->len;
 
-    g_queue_push_tail(parser->tokens, dict);
+    g_queue_push_tail(parser->tokens, token);
 
     if (type == JSON_ERROR) {
         goto out_emit_bad;
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
  2015-11-23 17:44 ` [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts Paolo Bonzini
@ 2015-11-23 17:59   ` Laszlo Ersek
  2015-11-23 18:09     ` Paolo Bonzini
                       ` (2 more replies)
  2015-11-25 14:32   ` Markus Armbruster
  1 sibling, 3 replies; 25+ messages in thread
From: Laszlo Ersek @ 2015-11-23 17:59 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: armbru

On 11/23/15 18:44, Paolo Bonzini wrote:
> JSON is LL(1) and our parser indeed needs only 1 token lookahead.
> Saving the parser context is mostly unnecessary; we can replace it
> with peeking at the next token, or remove it altogether when the
> restore only happens on errors.  The token list is destroyed anyway
> on errors.
> 
> The only interesting thing is that parse_keyword always eats
> a TOKEN_KEYWORD, even if it is invalid, so it must come last in
> parse_value (otherwise, NULL is returned, parse_literal is invoked
> and it tries to peek beyond end of input).  This is caught by
> /errors/unterminated/literal, which actually checks for an unterminated
> keyword. ಠ_ಠ

Is it accepted practice to put UTF-8 in commit messages? (Or, actually,
anywhere in patches, except maybe the notes section?)

I'd recommend o_O.

Thanks
Laszlo

> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  qobject/json-parser.c | 59 ++++++++++++++++++---------------------------------
>  1 file changed, 21 insertions(+), 38 deletions(-)
> 
> diff --git a/qobject/json-parser.c b/qobject/json-parser.c
> index ac991ba..7a287ea 100644
> --- a/qobject/json-parser.c
> +++ b/qobject/json-parser.c
> @@ -296,23 +296,6 @@ static QObject *parser_context_peek_token(JSONParserContext *ctxt)
>      return token;
>  }
>  
> -static JSONParserContext parser_context_save(JSONParserContext *ctxt)
> -{
> -    JSONParserContext saved_ctxt = {0};
> -    saved_ctxt.tokens.pos = ctxt->tokens.pos;
> -    saved_ctxt.tokens.count = ctxt->tokens.count;
> -    saved_ctxt.tokens.buf = ctxt->tokens.buf;
> -    return saved_ctxt;
> -}
> -
> -static void parser_context_restore(JSONParserContext *ctxt,
> -                                   JSONParserContext saved_ctxt)
> -{
> -    ctxt->tokens.pos = saved_ctxt.tokens.pos;
> -    ctxt->tokens.count = saved_ctxt.tokens.count;
> -    ctxt->tokens.buf = saved_ctxt.tokens.buf;
> -}
> -
>  static void tokens_append_from_iter(QObject *obj, void *opaque)
>  {
>      JSONParserContext *ctxt = opaque;
> @@ -364,7 +347,6 @@ static void parser_context_free(JSONParserContext *ctxt)
>  static int parse_pair(JSONParserContext *ctxt, QDict *dict, va_list *ap)
>  {
>      QObject *key = NULL, *token = NULL, *value, *peek;
> -    JSONParserContext saved_ctxt = parser_context_save(ctxt);
>  
>      peek = parser_context_peek_token(ctxt);
>      if (peek == NULL) {
> @@ -402,7 +384,6 @@ static int parse_pair(JSONParserContext *ctxt, QDict *dict, va_list *ap)
>      return 0;
>  
>  out:
> -    parser_context_restore(ctxt, saved_ctxt);
>      qobject_decref(key);
>  
>      return -1;
> @@ -412,9 +393,8 @@ static QObject *parse_object(JSONParserContext *ctxt, va_list *ap)
>  {
>      QDict *dict = NULL;
>      QObject *token, *peek;
> -    JSONParserContext saved_ctxt = parser_context_save(ctxt);
>  
> -    token = parser_context_pop_token(ctxt);
> +    token = parser_context_peek_token(ctxt);
>      if (token == NULL) {
>          goto out;
>      }
> @@ -425,6 +405,7 @@ static QObject *parse_object(JSONParserContext *ctxt, va_list *ap)
>  
>      dict = qdict_new();
>  
> +    parser_context_pop_token(ctxt);
>      peek = parser_context_peek_token(ctxt);
>      if (peek == NULL) {
>          parse_error(ctxt, NULL, "premature EOI");
> @@ -465,7 +446,6 @@ static QObject *parse_object(JSONParserContext *ctxt, va_list *ap)
>      return QOBJECT(dict);
>  
>  out:
> -    parser_context_restore(ctxt, saved_ctxt);
>      QDECREF(dict);
>      return NULL;
>  }
> @@ -474,9 +454,8 @@ static QObject *parse_array(JSONParserContext *ctxt, va_list *ap)
>  {
>      QList *list = NULL;
>      QObject *token, *peek;
> -    JSONParserContext saved_ctxt = parser_context_save(ctxt);
>  
> -    token = parser_context_pop_token(ctxt);
> +    token = parser_context_peek_token(ctxt);
>      if (token == NULL) {
>          goto out;
>      }
> @@ -487,6 +466,7 @@ static QObject *parse_array(JSONParserContext *ctxt, va_list *ap)
>  
>      list = qlist_new();
>  
> +    parser_context_pop_token(ctxt);
>      peek = parser_context_peek_token(ctxt);
>      if (peek == NULL) {
>          parse_error(ctxt, NULL, "premature EOI");
> @@ -537,7 +517,6 @@ static QObject *parse_array(JSONParserContext *ctxt, va_list *ap)
>      return QOBJECT(list);
>  
>  out:
> -    parser_context_restore(ctxt, saved_ctxt);
>      QDECREF(list);
>      return NULL;
>  }
> @@ -545,9 +524,8 @@ out:
>  static QObject *parse_keyword(JSONParserContext *ctxt)
>  {
>      QObject *token, *ret;
> -    JSONParserContext saved_ctxt = parser_context_save(ctxt);
>  
> -    token = parser_context_pop_token(ctxt);
> +    token = parser_context_peek_token(ctxt);
>      if (token == NULL) {
>          goto out;
>      }
> @@ -556,6 +534,7 @@ static QObject *parse_keyword(JSONParserContext *ctxt)
>          goto out;
>      }
>  
> +    parser_context_pop_token(ctxt);
>      if (token_is_keyword(token, "true")) {
>          ret = QOBJECT(qbool_from_bool(true));
>      } else if (token_is_keyword(token, "false")) {
> @@ -570,7 +549,6 @@ static QObject *parse_keyword(JSONParserContext *ctxt)
>      return ret;
>  
>  out: 
> -    parser_context_restore(ctxt, saved_ctxt);
>  
>      return NULL;
>  }
> @@ -578,17 +556,21 @@ out:
>  static QObject *parse_escape(JSONParserContext *ctxt, va_list *ap)
>  {
>      QObject *token = NULL, *obj;
> -    JSONParserContext saved_ctxt = parser_context_save(ctxt);
>  
>      if (ap == NULL) {
>          goto out;
>      }
>  
> -    token = parser_context_pop_token(ctxt);
> +    token = parser_context_peek_token(ctxt);
>      if (token == NULL) {
>          goto out;
>      }
>  
> +    if (token_get_type(token) != JSON_ESCAPE) {
> +        goto out;
> +    }
> +
> +    parser_context_pop_token(ctxt);
>      if (token_is_escape(token, "%p")) {
>          obj = va_arg(*ap, QObject *);
>      } else if (token_is_escape(token, "%i")) {
> @@ -611,7 +593,6 @@ static QObject *parse_escape(JSONParserContext *ctxt, va_list *ap)
>      return obj;
>  
>  out:
> -    parser_context_restore(ctxt, saved_ctxt);
>  
>      return NULL;
>  }
> @@ -619,15 +600,15 @@ out:
>  static QObject *parse_literal(JSONParserContext *ctxt)
>  {
>      QObject *token, *obj;
> -    JSONParserContext saved_ctxt = parser_context_save(ctxt);
>  
> -    token = parser_context_pop_token(ctxt);
> +    token = parser_context_peek_token(ctxt);
>      if (token == NULL) {
>          goto out;
>      }
>  
>      switch (token_get_type(token)) {
>      case JSON_STRING:
> +        parser_context_pop_token(ctxt);
>          obj = QOBJECT(qstring_from_escaped_str(ctxt, token));
>          break;
>      case JSON_INTEGER: {
> @@ -645,15 +626,18 @@ static QObject *parse_literal(JSONParserContext *ctxt)
>           */
>          int64_t value;
>  
> +        parser_context_pop_token(ctxt);
>          errno = 0; /* strtoll doesn't set errno on success */
>          value = strtoll(token_get_value(token), NULL, 10);
>          if (errno != ERANGE) {
>              obj = QOBJECT(qint_from_int(value));
>              break;
>          }
> -        /* fall through to JSON_FLOAT */
> +        goto parse_float;
>      }
>      case JSON_FLOAT:
> +        parser_context_pop_token(ctxt);
> +    parse_float:
>          /* FIXME dependent on locale */
>          obj = QOBJECT(qfloat_from_double(strtod(token_get_value(token), NULL)));
>          break;
> @@ -664,7 +648,6 @@ static QObject *parse_literal(JSONParserContext *ctxt)
>      return obj;
>  
>  out:
> -    parser_context_restore(ctxt, saved_ctxt);
>  
>      return NULL;
>  }
> @@ -681,11 +664,11 @@ static QObject *parse_value(JSONParserContext *ctxt, va_list *ap)
>          obj = parse_escape(ctxt, ap);
>      }
>      if (obj == NULL) {
> -        obj = parse_keyword(ctxt);
> -    } 
> -    if (obj == NULL) {
>          obj = parse_literal(ctxt);
>      }
> +    if (obj == NULL) {
> +        obj = parse_keyword(ctxt);
> +    }
>  
>      return obj;
>  }
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
  2015-11-23 17:59   ` Laszlo Ersek
@ 2015-11-23 18:09     ` Paolo Bonzini
  2015-11-23 19:18       ` Laszlo Ersek
  2015-11-23 20:05     ` Eric Blake
  2015-11-24  8:03     ` Gerd Hoffmann
  2 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2015-11-23 18:09 UTC (permalink / raw)
  To: Laszlo Ersek, qemu-devel; +Cc: armbru



On 23/11/2015 18:59, Laszlo Ersek wrote:
>> > The only interesting thing is that parse_keyword always eats
>> > a TOKEN_KEYWORD, even if it is invalid, so it must come last in
>> > parse_value (otherwise, NULL is returned, parse_literal is invoked
>> > and it tries to peek beyond end of input).  This is caught by
>> > /errors/unterminated/literal, which actually checks for an unterminated
>> > keyword. ಠ_ಠ
> Is it accepted practice to put UTF-8 in commit messages? (Or, actually,
> anywhere in patches, except maybe the notes section?)
> 
> I'd recommend o_O.

Perhaps not Kannada, but Unicode 0xA1 to 0x1FF is pretty common in
commit messages.

Paolo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
  2015-11-23 18:09     ` Paolo Bonzini
@ 2015-11-23 19:18       ` Laszlo Ersek
  0 siblings, 0 replies; 25+ messages in thread
From: Laszlo Ersek @ 2015-11-23 19:18 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: armbru

On 11/23/15 19:09, Paolo Bonzini wrote:
> 
> 
> On 23/11/2015 18:59, Laszlo Ersek wrote:
>>>> The only interesting thing is that parse_keyword always eats
>>>> a TOKEN_KEYWORD, even if it is invalid, so it must come last in
>>>> parse_value (otherwise, NULL is returned, parse_literal is invoked
>>>> and it tries to peek beyond end of input).  This is caught by
>>>> /errors/unterminated/literal, which actually checks for an unterminated
>>>> keyword. ಠ_ಠ
>> Is it accepted practice to put UTF-8 in commit messages? (Or, actually,
>> anywhere in patches, except maybe the notes section?)
>>
>> I'd recommend o_O.
> 
> Perhaps not Kannada, but Unicode 0xA1 to 0x1FF is pretty common in
> commit messages.

That seems to be a subset of the union of the following three:

https://en.wikipedia.org/wiki/Latin-1_Supplement_%28Unicode_block%29
https://en.wikipedia.org/wiki/Latin_Extended-A
https://en.wikipedia.org/wiki/Latin_Extended-B

Thanks for the info,
Laszlo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
  2015-11-23 17:59   ` Laszlo Ersek
  2015-11-23 18:09     ` Paolo Bonzini
@ 2015-11-23 20:05     ` Eric Blake
  2015-11-23 20:26       ` Laszlo Ersek
  2015-11-24  8:03     ` Gerd Hoffmann
  2 siblings, 1 reply; 25+ messages in thread
From: Eric Blake @ 2015-11-23 20:05 UTC (permalink / raw)
  To: Laszlo Ersek, Paolo Bonzini, qemu-devel; +Cc: armbru

[-- Attachment #1: Type: text/plain, Size: 1168 bytes --]

On 11/23/2015 10:59 AM, Laszlo Ersek wrote:
> On 11/23/15 18:44, Paolo Bonzini wrote:
>> JSON is LL(1) and our parser indeed needs only 1 token lookahead.
>> Saving the parser context is mostly unnecessary; we can replace it
>> with peeking at the next token, or remove it altogether when the
>> restore only happens on errors.  The token list is destroyed anyway
>> on errors.
>>
>> The only interesting thing is that parse_keyword always eats
>> a TOKEN_KEYWORD, even if it is invalid, so it must come last in
>> parse_value (otherwise, NULL is returned, parse_literal is invoked
>> and it tries to peek beyond end of input).  This is caught by
>> /errors/unterminated/literal, which actually checks for an unterminated
>> keyword. ಠ_ಠ
> 
> Is it accepted practice to put UTF-8 in commit messages? (Or, actually,
> anywhere in patches, except maybe the notes section?)
> 

Git handles UTF-8 just fine (and for any other encoding, properly
transmitted in the email, git transcodes to UTF-8 before writing it into
the repository).

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
  2015-11-23 20:05     ` Eric Blake
@ 2015-11-23 20:26       ` Laszlo Ersek
  0 siblings, 0 replies; 25+ messages in thread
From: Laszlo Ersek @ 2015-11-23 20:26 UTC (permalink / raw)
  To: Eric Blake, Paolo Bonzini, qemu-devel; +Cc: armbru

On 11/23/15 21:05, Eric Blake wrote:
> On 11/23/2015 10:59 AM, Laszlo Ersek wrote:
>> On 11/23/15 18:44, Paolo Bonzini wrote:
>>> JSON is LL(1) and our parser indeed needs only 1 token lookahead.
>>> Saving the parser context is mostly unnecessary; we can replace it
>>> with peeking at the next token, or remove it altogether when the
>>> restore only happens on errors.  The token list is destroyed anyway
>>> on errors.
>>>
>>> The only interesting thing is that parse_keyword always eats
>>> a TOKEN_KEYWORD, even if it is invalid, so it must come last in
>>> parse_value (otherwise, NULL is returned, parse_literal is invoked
>>> and it tries to peek beyond end of input).  This is caught by
>>> /errors/unterminated/literal, which actually checks for an unterminated
>>> keyword. ಠ_ಠ
>>
>> Is it accepted practice to put UTF-8 in commit messages? (Or, actually,
>> anywhere in patches, except maybe the notes section?)
>>
> 
> Git handles UTF-8 just fine (and for any other encoding, properly
> transmitted in the email, git transcodes to UTF-8 before writing it into
> the repository).
> 

Yes, I know. I use latin2:

$ locale

LANG=
LC_CTYPE=hu_HU.ISO8859-2
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

and from my git config:

[i18n]
	logOutputEncoding = latin2
	commitencoding = latin2

This works very well -- as long as it doesn't choke on something outside
of latin2 --, both the glibc locale support and git are doing their jobs
perfectly fine; my question concerned any other users who decided to
stay with single-byte encodings (with an ASCII subset).

(I believe that RFCs stick with ASCII to this day, and I also think that
our source code and docs/ should stick with ASCII; but I know I can't
plausibly argue for the same in commit messages, assuming I'm alone with
that anyway.

BTW I should have written "non-ASCII Unicode code points" in my original
question, rather than "UTF-8".)

Thanks!
Laszlo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 for-2.5? 0/4] qjson: save a lot of memory
  2015-11-23 17:44 [Qemu-devel] [PATCH v2 for-2.5? 0/4] qjson: save a lot of memory Paolo Bonzini
                   ` (3 preceding siblings ...)
  2015-11-23 17:44 ` [Qemu-devel] [PATCH v2 4/4] qjson: surprise, allocating 6 QObjects per token is expensive Paolo Bonzini
@ 2015-11-23 21:00 ` Eric Blake
  2015-11-25 14:47 ` Markus Armbruster
  5 siblings, 0 replies; 25+ messages in thread
From: Eric Blake @ 2015-11-23 21:00 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: armbru

[-- Attachment #1: Type: text/plain, Size: 559 bytes --]

On 11/23/2015 10:44 AM, Paolo Bonzini wrote:
> This patch from 2011 (!) saves about 96% of the allocation cost (down
> from 500 MiB to 20 MiB) for check-qjson.
> 
> Paolo
> 
> Paolo Bonzini (4):
>   qjson: replace QString in JSONLexer with GString
>   qjson: do not save/restore contexts
>   qjson: store tokens in a GQueue
>   qjson: surprise, allocating 6 QObjects per token is expensive

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
  2015-11-23 17:59   ` Laszlo Ersek
  2015-11-23 18:09     ` Paolo Bonzini
  2015-11-23 20:05     ` Eric Blake
@ 2015-11-24  8:03     ` Gerd Hoffmann
  2015-11-24 10:50       ` Laszlo Ersek
  2 siblings, 1 reply; 25+ messages in thread
From: Gerd Hoffmann @ 2015-11-24  8:03 UTC (permalink / raw)
  To: Laszlo Ersek; +Cc: Paolo Bonzini, qemu-devel, armbru

  Hi,

> Is it accepted practice to put UTF-8 in commit messages? (Or, actually,
> anywhere in patches, except maybe the notes section?)

Sure.  We don't want limit people names (in signed-off etc) to us-ascii.

See "eb8934b vnc: fix memory corruption (CVE-2015-5225)" for a name
written in kanji.

> I'd recommend o_O.

Heh, it's 2015, not 1995 ...

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
  2015-11-24  8:03     ` Gerd Hoffmann
@ 2015-11-24 10:50       ` Laszlo Ersek
  2015-11-24 11:18         ` Paolo Bonzini
                           ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Laszlo Ersek @ 2015-11-24 10:50 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Paolo Bonzini, qemu-devel, armbru

On 11/24/15 09:03, Gerd Hoffmann wrote:
>   Hi,
> 
>> Is it accepted practice to put UTF-8 in commit messages? (Or, actually,
>> anywhere in patches, except maybe the notes section?)
> 
> Sure.  We don't want limit people names (in signed-off etc) to us-ascii.
> 
> See "eb8934b vnc: fix memory corruption (CVE-2015-5225)" for a name
> written in kanji.

I'm very sorry, but this is something I expressly disagree with. (I
didn't want to bring this up on my own, but since you did...)

International engineering / science / research etc. are being done in
English. We use English because we expect people to learn one common
language (native English speakers have it easy, but that's a side
point), so that everyone not have to learn every possible language.

The same point applies *much more* to writing systems / alphabets. You
(the generic you) can't expect me (the generic me) to read Kanji,
Sanskrit, Thai script, Cyrillic script, and so on, even if your name is
written in that language natively. You come up with an approximation in
Latin script, and use that.

Is your purpose to feel pleased about the faithful representation of
your name in the commit message (that the international community is
unable to read, not even approximately), or is your goal to allow the
community to read your (approximate) name?

I bet everyone who is involved in international development, and travels
occasionally, has business cards in Latin script *too*. I bet whoever
does research and publishes papers in English puts their name (or at
least an official approximation of it) on the front page in  Latin
script *too*.

Specifically about the commit you mention, the email of the reporter is:

  zuozhi.fzz@alibaba-inc.com

I'm absolutely sure that "zuozhi" is the official Pinyin transliteration
of the reporter's name (or a part of it). Now, while Pinyin has its own
separate pronunciation rules, I *can* (and occasionally do) look up
those rules. So Pinyin allows me to *work* with the name with relative
safety, and it even gives me a fleeting chance at getting the
pronunciation right, should we meet.

My name is László Érsek. I've dropped the accents for the purpose of
international exchange in advance; I just write Laszlo Ersek, even when
I sign physical documents that are in English.

Even that way, I've seen the larger community abuse my name endlessly;
in particular I've seen all permutations (= reordering) and variations
(= missing characters) of the substring "szl" in "Laszlo".

If the larger community fails to get such a simple ASCII name right --
and yes I'm at fault too, I have occasionally left off the second "n" of
your last name --, then why am I (or anyone else) expected to struggle
with names written in non-Latin script? They are much harder, and have
exactly zero value, as far as international collaboration is concerned.

The development is being done in English, and the script of English is
Latin. Stick with it.

>> I'd recommend o_O.
> 
> Heh, it's 2015, not 1995 ...

Sure, and the Internet Standards are still being written in pure ASCII.

https://en.wikipedia.org/wiki/Request_for_Comments#Obtaining_RFCs

Laszlo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
  2015-11-24 10:50       ` Laszlo Ersek
@ 2015-11-24 11:18         ` Paolo Bonzini
  2015-11-24 12:44           ` Fam Zheng
  2015-11-24 11:33         ` Gerd Hoffmann
  2015-11-24 11:39         ` Laszlo Ersek
  2 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2015-11-24 11:18 UTC (permalink / raw)
  To: Laszlo Ersek, Gerd Hoffmann; +Cc: Fam Zheng, qemu-devel, armbru

On 24/11/2015 11:50, Laszlo Ersek wrote:
> You
> (the generic you) can't expect me (the generic me) to read Kanji,
> Sanskrit, Thai script, Cyrillic script, and so on, even if your name is
> written in that language natively. You come up with an approximation in
> Latin script, and use that.
> 
> Is your purpose to feel pleased about the faithful representation of
> your name in the commit message (that the international community is
> unable to read, not even approximately), or is your goal to allow the
> community to read your (approximate) name?
> 
> Specifically about the commit you mention, the email of the reporter is:
> 
>   zuozhi.fzz@alibaba-inc.com
> 
> I'm absolutely sure that "zuozhi" is the official Pinyin transliteration
> of the reporter's name (or a part of it). Now, while Pinyin has its own
> separate pronunciation rules, I *can* (and occasionally do) look up
> those rules. So Pinyin allows me to *work* with the name with relative
> safety, and it even gives me a fleeting chance at getting the
> pronunciation right, should we meet.

I think this is getting into dangerous territory. :)

Chinese language and names are very different from ours and it's
possible that a person for some reason is very attached to the
particular hanzi (kanji is Japanese :)) that are used for their name.
(Fam, correct me if I'm wrong).

I've also seen people who have adopted an alternative English name and
still want to include the Chinese name somewhere, at which point it's
understandable that they write the latter with hanzi.

Certainly it's better if you get something like these:

   Signed-off-by: Gong Li (巩俐) <gong.li@example.com>
   Signed-off-by: 巩俐 (Gong Li) <gong.li@example.com>

but if the email is understandable I have no problem with

   Signed-off-by: 巩俐 <gong.li@example.com>

or in the case of an English name any of

   Signed-off-by: Jane Li (巩俐) <gong.li@example.com>
   Signed-off-by: 巩俐 (Jane Li) <gong.li@example.com>
   Signed-off-by: 巩俐 (Jane Li) <lig@example.com>

where in the last case the email doesn't give the full Chinese name, but
there is an alternative for people who cannot read Chinese characters.

For some reason this almost never happens with Japanese developers, but
https://en.wikipedia.org/wiki/Japanese_name#Difficulty_of_reading_names
suggests that they may also appreciate using kanji for their names in
commit messages.

Thanks,

Paolo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
  2015-11-24 10:50       ` Laszlo Ersek
  2015-11-24 11:18         ` Paolo Bonzini
@ 2015-11-24 11:33         ` Gerd Hoffmann
  2015-11-24 11:39         ` Laszlo Ersek
  2 siblings, 0 replies; 25+ messages in thread
From: Gerd Hoffmann @ 2015-11-24 11:33 UTC (permalink / raw)
  To: Laszlo Ersek; +Cc: Paolo Bonzini, qemu-devel, armbru

  Hi,

> The same point applies *much more* to writing systems / alphabets. You
> (the generic you) can't expect me (the generic me) to read Kanji,
> Sanskrit, Thai script, Cyrillic script, and so on, even if your name is
> written in that language natively. You come up with an approximation in
> Latin script, and use that.

> Is your purpose to feel pleased about the faithful representation of
> your name in the commit message (that the international community is
> unable to read, not even approximately), or is your goal to allow the
> community to read your (approximate) name?

IMO that is for the people in question to decide.  Usually I just cut
+paste (or let stefans great patches script collect) what they are using
them-self and are apparently comfortable with.

Having Kanji only as in that specific case is unusual indeed, in most
cases there is latin transcript, either in place of or additionally to
the native version, like this: "latin (native) <email>".  The latter
looks a bit silly for latin-family of alphabets, for others
(kanji/cyrillic/...) it makes more sense and seems to be more common.

But even when they use native only I still think their name and not only
the email address should appear in the commit message when giving them
the credit they deserve.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
  2015-11-24 10:50       ` Laszlo Ersek
  2015-11-24 11:18         ` Paolo Bonzini
  2015-11-24 11:33         ` Gerd Hoffmann
@ 2015-11-24 11:39         ` Laszlo Ersek
  2 siblings, 0 replies; 25+ messages in thread
From: Laszlo Ersek @ 2015-11-24 11:39 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Paolo Bonzini, qemu-devel, armbru

On 11/24/15 11:50, Laszlo Ersek wrote:

> My name is László Érsek.

OMG, my native name is actually "Érsek László". 95% of my written
communication is in English, and this very discussion is occurring in
English. So in this context I forgot that in Hungarian the family name
goes first.

(Just goes on to show that I'm willing to mangle my name for
collaboration's sake. Others should be too.)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
  2015-11-24 11:18         ` Paolo Bonzini
@ 2015-11-24 12:44           ` Fam Zheng
  2015-11-24 12:54             ` Paolo Bonzini
  0 siblings, 1 reply; 25+ messages in thread
From: Fam Zheng @ 2015-11-24 12:44 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: armbru, Laszlo Ersek, Gerd Hoffmann, qemu-devel

On Tue, 11/24 12:18, Paolo Bonzini wrote:
> 
> 
> On 24/11/2015 11:50, Laszlo Ersek wrote:
> > You
> > (the generic you) can't expect me (the generic me) to read Kanji,
> > Sanskrit, Thai script, Cyrillic script, and so on, even if your name is
> > written in that language natively. You come up with an approximation in
> > Latin script, and use that.
> > 
> > Is your purpose to feel pleased about the faithful representation of
> > your name in the commit message (that the international community is
> > unable to read, not even approximately), or is your goal to allow the
> > community to read your (approximate) name?
> > 
> > Specifically about the commit you mention, the email of the reporter is:
> > 
> >   zuozhi.fzz@alibaba-inc.com
> > 
> > I'm absolutely sure that "zuozhi" is the official Pinyin transliteration
> > of the reporter's name (or a part of it). Now, while Pinyin has its own

Yes, it's the Pinyin of 祚至, his given name. It's a peculiar and literary
choice of name. :)

> > separate pronunciation rules, I *can* (and occasionally do) look up
> > those rules. So Pinyin allows me to *work* with the name with relative
> > safety, and it even gives me a fleeting chance at getting the
> > pronunciation right, should we meet.
> 
> I think this is getting into dangerous territory. :)
> 
> Chinese language and names are very different from ours and it's
> possible that a person for some reason is very attached to the
> particular hanzi (kanji is Japanese :)) that are used for their name.
> (Fam, correct me if I'm wrong).
> 
> I've also seen people who have adopted an alternative English name and
> still want to include the Chinese name somewhere, at which point it's
> understandable that they write the latter with hanzi.
> 
> Certainly it's better if you get something like these:
> 
>    Signed-off-by: Gong Li (巩俐) <gong.li@example.com>
>    Signed-off-by: 巩俐 (Gong Li) <gong.li@example.com>
> 
> but if the email is understandable I have no problem with
> 
>    Signed-off-by: 巩俐 <gong.li@example.com>
> 
> or in the case of an English name any of
> 
>    Signed-off-by: Jane Li (巩俐) <gong.li@example.com>
>    Signed-off-by: 巩俐 (Jane Li) <gong.li@example.com>
>    Signed-off-by: 巩俐 (Jane Li) <lig@example.com>
> 
> where in the last case the email doesn't give the full Chinese name, but
> there is an alternative for people who cannot read Chinese characters.
> 
> For some reason this almost never happens with Japanese developers, but
> https://en.wikipedia.org/wiki/Japanese_name#Difficulty_of_reading_names
> suggests that they may also appreciate using kanji for their names in
> commit messages.

The problem for using Pinyin is, even I as a Chinese can't tell which is the
last name out of a two-syllable name such as "Gong Li", because of obvious
reasons.

Having hanzi together with Pinyin helps a lot, in that it tells me how to call
it, should we meet. Not just in "which is the family name", but also which
tone to use. In Chinese, you're not calling one's name if not using the right
tone. For example "zuozhi" can be "左之" or "祚至" or one of other 100
combinations, which vary in tones. Personally I don't feel very comfortable
when a Mandarin speaker calls me with a wrong tone. And I know that will happen
from time to time if I stick to Pinyin rather than Fam.  Thinking from the
other way round, that may be a reason why they use hanzi.

All in all, in my opinion, only something like

     Signed-off-by: 巩俐 <lig@example.com>

, from which you can't easily infer the pronunciation, is apparently
inconsiderate in an English context. Otherwise I think we should tolerate the
usage of non-latin characters even if it means we have to copy&paste.

Fam

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
  2015-11-24 12:44           ` Fam Zheng
@ 2015-11-24 12:54             ` Paolo Bonzini
  2015-11-24 13:15               ` Markus Armbruster
  0 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2015-11-24 12:54 UTC (permalink / raw)
  To: Fam Zheng; +Cc: armbru, Laszlo Ersek, Gerd Hoffmann, qemu-devel



On 24/11/2015 13:44, Fam Zheng wrote:
> All in all, in my opinion, only something like
> 
>      Signed-off-by: 巩俐 <lig@example.com>
> 
> , from which you can't easily infer the pronunciation, is apparently
> inconsiderate in an English context. Otherwise I think we should tolerate the
> usage of non-latin characters even if it means we have to copy&paste.

FWIW, I totally agree.

Paolo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
  2015-11-24 12:54             ` Paolo Bonzini
@ 2015-11-24 13:15               ` Markus Armbruster
  0 siblings, 0 replies; 25+ messages in thread
From: Markus Armbruster @ 2015-11-24 13:15 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Laszlo Ersek, Fam Zheng, Gerd Hoffmann, qemu-devel

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 24/11/2015 13:44, Fam Zheng wrote:
>> All in all, in my opinion, only something like
>> 
>>      Signed-off-by: 巩俐 <lig@example.com>
>> 
>> , from which you can't easily infer the pronunciation, is apparently
>> inconsiderate in an English context. Otherwise I think we should tolerate the
>> usage of non-latin characters even if it means we have to copy&paste.
>
> FWIW, I totally agree.

Me too.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/4] qjson: replace QString in JSONLexer with GString
  2015-11-23 17:44 ` [Qemu-devel] [PATCH v2 1/4] qjson: replace QString in JSONLexer with GString Paolo Bonzini
@ 2015-11-25 12:48   ` Markus Armbruster
  2015-11-25 13:34     ` Paolo Bonzini
  0 siblings, 1 reply; 25+ messages in thread
From: Markus Armbruster @ 2015-11-25 12:48 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

Paolo Bonzini <pbonzini@redhat.com> writes:

> JSONLexer only needs a simple resizable buffer.  json-streamer.c
> can allocate memory for each token instead of relying on reference
> counting of QStrings.

On its own, this doesn't really save anything, it merely decouples the
lexer's buffer from the rest of the machinery, which still allocates the
same QStrings as before.  But it enables the rest of your work.

> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  include/qapi/qmp/json-lexer.h    |  7 +++----
>  include/qapi/qmp/json-streamer.h |  1 +
>  qobject/json-lexer.c             | 22 ++++++++--------------
>  qobject/json-streamer.c          | 10 +++++-----
>  4 files changed, 17 insertions(+), 23 deletions(-)
>
> diff --git a/include/qapi/qmp/json-lexer.h b/include/qapi/qmp/json-lexer.h
> index cdff046..584b27d 100644
> --- a/include/qapi/qmp/json-lexer.h
> +++ b/include/qapi/qmp/json-lexer.h
> @@ -14,8 +14,7 @@
>  #ifndef QEMU_JSON_LEXER_H
>  #define QEMU_JSON_LEXER_H
>  
> -#include "qapi/qmp/qstring.h"
> -#include "qapi/qmp/qlist.h"
> +#include "glib-compat.h"
>  
>  typedef enum json_token_type {
>      JSON_OPERATOR = 100,
> @@ -30,13 +29,13 @@ typedef enum json_token_type {
>  
>  typedef struct JSONLexer JSONLexer;
>  
> -typedef void (JSONLexerEmitter)(JSONLexer *, QString *, JSONTokenType, int x, int y);
> +typedef void (JSONLexerEmitter)(JSONLexer *, GString *, JSONTokenType, int x, int y);
>  
>  struct JSONLexer
>  {
>      JSONLexerEmitter *emit;
>      int state;
> -    QString *token;
> +    GString *token;
>      int x, y;
>  };
>  
> diff --git a/include/qapi/qmp/json-streamer.h b/include/qapi/qmp/json-streamer.h
> index 823f7d7..e901144 100644
> --- a/include/qapi/qmp/json-streamer.h
> +++ b/include/qapi/qmp/json-streamer.h
> @@ -14,6 +14,7 @@
>  #ifndef QEMU_JSON_STREAMER_H
>  #define QEMU_JSON_STREAMER_H
>  
> +#include <stdint.h>

I guess you need this because json-lexer.h no longer includes it via
qstring.h.  The only use appars to be uint64_t token_size, where size_t
would be more appropriate.  Follow-up cleanup.

>  #include "qapi/qmp/qlist.h"
>  #include "qapi/qmp/json-lexer.h"
>  
> diff --git a/qobject/json-lexer.c b/qobject/json-lexer.c
> index b19623e..c2060f8 100644
> --- a/qobject/json-lexer.c
> +++ b/qobject/json-lexer.c
> @@ -11,12 +11,9 @@
>   *
>   */
>  
> -#include "qapi/qmp/qstring.h"
> -#include "qapi/qmp/qlist.h"
> -#include "qapi/qmp/qdict.h"
> -#include "qapi/qmp/qint.h"
>  #include "qemu-common.h"
>  #include "qapi/qmp/json-lexer.h"
> +#include <stdint.h>
>  
>  #define MAX_TOKEN_SIZE (64ULL << 20)
>  
> @@ -272,7 +269,7 @@ void json_lexer_init(JSONLexer *lexer, JSONLexerEmitter func)
>  {
>      lexer->emit = func;
>      lexer->state = IN_START;
> -    lexer->token = qstring_new();
> +    lexer->token = g_string_sized_new(3);

Where does the 3 come from?

As long as g_string_truncate() doesn't reallocate, lexer->token will
grow to hold the largest token seen so far, and stay there.

Why not MAX_TOKEN_SIZE and be done with it?  Oh, it's 64 MiB.  Ookay,
I'd call that... generous.

>      lexer->x = lexer->y = 0;
>  }
>  
> @@ -290,7 +287,7 @@ static int json_lexer_feed_char(JSONLexer *lexer, char ch, bool flush)
>          new_state = json_lexer[lexer->state][(uint8_t)ch];
>          char_consumed = !TERMINAL_NEEDED_LOOKAHEAD(lexer->state, new_state);
>          if (char_consumed) {
> -            qstring_append_chr(lexer->token, ch);
> +            g_string_append_c(lexer->token, ch);
>          }
>  
>          switch (new_state) {
> @@ -303,8 +300,7 @@ static int json_lexer_feed_char(JSONLexer *lexer, char ch, bool flush)
>              lexer->emit(lexer, lexer->token, new_state, lexer->x, lexer->y);
>              /* fall through */
>          case JSON_SKIP:
> -            QDECREF(lexer->token);
> -            lexer->token = qstring_new();
> +            g_string_truncate(lexer->token, 0);
>              new_state = IN_START;
>              break;

Before your patch, we effectively hand QString * lexer->token off to
whatever emit() is.

Now, it's a GString, and we want it back.

Okay, because below the obfuscating abstraction, emit() is
json_message_process_token(), and you update it accordingly, see last
patch hunk.

>          case IN_ERROR:
> @@ -322,8 +318,7 @@ static int json_lexer_feed_char(JSONLexer *lexer, char ch, bool flush)
>               * induce an error/flush state.
>               */
>              lexer->emit(lexer, lexer->token, JSON_ERROR, lexer->x, lexer->y);
> -            QDECREF(lexer->token);
> -            lexer->token = qstring_new();
> +            g_string_truncate(lexer->token, 0);
>              new_state = IN_START;
>              lexer->state = new_state;
>              return 0;

Likewise.

> @@ -336,10 +331,9 @@ static int json_lexer_feed_char(JSONLexer *lexer, char ch, bool flush)
>      /* Do not let a single token grow to an arbitrarily large size,
>       * this is a security consideration.
>       */
> -    if (lexer->token->length > MAX_TOKEN_SIZE) {
> +    if (lexer->token->len > MAX_TOKEN_SIZE) {
>          lexer->emit(lexer, lexer->token, lexer->state, lexer->x, lexer->y);
> -        QDECREF(lexer->token);
> -        lexer->token = qstring_new();
> +        g_string_truncate(lexer->token, 0);
>          lexer->state = IN_START;
>      }
>  

Likewise.

Preexisting: splitting long tokens with an axe like that is wrong.  If
the token is too long, we should error out.  Outside this patch's scope.

> @@ -369,5 +363,5 @@ int json_lexer_flush(JSONLexer *lexer)
>  
>  void json_lexer_destroy(JSONLexer *lexer)
>  {
> -    QDECREF(lexer->token);
> +    g_string_free(lexer->token, true);
>  }
> diff --git a/qobject/json-streamer.c b/qobject/json-streamer.c
> index 1b2f9b1..f240501 100644
> --- a/qobject/json-streamer.c
> +++ b/qobject/json-streamer.c
> @@ -12,6 +12,7 @@
>   */
>  
>  #include "qapi/qmp/qlist.h"
> +#include "qapi/qmp/qstring.h"
>  #include "qapi/qmp/qint.h"
>  #include "qapi/qmp/qdict.h"
>  #include "qemu-common.h"
> @@ -21,13 +22,13 @@
>  #define MAX_TOKEN_SIZE (64ULL << 20)
>  #define MAX_NESTING (1ULL << 10)
>  
> -static void json_message_process_token(JSONLexer *lexer, QString *token, JSONTokenType type, int x, int y)
> +static void json_message_process_token(JSONLexer *lexer, GString *input, JSONTokenType type, int x, int y)
>  {
>      JSONMessageParser *parser = container_of(lexer, JSONMessageParser, lexer);
>      QDict *dict;
>  
>      if (type == JSON_OPERATOR) {
> -        switch (qstring_get_str(token)[0]) {
> +        switch (input->str[0]) {
>          case '{':
>              parser->brace_count++;
>              break;
> @@ -47,12 +48,11 @@ static void json_message_process_token(JSONLexer *lexer, QString *token, JSONTok
>  
>      dict = qdict_new();
>      qdict_put(dict, "type", qint_from_int(type));
> -    QINCREF(token);
> -    qdict_put(dict, "token", token);
> +    qdict_put(dict, "token", qstring_from_str(input->str));
>      qdict_put(dict, "x", qint_from_int(x));
>      qdict_put(dict, "y", qint_from_int(y));
>  
> -    parser->token_size += token->length;
> +    parser->token_size += input->len;
>  
>      qlist_append(parser->tokens, dict);

Before your patch: take a reference to token.

Now: copy it out.  Good.

Preexisting: storing the token's textual representation is necessary for
strings, defensible for numbers, and a total waste for everything else.
But we can tolerate some waste.  Just not the *egregious* waste this
series fixes.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/4] qjson: replace QString in JSONLexer with GString
  2015-11-25 12:48   ` Markus Armbruster
@ 2015-11-25 13:34     ` Paolo Bonzini
  0 siblings, 0 replies; 25+ messages in thread
From: Paolo Bonzini @ 2015-11-25 13:34 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: qemu-devel

On 25/11/2015 13:48, Markus Armbruster wrote:
> Where does the 3 come from?

My hat. :)  I don't know---it was in the old patches, and I just brought
it on.

Oh, found it.  GString adds 1 to the size (for the \0); allocating 4
bytes probably seemed like a good idea at the time.

However, it also computes the power of two that is >= size + 1, and
g_string_new uses 2 as the default length so it _also_ results in
allocating four bytes (ceil_pow2(2+1) == 4).

Paolo

> As long as g_string_truncate() doesn't reallocate, lexer->token will
> grow to hold the largest token seen so far, and stay there.
> 
> Why not MAX_TOKEN_SIZE and be done with it?  Oh, it's 64 MiB.  Ookay,
> I'd call that... generous.
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
  2015-11-23 17:44 ` [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts Paolo Bonzini
  2015-11-23 17:59   ` Laszlo Ersek
@ 2015-11-25 14:32   ` Markus Armbruster
  1 sibling, 0 replies; 25+ messages in thread
From: Markus Armbruster @ 2015-11-25 14:32 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

Paolo Bonzini <pbonzini@redhat.com> writes:

> JSON is LL(1) and our parser indeed needs only 1 token lookahead.
> Saving the parser context is mostly unnecessary;

"Mit Kanonen auf Spatzen" (shooting cannons at sparrows).

>                                                  we can replace it
> with peeking at the next token, or remove it altogether when the
> restore only happens on errors.  The token list is destroyed anyway
> on errors.
>
> The only interesting thing is that parse_keyword always eats
> a TOKEN_KEYWORD, even if it is invalid, so it must come last in
> parse_value (otherwise, NULL is returned, parse_literal is invoked
> and it tries to peek beyond end of input).  This is caught by
> /errors/unterminated/literal, which actually checks for an unterminated
> keyword. ಠ_ಠ
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  qobject/json-parser.c | 59 ++++++++++++++++++---------------------------------
>  1 file changed, 21 insertions(+), 38 deletions(-)
>
> diff --git a/qobject/json-parser.c b/qobject/json-parser.c
> index ac991ba..7a287ea 100644
> --- a/qobject/json-parser.c
> +++ b/qobject/json-parser.c
> @@ -296,23 +296,6 @@ static QObject *parser_context_peek_token(JSONParserContext *ctxt)
>      return token;
>  }
>  
> -static JSONParserContext parser_context_save(JSONParserContext *ctxt)
> -{
> -    JSONParserContext saved_ctxt = {0};
> -    saved_ctxt.tokens.pos = ctxt->tokens.pos;
> -    saved_ctxt.tokens.count = ctxt->tokens.count;
> -    saved_ctxt.tokens.buf = ctxt->tokens.buf;
> -    return saved_ctxt;
> -}
> -
> -static void parser_context_restore(JSONParserContext *ctxt,
> -                                   JSONParserContext saved_ctxt)
> -{
> -    ctxt->tokens.pos = saved_ctxt.tokens.pos;
> -    ctxt->tokens.count = saved_ctxt.tokens.count;
> -    ctxt->tokens.buf = saved_ctxt.tokens.buf;
> -}
> -

This saves and restores tokens, which is an array @buf of @count tokens
with a cursor @pos.

@buf and count only ever change in parser_context_new().  Saving and
restoring them has always been pointless.

What this actually does is saving and restoring the cursor.

>  static void tokens_append_from_iter(QObject *obj, void *opaque)
>  {
>      JSONParserContext *ctxt = opaque;
> @@ -364,7 +347,6 @@ static void parser_context_free(JSONParserContext *ctxt)
>  static int parse_pair(JSONParserContext *ctxt, QDict *dict, va_list *ap)
>  {
>      QObject *key = NULL, *token = NULL, *value, *peek;
> -    JSONParserContext saved_ctxt = parser_context_save(ctxt);
>  
>      peek = parser_context_peek_token(ctxt);
>      if (peek == NULL) {
> @@ -402,7 +384,6 @@ static int parse_pair(JSONParserContext *ctxt, QDict *dict, va_list *ap)
>      return 0;
>  
>  out:
> -    parser_context_restore(ctxt, saved_ctxt);
>      qobject_decref(key);
>  
>      return -1;

Before your patch: on error, we rewind the cursor to where it was on
entry.  Useful in a backtracking parser, but this shouldn't be one.

Now: we leave it wherever we error out.

Fine, because the sole caller parse_object() won't actually use it on
error.

> @@ -412,9 +393,8 @@ static QObject *parse_object(JSONParserContext *ctxt, va_list *ap)
>  {
>      QDict *dict = NULL;
>      QObject *token, *peek;
> -    JSONParserContext saved_ctxt = parser_context_save(ctxt);
>  
> -    token = parser_context_pop_token(ctxt);
> +    token = parser_context_peek_token(ctxt);
>      if (token == NULL) {
>          goto out;
>      }
> @@ -425,6 +405,7 @@ static QObject *parse_object(JSONParserContext *ctxt, va_list *ap)
>  
>      dict = qdict_new();
>  
> +    parser_context_pop_token(ctxt);
>      peek = parser_context_peek_token(ctxt);
>      if (peek == NULL) {
>          parse_error(ctxt, NULL, "premature EOI");
> @@ -465,7 +446,6 @@ static QObject *parse_object(JSONParserContext *ctxt, va_list *ap)
>      return QOBJECT(dict);
>  
>  out:
> -    parser_context_restore(ctxt, saved_ctxt);
>      QDECREF(dict);
>      return NULL;
>  }

I'm not 100% when exactly parser_context_peek_token() returns null, but
I think can see what your patch does anyway.

Before: if we can parse an object, advance cursor behind the object and
return it, else leave the cursor alone and return null.  The sole caller
parse_value() relies on this behavior to try alternatives until one
succeeds.

After: if we can parse an object, same as before, else advance the
cursor to right before the offending token and return null.

Consider QMP input { 1 [ 2 ] }.

    json_parser_parse_err():
        parse_value() @ pos=0:
            triy parse_object() @ pos=0:
                consume '{'
                parse_pair() @ pos=1:
                    parse_value() @ pos=1:
                        try ..., consume 1, now pos=2
                    fail with "key is not a string in object"
                propagate failure
            try parse_array() @ pos=2:
                consume [ 2 ], return the QList, now pos=5
            return the QList
        throw away the error (caller json_parser_parse() passed errp=NULL)
        return the QList

Fortunately, that's not valid QMP, and we get

    {"error": {"class": "GenericError", "desc": "Expected 'object' in QMP input"}}

Shouldn't be hard to fix. we just have to complete the job of turning
this thing into a recursive descent parser.  I'll give it a shot.

[...]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 for-2.5? 0/4] qjson: save a lot of memory
  2015-11-23 17:44 [Qemu-devel] [PATCH v2 for-2.5? 0/4] qjson: save a lot of memory Paolo Bonzini
                   ` (4 preceding siblings ...)
  2015-11-23 21:00 ` [Qemu-devel] [PATCH v2 for-2.5? 0/4] qjson: save a lot of memory Eric Blake
@ 2015-11-25 14:47 ` Markus Armbruster
  2015-11-25 18:08   ` Paolo Bonzini
  5 siblings, 1 reply; 25+ messages in thread
From: Markus Armbruster @ 2015-11-25 14:47 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

Paolo Bonzini <pbonzini@redhat.com> writes:

> This patch from 2011 (!) saves about 96% of the allocation cost (down
> from 500 MiB to 20 MiB) for check-qjson.

Looks good to me apart from the incomplete transition to recursive
descent in PATCH 2.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 for-2.5? 0/4] qjson: save a lot of memory
  2015-11-25 14:47 ` Markus Armbruster
@ 2015-11-25 18:08   ` Paolo Bonzini
  2015-11-25 18:34     ` Markus Armbruster
  0 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2015-11-25 18:08 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: qemu-devel

On 25/11/2015 15:47, Markus Armbruster wrote:
> > This patch from 2011 (!) saves about 96% of the allocation cost (down
> > from 500 MiB to 20 MiB) for check-qjson.
> 
> Looks good to me apart from the incomplete transition to recursive
> descent in PATCH 2.

I do not understand whether that is a problem.  Is the remark from the
commit message incorrect ("Saving the parser context is mostly
unnecessary; we can replace it with peeking at the next token, or remove
it altogether when the restore only happens on errors.  The token list
is destroyed anyway on errors.")?

Paolo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 for-2.5? 0/4] qjson: save a lot of memory
  2015-11-25 18:08   ` Paolo Bonzini
@ 2015-11-25 18:34     ` Markus Armbruster
  0 siblings, 0 replies; 25+ messages in thread
From: Markus Armbruster @ 2015-11-25 18:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 25/11/2015 15:47, Markus Armbruster wrote:
>> > This patch from 2011 (!) saves about 96% of the allocation cost (down
>> > from 500 MiB to 20 MiB) for check-qjson.
>> 
>> Looks good to me apart from the incomplete transition to recursive
>> descent in PATCH 2.
>
> I do not understand whether that is a problem.  Is the remark from the
> commit message incorrect ("Saving the parser context is mostly
> unnecessary; we can replace it with peeking at the next token, or remove
> it altogether when the restore only happens on errors.  The token list
> is destroyed anyway on errors.")?

The token sequence

    { 1 [ 2 ] }

gets parsed as [ 2 ].  See my reply to PATCH 2 for a more detailed
explanation.

I didn't bother to construct an example where QMP executes the incorrect
result, and I'm not sure it's possible.

Regardless, I'm fixing it, by simplifying the parser some more.

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2015-11-25 18:34 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-11-23 17:44 [Qemu-devel] [PATCH v2 for-2.5? 0/4] qjson: save a lot of memory Paolo Bonzini
2015-11-23 17:44 ` [Qemu-devel] [PATCH v2 1/4] qjson: replace QString in JSONLexer with GString Paolo Bonzini
2015-11-25 12:48   ` Markus Armbruster
2015-11-25 13:34     ` Paolo Bonzini
2015-11-23 17:44 ` [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts Paolo Bonzini
2015-11-23 17:59   ` Laszlo Ersek
2015-11-23 18:09     ` Paolo Bonzini
2015-11-23 19:18       ` Laszlo Ersek
2015-11-23 20:05     ` Eric Blake
2015-11-23 20:26       ` Laszlo Ersek
2015-11-24  8:03     ` Gerd Hoffmann
2015-11-24 10:50       ` Laszlo Ersek
2015-11-24 11:18         ` Paolo Bonzini
2015-11-24 12:44           ` Fam Zheng
2015-11-24 12:54             ` Paolo Bonzini
2015-11-24 13:15               ` Markus Armbruster
2015-11-24 11:33         ` Gerd Hoffmann
2015-11-24 11:39         ` Laszlo Ersek
2015-11-25 14:32   ` Markus Armbruster
2015-11-23 17:44 ` [Qemu-devel] [PATCH v2 3/4] qjson: store tokens in a GQueue Paolo Bonzini
2015-11-23 17:44 ` [Qemu-devel] [PATCH v2 4/4] qjson: surprise, allocating 6 QObjects per token is expensive Paolo Bonzini
2015-11-23 21:00 ` [Qemu-devel] [PATCH v2 for-2.5? 0/4] qjson: save a lot of memory Eric Blake
2015-11-25 14:47 ` Markus Armbruster
2015-11-25 18:08   ` Paolo Bonzini
2015-11-25 18:34     ` Markus Armbruster

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).