From: Markus Armbruster <armbru@redhat.com>
To: qemu-devel@nongnu.org
Cc: marcandre.lureau@redhat.com, mdroth@linux.vnet.ibm.com,
eblake@redhat.com
Subject: [Qemu-devel] [PATCH 18/56] json: Revamp lexer documentation
Date: Wed, 8 Aug 2018 14:02:56 +0200 [thread overview]
Message-ID: <20180808120334.10970-19-armbru@redhat.com> (raw)
In-Reply-To: <20180808120334.10970-1-armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
qobject/json-lexer.c | 80 +++++++++++++++++++++++++++++++++++++++-----
1 file changed, 71 insertions(+), 9 deletions(-)
diff --git a/qobject/json-lexer.c b/qobject/json-lexer.c
index e85e9a78ff..109a7d8bb8 100644
--- a/qobject/json-lexer.c
+++ b/qobject/json-lexer.c
@@ -18,21 +18,83 @@
#define MAX_TOKEN_SIZE (64ULL << 20)
/*
- * Required by JSON (RFC 7159):
+ * From RFC 7159 "The JavaScript Object Notation (JSON) Data
+ * Interchange Format", with [comments in brackets]:
*
- * \"([^\\\"]|\\[\"'\\/bfnrt]|\\u[0-9a-fA-F]{4})*\"
- * -?(0|[1-9][0-9]*)(.[0-9]+)?([eE][-+]?[0-9]+)?
- * [{}\[\],:]
- * [a-z]+ # covers null, true, false
+ * The set of tokens includes six structural characters, strings,
+ * numbers, and three literal names.
*
- * Extension of '' strings:
+ * These are the six structural characters:
*
- * '([^\\']|\\[\"'\\/bfnrt]|\\u[0-9a-fA-F]{4})*'
+ * begin-array = ws %x5B ws ; [ left square bracket
+ * begin-object = ws %x7B ws ; { left curly bracket
+ * end-array = ws %x5D ws ; ] right square bracket
+ * end-object = ws %x7D ws ; } right curly bracket
+ * name-separator = ws %x3A ws ; : colon
+ * value-separator = ws %x2C ws ; , comma
*
- * Extension for vararg handling in JSON construction:
+ * Insignificant whitespace is allowed before or after any of the six
+ * structural characters.
+ * [This lexer accepts it before or after any token, which is actually
+ * the same, as the grammar always has structural characters between
+ * other tokens.]
*
- * %((l|ll|I64)?d|[ipsf])
+ * ws = *(
+ * %x20 / ; Space
+ * %x09 / ; Horizontal tab
+ * %x0A / ; Line feed or New line
+ * %x0D ) ; Carriage return
*
+ * [...] three literal names:
+ * false null true
+ * [This lexer accepts [a-z]+, and leaves rejecting unknown literal
+ * names to the parser.]
+ *
+ * [Numbers:]
+ *
+ * number = [ minus ] int [ frac ] [ exp ]
+ * decimal-point = %x2E ; .
+ * digit1-9 = %x31-39 ; 1-9
+ * e = %x65 / %x45 ; e E
+ * exp = e [ minus / plus ] 1*DIGIT
+ * frac = decimal-point 1*DIGIT
+ * int = zero / ( digit1-9 *DIGIT )
+ * minus = %x2D ; -
+ * plus = %x2B ; +
+ * zero = %x30 ; 0
+ *
+ * [Strings:]
+ * string = quotation-mark *char quotation-mark
+ *
+ * char = unescaped /
+ * escape (
+ * %x22 / ; " quotation mark U+0022
+ * %x5C / ; \ reverse solidus U+005C
+ * %x2F / ; / solidus U+002F
+ * %x62 / ; b backspace U+0008
+ * %x66 / ; f form feed U+000C
+ * %x6E / ; n line feed U+000A
+ * %x72 / ; r carriage return U+000D
+ * %x74 / ; t tab U+0009
+ * %x75 4HEXDIG ) ; uXXXX U+XXXX
+ * escape = %x5C ; \
+ * quotation-mark = %x22 ; "
+ * unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
+ *
+ *
+ * Extensions over RFC 7159:
+ * - Extra escape sequence in strings:
+ * 0x27 (apostrophe) is recognized after escape, too
+ * - Single-quoted strings:
+ * Like double-quoted strings, except they're delimited by %x27
+ * (apostrophe) instead of %x22 (quotation mark), and can't contain
+ * unescaped apostrophe, but can contain unescaped quotation mark.
+ * - Interpolation:
+ * interpolation = %((l|ll|I64)[du]|[ipsf])
+ *
+ * Note:
+ * - Input must be encoded in UTF-8.
+ * - Decoding and validating is left to the parser.
*/
enum json_lexer_state {
--
2.17.1
next prev parent reply other threads:[~2018-08-08 12:03 UTC|newest]
Thread overview: 162+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-08 12:02 [Qemu-devel] [PATCH 00/56] json: Fixes, error reporting improvements, cleanups Markus Armbruster
2018-08-08 12:02 ` [Qemu-devel] [PATCH 01/56] check-qjson: Cover multiple JSON objects in same string Markus Armbruster
2018-08-09 13:25 ` Eric Blake
2018-08-08 12:02 ` [Qemu-devel] [PATCH 02/56] check-qjson: Cover blank and lexically erroneous input Markus Armbruster
2018-08-09 13:29 ` Eric Blake
2018-08-10 13:40 ` Markus Armbruster
2018-08-08 12:02 ` [Qemu-devel] [PATCH 03/56] check-qjson: Cover whitespace more thoroughly Markus Armbruster
2018-08-09 13:36 ` Eric Blake
2018-08-10 13:43 ` Markus Armbruster
2018-08-08 12:02 ` [Qemu-devel] [PATCH 04/56] qmp-cmd-test: Split off qmp-test Markus Armbruster
2018-08-09 13:38 ` Eric Blake
2018-08-10 13:49 ` Markus Armbruster
2018-08-08 12:02 ` [Qemu-devel] [PATCH 05/56] qmp-test: Cover syntax and lexical errors Markus Armbruster
2018-08-09 13:42 ` Eric Blake
2018-08-10 13:52 ` Markus Armbruster
2018-08-10 14:06 ` Eric Blake
2018-08-16 12:44 ` Markus Armbruster
2018-08-08 12:02 ` [Qemu-devel] [PATCH 06/56] test-qga: Clean up how we test QGA synchronization Markus Armbruster
2018-08-09 13:46 ` Eric Blake
2018-08-10 13:57 ` Markus Armbruster
2018-08-08 12:02 ` [Qemu-devel] [PATCH 07/56] check-qjson: Cover escaped characters more thoroughly, part 1 Markus Armbruster
2018-08-09 13:54 ` Eric Blake
2018-08-10 14:03 ` Markus Armbruster
2018-08-09 14:00 ` Eric Blake
2018-08-10 14:11 ` Markus Armbruster
2018-08-08 12:02 ` [Qemu-devel] [PATCH 08/56] check-qjson: Streamline escaped_string()'s test strings Markus Armbruster
2018-08-09 13:57 ` Eric Blake
2018-08-10 14:15 ` Markus Armbruster
2018-08-08 12:02 ` [Qemu-devel] [PATCH 09/56] check-qjson: Cover escaped characters more thoroughly, part 2 Markus Armbruster
2018-08-09 14:03 ` Eric Blake
2018-08-10 14:16 ` Markus Armbruster
2018-08-08 12:02 ` [Qemu-devel] [PATCH 10/56] check-qjson: Drop redundant string tests Markus Armbruster
2018-08-09 14:04 ` Eric Blake
2018-08-08 12:02 ` [Qemu-devel] [PATCH 11/56] check-qjson: Cover UTF-8 in single quoted strings Markus Armbruster
2018-08-09 14:17 ` Eric Blake
2018-08-10 14:18 ` Markus Armbruster
2018-08-10 14:59 ` Eric Blake
2018-08-13 6:11 ` Markus Armbruster
2018-08-13 14:53 ` Eric Blake
2018-08-14 6:01 ` Markus Armbruster
2018-08-08 12:02 ` [Qemu-devel] [PATCH 12/56] check-qjson: Simplify utf8_string() Markus Armbruster
2018-08-09 14:20 ` Eric Blake
2018-08-08 12:02 ` [Qemu-devel] [PATCH 13/56] check-qjson: Fix utf8_string() to test all invalid sequences Markus Armbruster
2018-08-09 14:22 ` Eric Blake
2018-08-08 12:02 ` [Qemu-devel] [PATCH 14/56] check-qjson qmp-test: Cover control characters more thoroughly Markus Armbruster
2018-08-09 17:24 ` Eric Blake
2018-08-08 12:02 ` [Qemu-devel] [PATCH 15/56] check-qjson: Cover interpolation " Markus Armbruster
2018-08-09 17:26 ` Eric Blake
2018-08-08 12:02 ` [Qemu-devel] [PATCH 16/56] json: Fix lexer to include the bad character in JSON_ERROR token Markus Armbruster
2018-08-09 17:42 ` Eric Blake
2018-08-08 12:02 ` [Qemu-devel] [PATCH 17/56] json: Reject unescaped control characters Markus Armbruster
2018-08-09 18:26 ` Eric Blake
2018-08-10 14:26 ` Markus Armbruster
2018-08-08 12:02 ` Markus Armbruster [this message]
2018-08-09 18:49 ` [Qemu-devel] [PATCH 18/56] json: Revamp lexer documentation Eric Blake
2018-08-10 14:31 ` Markus Armbruster
2018-08-10 15:02 ` Eric Blake
2018-08-13 6:12 ` Markus Armbruster
2018-08-08 12:02 ` [Qemu-devel] [PATCH 19/56] json: Tighten and simplify qstring_from_escaped_str()'s loop Markus Armbruster
2018-08-09 18:52 ` Eric Blake
2018-08-08 12:02 ` [Qemu-devel] [PATCH 20/56] check-qjson: Document we expect invalid UTF-8 to be rejected Markus Armbruster
2018-08-09 18:55 ` Eric Blake
2018-08-08 12:02 ` [Qemu-devel] [PATCH 21/56] json: Reject invalid UTF-8 sequences Markus Armbruster
2018-08-09 22:16 ` Eric Blake
2018-08-10 14:40 ` Markus Armbruster
2018-08-10 15:21 ` Eric Blake
2018-08-16 14:50 ` Markus Armbruster
2018-08-08 12:03 ` [Qemu-devel] [PATCH 22/56] json: Report first rather than last parse error Markus Armbruster
2018-08-10 15:25 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 23/56] json: Leave rejecting invalid UTF-8 to parser Markus Armbruster
2018-08-10 15:36 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 24/56] json: Accept overlong \xC0\x80 as U+0000 ("modified UTF-8") Markus Armbruster
2018-08-10 15:48 ` Eric Blake
2018-08-10 16:09 ` Eric Blake
2018-08-13 7:00 ` Markus Armbruster
2018-08-13 14:57 ` Eric Blake
2018-08-14 6:07 ` Markus Armbruster
2018-08-17 7:18 ` Markus Armbruster
2018-08-08 12:03 ` [Qemu-devel] [PATCH 25/56] json: Leave rejecting invalid escape sequences to parser Markus Armbruster
2018-08-10 15:56 ` Eric Blake
2018-08-13 7:05 ` Markus Armbruster
2018-08-13 14:58 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 26/56] json: Simplify parse_string() Markus Armbruster
2018-08-10 15:59 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 27/56] json: Reject invalid \uXXXX, fix \u0000 Markus Armbruster
2018-08-10 16:10 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 28/56] json: Fix \uXXXX for surrogate pairs Markus Armbruster
2018-08-10 17:18 ` Eric Blake
2018-08-13 7:07 ` Markus Armbruster
2018-08-12 9:52 ` Paolo Bonzini
2018-08-13 7:12 ` Markus Armbruster
2018-08-08 12:03 ` [Qemu-devel] [PATCH 29/56] check-qjson: Fix and enable utf8_string()'s disabled part Markus Armbruster
2018-08-10 17:19 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 30/56] json: remove useless return value from lexer/parser Markus Armbruster
2018-08-08 12:03 ` [Qemu-devel] [PATCH 31/56] json-parser: simplify and avoid JSONParserContext allocation Markus Armbruster
2018-08-08 12:03 ` [Qemu-devel] [PATCH 32/56] json: Have lexer call streamer directly Markus Armbruster
2018-08-10 17:22 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 33/56] json: Redesign the callback to consume JSON values Markus Armbruster
2018-08-13 15:30 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 34/56] json: Don't pass null @tokens to json_parser_parse() Markus Armbruster
2018-08-13 15:32 ` Eric Blake
2018-08-14 6:17 ` Markus Armbruster
2018-08-08 12:03 ` [Qemu-devel] [PATCH 35/56] json: Don't create JSON_ERROR tokens that won't be used Markus Armbruster
2018-08-13 15:32 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 36/56] json: Rename token JSON_ESCAPE & friends to JSON_INTERPOL Markus Armbruster
2018-08-13 15:34 ` Eric Blake
2018-08-14 6:28 ` Markus Armbruster
2018-08-08 12:03 ` [Qemu-devel] [PATCH 37/56] json: Treat unwanted interpolation as lexical error Markus Armbruster
2018-08-13 15:48 ` Eric Blake
2018-08-14 6:51 ` Markus Armbruster
2018-08-08 12:03 ` [Qemu-devel] [PATCH 38/56] json: Pass lexical errors and limit violations to callback Markus Armbruster
2018-08-13 15:51 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 39/56] json: Leave rejecting invalid interpolation to parser Markus Armbruster
2018-08-13 16:12 ` Eric Blake
2018-08-14 7:23 ` Markus Armbruster
2018-08-08 12:03 ` [Qemu-devel] [PATCH 40/56] json: Replace %I64d, %I64u by %PRId64, %PRIu64 Markus Armbruster
2018-08-13 16:18 ` Eric Blake
2018-08-14 7:24 ` Markus Armbruster
2018-08-08 12:03 ` [Qemu-devel] [PATCH 41/56] json: Nicer recovery from invalid leading zero Markus Armbruster
2018-08-13 16:33 ` Eric Blake
2018-08-14 8:24 ` Markus Armbruster
2018-08-14 13:14 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 42/56] json: Improve names of lexer states related to numbers Markus Armbruster
2018-08-13 16:36 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 43/56] qjson: Fix qobject_from_json() & friends for multiple values Markus Armbruster
2018-08-14 13:26 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 44/56] json: Fix latent parser aborts at end of input Markus Armbruster
2018-08-16 13:10 ` Eric Blake
2018-08-16 15:19 ` Markus Armbruster
2018-08-08 12:03 ` [Qemu-devel] [PATCH 45/56] json: Fix streamer not to ignore trailing unterminated structures Markus Armbruster
2018-08-16 13:12 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 46/56] json: Assert json_parser_parse() consumes all tokens on success Markus Armbruster
2018-08-16 13:13 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 47/56] qjson: Have qobject_from_json() & friends reject empty and blank Markus Armbruster
2018-08-16 13:20 ` Eric Blake
2018-08-16 15:40 ` Markus Armbruster
2018-08-08 12:03 ` [Qemu-devel] [PATCH 48/56] json: Enforce token count and size limits more tightly Markus Armbruster
2018-08-16 13:22 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 49/56] json: Streamline json_message_process_token() Markus Armbruster
2018-08-16 13:40 ` Eric Blake
2018-08-16 15:42 ` Markus Armbruster
2018-08-08 12:03 ` [Qemu-devel] [PATCH 50/56] json: Unbox tokens queue in JSONMessageParser Markus Armbruster
2018-08-16 13:42 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 51/56] json: Eliminate lexer state IN_ERROR and pseudo-token JSON_MIN Markus Armbruster
2018-08-16 13:45 ` Eric Blake
2018-08-16 15:48 ` Markus Armbruster
2018-08-08 12:03 ` [Qemu-devel] [PATCH 52/56] json: Eliminate lexer state IN_WHITESPACE, pseudo-token JSON_SKIP Markus Armbruster
2018-08-16 13:51 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 53/56] json: Make JSONToken opaque outside json-parser.c Markus Armbruster
2018-08-16 13:54 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 54/56] qobject: Drop superfluous includes of qemu-common.h Markus Armbruster
2018-08-16 13:54 ` Eric Blake
2018-08-08 12:03 ` [Qemu-devel] [PATCH 55/56] json: Clean up headers Markus Armbruster
2018-08-16 17:50 ` Eric Blake
2018-08-17 8:22 ` Markus Armbruster
2018-08-08 12:03 ` [Qemu-devel] [PATCH 56/56] docs/interop/qmp-spec: How to force known good parser state Markus Armbruster
2018-08-10 14:30 ` Eric Blake
2018-08-17 8:37 ` Markus Armbruster
2018-08-17 14:34 ` Eric Blake
2018-08-17 11:16 ` Markus Armbruster
2018-08-17 14:35 ` Eric Blake
2018-08-08 14:03 ` [Qemu-devel] [PATCH 00/56] json: Fixes, error reporting improvements, cleanups Markus Armbruster
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180808120334.10970-19-armbru@redhat.com \
--to=armbru@redhat.com \
--cc=eblake@redhat.com \
--cc=marcandre.lureau@redhat.com \
--cc=mdroth@linux.vnet.ibm.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).