Date: Wed, 24 Jun 2009 19:41:58 +0100
From: Jamie Lokier
Subject: Re: [Qemu-devel] [PATCH 01/11] QMP: Introduce specification file
Message-ID: <20090624184158.GN14121@shareable.org>
In-Reply-To: <5b31733c0906241123w14dde78dk670ff8f5f83f4c97@mail.gmail.com>
References: <4A412339.5000109@redhat.com> <4A412659.1080803@us.ibm.com>
 <20090623220204.GA5612@snarc.org> <4A415C30.7030301@us.ibm.com>
 <20090624010108.GA6537@snarc.org> <4A42200C.6060600@codemonkey.ws>
 <4A422592.2000307@redhat.com> <20090624162207.GD14121@shareable.org>
 <20090624173915.GA16973@snarc.org>
 <5b31733c0906241123w14dde78dk670ff8f5f83f4c97@mail.gmail.com>
List-Id: qemu-devel.nongnu.org
To: Filip Navara
Cc: ehabkost@redhat.com, jan.kiszka@siemens.com, dlaor@redhat.com,
 qemu-devel@nongnu.org, Luiz Capitulino, Avi Kivity, Vincent Hanquez

> On Wed, Jun 24, 2009 at 7:39 PM, Vincent Hanquez wrote:
> > On Wed, Jun 24, 2009 at 05:22:07PM +0100, Jamie Lokier wrote:
> >> You can code a minimal XML parser in straight C quite easily, if it's
> >> a restricted subset.
> >
> > even the restricted subset is not as straightforward as a json parser. and
> > usually using a subset means you can't interact correctly with the one that
> > does the full spec.
> >
> >> XML and JSON both have the same ugly problem with binary data: they
> >> can't carry it.  It's usually base64 encoded.  Then again the QEMU
> >> monitor is no better in this respect :-)
> >
> > JSON ***DOES*** do binary data.
> >
> > C String "abc\0\xff" -> JSON String "abc\u0000\u00ff"

Any reason you can't simply put the UTF-8 encoding of U+0000 and
U+00FF directly in the string?  (Let's ignore Java, which encodes
U+0000 in "Java-UTF-8" differently from everyone else's UTF-8!)

Filip Navara wrote:
> I find the JSON representation problematic. In C you have two distinct
> data types - null-terminated string where the length is implicitly
> known from the content (char *string) and a binary data blob (char
> *buffer, int size). If you encode them into the same JSON data type
> and don't supply "out-of-band" information about which one of the C
> types it is, the receiver has no way to decide what to decode it into.
> JSONRPC allows supplying this "out-of-band" information only for the
> JSON data types which is very limiting.
>
> For text based protocols it's vital to separate the syntax from
> semantics and decoding the above would require knowing the specific
> context and semantics.
>
> A more natural representation of binary blob in JSON would be array of
> numbers, but that would have a big overhead.

Actually, an array doesn't add much more overhead :-)
Binary has a big overhead in JSON as a string.

Encode the blob [0,1,2,3,252,253,254,255]:

    JSON: "\0\1\2\3\xfc\xfd\xfe\xff"

That's fine if you've just got a few non-ASCII bytes.  But for general
binary, base64 is much more compact.  Base64 expands by about 4/3,
whereas a JSON \x-encoded string expands by up to 4 times.
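
To put rough numbers on that comparison, here is a minimal Python sketch.
The helper name encoded_sizes is made up for illustration and is not
anything from QEMU or the QMP proposal. One caveat it makes visible:
strictly speaking JSON has no \x or octal escapes, only \uXXXX, so when
the output is kept ASCII-only an escaped byte costs six characters
rather than four.

```python
import base64
import json


def encoded_sizes(blob: bytes) -> dict:
    """Length in characters of several JSON-safe encodings of blob."""
    # One code point per byte (Latin-1 mapping); ensure_ascii=True makes
    # the encoder escape every control and non-ASCII character.
    as_string = json.dumps(blob.decode("latin-1"), ensure_ascii=True)
    # Array of numbers, without the optional whitespace.
    as_array = json.dumps(list(blob), separators=(",", ":"))
    # Hex and base64 are plain ASCII, so they pass through unescaped.
    as_hex = json.dumps(blob.hex())
    as_base64 = json.dumps(base64.b64encode(blob).decode("ascii"))
    return {
        "string": len(as_string),
        "array": len(as_array),
        "hex": len(as_hex),
        "base64": len(as_base64),
    }


blob = bytes([0, 1, 2, 3, 252, 253, 254, 255])
sizes = encoded_sizes(blob)
print(sizes)
```

For the eight-byte blob this comes to 50 characters as an escaped
string, 25 as an array, 18 as hex and 14 as base64, quotes and brackets
included.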
Even a hex string is more compact :-)

A further complication is that a JSON string carries Unicode (which is
good for text), so at some point you have to know to signal a parse
error when any characters are outside the range 0-255.

But I don't know why we're talking about binary, as the monitor (old
or proposed) doesn't handle binary either, and no particular need for
binary has come up...

-- Jamie
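
The range check mentioned above could look roughly like this. It is a
sketch only, assuming binary is carried as a JSON string with one code
point per byte; json_string_to_bytes is a hypothetical helper, not part
of any existing or proposed monitor code.

```python
import json


def json_string_to_bytes(text: str) -> bytes:
    """Decode a JSON string literal into bytes, one code point per byte."""
    value = json.loads(text)
    if not isinstance(value, str):
        raise ValueError("expected a JSON string")
    for index, char in enumerate(value):
        # Anything above U+00FF cannot be a single byte, so reject it
        # here instead of silently truncating or re-encoding it.
        if ord(char) > 0xFF:
            raise ValueError(
                f"character U+{ord(char):04X} at index {index} is not a byte"
            )
    # Latin-1 maps code points 0-255 straight back to byte values.
    return value.encode("latin-1")


print(json_string_to_bytes('"abc\\u0000\\u00ff"'))
```

The receiver still has to know from context that the string is meant to
be a blob, which is the out-of-band problem raised in the quoted text.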