From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:38402) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WD1Kc-0007HM-Tt for qemu-devel@nongnu.org; Mon, 10 Feb 2014 19:36:15 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1WD1KY-0003Cd-EV for qemu-devel@nongnu.org; Mon, 10 Feb 2014 19:36:10 -0500 Received: from mx1.redhat.com ([209.132.183.28]:47208) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WD1KY-0003CY-6E for qemu-devel@nongnu.org; Mon, 10 Feb 2014 19:36:06 -0500 Message-ID: <52F9706F.2090508@redhat.com> Date: Mon, 10 Feb 2014 17:35:59 -0700 From: Eric Blake MIME-Version: 1.0 References: <1390488396-16538-1-git-send-email-akong@redhat.com> <1390488396-16538-3-git-send-email-akong@redhat.com> <52F03110.8040501@redhat.com> In-Reply-To: <52F03110.8040501@redhat.com> Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="W1X9EQTH9Is5X5R849aIqvbVwSFJIxT5n" Subject: Re: [Qemu-devel] [PATCH v4 2/5] qapi: add qapi-introspect.py code generator List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Amos Kong , qemu-devel@nongnu.org Cc: xiawenc@linux.vnet.ibm.com, qiaonuohan@cn.fujitsu.com, mdroth@linux.vnet.ibm.com, lcapitulino@redhat.com This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --W1X9EQTH9Is5X5R849aIqvbVwSFJIxT5n Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On 02/03/2014 05:15 PM, Eric Blake wrote: > On 01/23/2014 07:46 AM, Amos Kong wrote: >> This is a code generator for qapi introspection. It will parse >> qapi-schema.json, extend schema definitions and generate a schema >> table with metadata, it references to the new structs which we used >> to describe dynamic data structs. The metadata will help C code to >> allocate right structs and provide useful information to management >> to checking suported feature and QMP commandline detail. The schema >=20 > s/suported/supported/ >=20 >> table will be saved to qapi-introspect.h. >> >=20 > I am NOT a python expert. But what I _can_ do is apply this patch and > review the generated code for sanity. >> +fdecl.write(mcgen(''' >> +/* AUTOMATICALLY GENERATED, DO NOT MODIFY */ >> + >> +/* >> + * Head file to store parsed information of QAPI schema s/Head/Header/ > Like I said, I think my review will be more helpful by looking at the > generated file; I'll follow up in another email (probably tomorrow, > since it's now late for me) with more comments once I actually finish t= hat. >=20 It took me longer than planned to get back to this. But here goes my impressions of the generated file: -rw-rw-r--. 1 eblake eblake 667643 Feb 10 16:16 qapi-introspect.h -rw-rw-r--. 1 eblake eblake 126261 Feb 7 17:12 qapi-schema.json -rw-rw-r--. 1 eblake eblake 80170 Feb 7 17:12 qapi-types.h Wow, that's a LOT of code. Why does it take 6 times more C code than what qapi itself represented everything in? Are we going too far with inlining type information? A larger file MIGHT be okay, if that's what it takes to make C code that is expressive of the information at hand (after all, the whole point of our qapi is to give us some shorthand, so that we can quickly define types in less syntax than C) - but I want to be sure that we really need that much content. For comparison, the generated qapi-types.h is smaller than the input; sure, that's in part due to the comments being stripped out of the input file, but it's evidence that the C code representation of qapi should be about the same size as the input file, not approaching an order of magnitude larger. const char *const qmp_schema_table[] =3D { /* OrderedDict([('enum', 'ErrorClass'), ('data', ['GenericError', 'CommandNotFound', 'DeviceEncrypted', 'DeviceNotActive', 'DeviceNotFound', 'KVMMissingCap'])]) */ "{'_obj_member': 'False', '_obj_type': 'enum', '_obj_name': 'ErrorClass', '_obj_data': {'data': ['GenericError', 'CommandNotFound', 'DeviceEncrypted', 'DeviceNotActive', 'DeviceNotFound', 'KVMMissingCap']}, '_obj_recursive': 'False'}", /* OrderedDict([('command', 'add_client'), ('data', OrderedDict([('protocol', 'str'), ('fdname', 'str'), ('*skipauth', 'bool'), ('*tls', 'bool')]))]) */ "{'_obj_member': 'False', '_obj_type': 'command', '_obj_name': 'add_client', '_obj_data': {'data': {'_obj_type': 'anonymous-struct', '_obj_member': 'True', '_obj_name': '', '_obj_data': {'*skipauth': 'bool', 'protocol': 'str', 'fdname': 'str', '*tls': 'bool'}, '_obj_recursive': 'False'}}, '_obj_recursive': 'False'}", Long lines! Just because it's generated doesn't mean it can't have nice line wraps. Make your generator output some strategic whitespace, so that a human perusing the file stands a chance of understanding it (look at qapi-types.h for comparison). No sorting? This looks like you just dumped the output in hash-table order. Please sort the array by type and/or name, so that if we add filtering, the C code can then do O(log n) lookup via bsearch rather than an O(n) linear crawl (or maybe even multiple tables, one per type, with each table sorted by name within that type). The comments before each string entry is redundant. Cut your file in half by listing only what the C compiler cares about - after all the information is supposed to be self-describing enough that if the comment actually added anything, we failed at introspecting enough useful information to the user. A flat-out array of pre-compiled strings? I guess it makes generating the output of your command a little faster (just replay the pre-computed strings, instead of having to stringify a QObject), but it is lousy if you ever have to process the data differently. I was totally expecting an array of structs. And not just any struct, but an array of the C struct that gets generated from the QAPI code. That is, since qapi is _already_ the mechanism for generating decent C code structs, and we want introspection to be self-describing, then your 'struct DataObject' from qapi-types.h (as generated by patch 1/5) should already be sufficient as the base of your array. Or maybe we make an array of yet one more layer of type: typedef struct QIntrospection QIntrospection; struct QIntrospection { DataObject data; const char *string; }; so that the C code has access to both the qapi struct and the pre-rendered string, and can thus get at whichever piece of information is needed (array[i].data.name when filtering by name, array[i].string when outputting pre-formatted text). What I find most appalling about the generated file is that each entry of the array is a JSON string, but that the JSON string does NOT match the JSON that would be sent over the wire when sending a DataObject. Thus, the only way your generated file is usable is if you run the string through a JSON parser, peel out the portions of the string you need, and then reconstruct things back into the QMP command. You REALLY want either a DataObject[] or a QIntrospection[], so that you DON'T have to parse strings in your C code. The python code should have already generated the header file with everything already placed in C structs and/or QMP wire format strings in the format that best suits your needs! That is, I think you want something more like this (rendering the first two lines that I quoted above in pseudo-C): const DataObject qmp_schema_table[] =3D { { .has_name =3D true, .name =3D "ErrorClass", .kind =3D DATA_OBJECT_KIND_ENUMERATION, .enumeration =3D { .value =3D "GenericError", .next =3D { .value =3D "CommandNotFound", .next =3D { .value =3D "DeviceEncrypted", .next =3D { .value =3D "DeviceNotActive", .next =3D { .value =3D "DeviceNotFound", .next =3D { .value =3D "KVMMissingCap", .next =3D NULL } } } } } }, .has_recursive =3D false, }, { .has_name =3D true, .name =3D "add_client", .kind =3D DATA_OBJECT_KIND_COMMAND, .command =3D { .has_data =3D true, .data =3D { .value =3D { .type =3D { .kind =3D DATA_OBJECT_MEMBER_TYPE_KIND_REFERENCE, .reference =3D "str" }, .has_name =3D true, .name =3D "protocol", .has_optional =3D false, .has_recursive =3D false }, .next =3D { .value =3D { .type =3D { .kind =3D DATA_OBJECT_MEMBER_TYPE_KIND_REFERENCE, .reference =3D "bool" }, .has_name =3D true, .name =3D "skipauth", .has_optional =3D true, .optional =3D true, .has_recursive =3D false }, ... .next =3D NULL }...}, .has_returns =3D false, .has_gen =3D false, }, .has_recursive =3D false, }, and so on. Of course, as I typed that, I realize that you can't actually initialize DataObjectCommand* and other object pointers via simple {} initializers; so you actually need to be more verbose and use some C99 typed initializers: =2E.. =2Eenumeration =3D &(DataObjectEnumeration) { .value =3D "GenericError", .next =3D &(DataObjectEnumeration) { .value =3D "CommandNotFound",= =2E.. Or maybe all you need is: const QIntrospection qmp_schema_table[] =3D { { .name =3D "ErrorClass", .str =3D "{\"name\":\"ErrorClass\"," "\"type\":\"enumeration\"," "\"data\":[" "\"GenericError\"," "\"CommandNotFound\"," "\"DeviceEncrypted\"," "\"DeviceNotFound\"," "\"KVMMissingCap\"" "]}" }, { .name =3D "add_client", .str =3D "{\"name\":\"add_client\"," "\"type\":\"command\"," "\"data\":{" ... with just the object name and pre-rendered string in each array entry. But my point remains - let the python code generate something USEFUL, and not something that the C code has to re-parse. If you're going to store a string, store it in the format that QMP will already hand over the wire. --=20 Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org --W1X9EQTH9Is5X5R849aIqvbVwSFJIxT5n Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBCAAGBQJS+XBvAAoJEKeha0olJ0NqW0EIAKwgsHNxW99H4ZCy6i948RJF omdOGbMETruYDCM7IwWuLj6V7F9hX8I5q9O21348o0EDoQumImmnIkofzpQcth3x S8f73CJXcdrE7ldZgnY4e9kLOqYPLWDHQmLsd7X0gWTHfO5m42oYhwLAQZS6JQnF ThOmgiUuj3Fu34mpfGBU3STN1mlBRAlFCvhPO5HA7SQqUvWJHiu8V+hvs+IYPJ4/ 4TJatsEtAx8cap2qMJACj6eTWUoShbC2EjH3iKtzr7ua72bYHTAVZyfhayQOyFNt Gb+9Uuj1Y3f6XhCV6bHagTGLod6TMuS+teoa8a4IWY5Po5zJrWPimeoAWpdFxSk= =agYI -----END PGP SIGNATURE----- --W1X9EQTH9Is5X5R849aIqvbVwSFJIxT5n--