Re: [Qemu-devel] [RFC] QCFG: a new mechanism to replace QemuOpts and option handling

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Anthony Liguori <anthony@codemonkey.ws>
To: Markus Armbruster <armbru@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>, Chris Wright <chrisw@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>,
	Adam Litke <agl@us.ibm.com>
Subject: Re: [Qemu-devel] [RFC] QCFG: a new mechanism to replace QemuOpts and option handling
Date: Fri, 18 Mar 2011 17:39:51 -0500	[thread overview]
Message-ID: <4D83DF37.8060701@codemonkey.ws> (raw)
In-Reply-To: <m3ei64le3e.fsf@blackfin.pond.sub.org>

On 03/18/2011 09:04 AM, Markus Armbruster wrote:
>>>> Initial code is in my QAPI tree.
>>>>
>>>> I'm not going to start converting things until we get closer to the
>>>> end of 0.15 and QAPI is fully merged.  My plan is to focus on this for
>>>> 0.16 and do a full conversion for the 0.16 time frame using the same
>>>> approach as QAPI.  That means that for 0.16, we would be able to set
>>>> all command line options via QMP in a programmatic fashion with full
>>>> support for introspection.
>>>>
>>>> I'm haven't yet closed on how to bridge this to qdev.  qdev is a big
>>>> consumer of QemuOpts today.  I have some general ideas about what I'd
>>>> like to do but so far, I haven't written anything down.
>>> Glad you mention qdev.
>>>
>>> qdev properties describe the configurable parts of qdev state objects.
>>> A state object is a C struct.  The description is C data.  Poor man's
>>> reflection.
>> Right.  The problem is it's very hard to reflect C even if you parse
>> it without additional annotations.  For instance:
>>
>> typedef struct Foo {
>>      Bar *bar;
>> } Foo;
>>
>> What the type of bar is is ambigious.  It could be a pointer to a list
>> of Bar's (if bar has an embedded pointer), it could be an array of
>> Bars that is terminated using a field within Bar, it could be a
>> pointer to a fixed size array of Bars, or it could be just a pointer
>> to a single Bar object.
>>
>> So you end up needing additional annotations like:
>>
>> typedef struct Foo {
>>     size_t n_bar;
>>     Bar *bar sizeis(n_bar);
>> } Foo;
>>
>> This is what most IDLs that use C style syntax do.
> We currently use a more low-tech approach: define the struct in plain C,
> and the data describing the struct in plain C as well.
>
> Information about the type is in two places and in two formats (C type
> declaration and C data intializer).  There's a bit of redundancy.
> Ensuring consistency requires preprocessor hackery.

I've explored this to what I believe it's limit without crossing into 
extreme non-portability.

I think the best you can possibly do is a scheme like the following:

typedef struct MyStruct {
     int x;
     int y;
     MyOtherStruct *mo_list;
     MyStruct *next;
} MyStruct;

TypeInfo type_into_MyStruct = {
      .name = "MyStruct",
      .params = {
          DEFINE_PARAM(MyStruct, int, x),
          DEFINE_PARAM(MyStruct, int, y),
          DEFINE_LIST(MyStruct, MyOther, mo_list),
          DEFINE_NEXT(MyStruct, next),
          DEFINE_EOL(),
       },
};

But there is absolutely no type safety here.  You can confuse the type 
of mo_list as an int and you'll get no errors.

To get type safety, you need to have macros for each type.  However, 
this makes it very difficult to support things like lists of lists or 
anything else that would basically require a non-concrete type.

The basic problem here is that we're mixing how to transverse a data 
structure with how to describe a data structure.  If you separate those 
two things, it becomes cleaner:

void marshal_MyStruct(MyStruct *obj, const char *name)
{
      marshal_start_struct("MyStruct", name);
      marshal_int(&obj->x, "x");
      marshal_int(&obj->y, "y");
      marshal_start_list("mo_list");
      for (MyOtherStruct *i = obj->mo_list; i; i = i->next) {
          marshal_MyOtherStruct(i, "");
      }
      marshal_end_list();
      marshal_end_struct();
}

This generalizes very well to almost arbitrarily complicated data 
structures and is really almost as compact as the data structure 
version.  It also has very strong type safety.   But it's not good 
enough because C doesn't have exceptions so if a marshal error occurs, 
you need to propagate it.  So then you get:

void marshal_MyStruct(MyStruct *obj, const char *name, Error **errp)
{
      marshal_start_struct("MyStruct", name, errp);
      if (error_is_set(errp)) {
          return;
      }
      marshal_int(&obj->x, "x", errp);
       ...
}

Unwinding can be a bit challenging too without destructors assuming that 
some of the marshalling routines allocate state (they will for lists).

But this provides an extremely nice marshaller/unmarshaller.  You get 
high quality errors and you can support arbitrarily complex objects.

This is what qmp-gen.py creates for the qmp-marshal-types module.

But it creates quite a bit more.  Even if you write the above by hand 
(or use struct to describe it), these types are complex and cannot be 
free'd with a simple free call.  So you also need a function to free 
each type.

If you plan to expose these types in a library, you need to either 
explicitly pad each structure and make sure that the padding is updated 
correctly each time a new member is added.  Alternatively, you can add 
an allocation function that automatically pads each structure transparently.

qmp-gen.py creates qmp-types.[ch] to do exactly the above and also 
generates the type declaration so that you don't have to duplicate the 
type marshalling code and the type declaration.  Today, this is close to 
2k LOCs so it's actually a significant amount of code code.

There is also the code that takes the input (via QCFG or QMP) and calls 
an appropriate C function with a strongly typed argument.  I've played 
with libffi here to try to do this dynamically but you still lose strong 
typing because there's no (obvious) way to check the types of a function 
pointer against a dynamic type description at compile time.  I've tried 
to do some serious preprocessor-fu here to make it work but have failed.

Finally, on the libqmp side, you need to code that takes C arguments and 
calls the appropriate marshalling routines to build the variant types.  
This may not seem relevant with QCFG but as we make command line options 
configurable via QMP, it becomes important.

> The data doesn't have to describe all struct members.  I'm inclined to
> regard that as a feature.
>
> Avoids generating source code.

I don't think we really can and still preserve type safety.  All the 
techniques I see to do it without code generation also mean we still 
have to write a fair bit of code by hand.

We could actually avoid it if we were using C++ because templates 
(particularly with specialization) are pretty much what the code 
generator is making up for.  But yeah, I'm not going to even try to make 
that suggestion :-)

It's still hard to do a proper libqmp even with C++ FWIW.

>> There would be a separate interface for getting/setting properties.
>> It might even be something as simple as, you have to implement:
>>
>> int isa_serial_get_iobase(ISADeviceState *dev);
>> int isa_serial_get_irq(ISADeviceState *dev);
>> ...
> To be honest, I'm wary of generating lots of C source.  Prone to become
> a pain to debug and change.  Generated code tends to be much worse than
> data in that regard.
>
> If you generate significant parts of your program with other programs,
> chances are that you're not using the right language for the job.

Yeah, C really sucks when it comes to working with types in an abstract 
way.  But the real question is, would we rather have C++ with fancy 
template code or C with some code generation to make up for C's 
incredibly limited type system?

>> This ends up being a powerful interface because you can easily get
>> these properties within QEMU, but you also (transparently) map these
>> to the wire.  It also extends really nicely for setting properties
>> which becomes an interesting way to implement dynamic device logic
>> (like setting the link status of a network card).
> Well, the traditional way to "get properties" within QEMU is the ol' ->
> operator.
>
> I'm not sure I get the rest of the paragraph.

I mean:

(qemu) device_get virtio-pci.0 link_status
True
(qemu) device_set virtio-pci.0 link_status False
(qemu) device_get virtio-pci.0
link_status: False
mac-address: 00:12:32:43:54:92
vectors: 1
...

We already have read-only and construct-only properties.  A writable 
property is an obvious next step.

>>> Yet another one: vmstate, which describes migratable parts of qdev state
>>> objects.
>> Yes, for now, I'm ignoring vmstate.  To really solve vmstate, I think
>> you need to describe the complete object instead of just it's
>> properties.
> Would be nice to have a solution that can cover all our reflection
> needs, including vmstate.  But we need to watch out not to get bogged
> down in the Generality Swamps.

The mechanism I described using the visitor pattern is really the right 
solution for vmstate.  The hard problems to solve for vmstate are:

1) How to we support old versions in a robust way.  There are fancy 
things we could do once we have a proper visitor mechanism.  We could 
have special marshallers for old versions, we could treat the output of 
the visitor as an in memory tree and do XSLT style translations, etc.

2) How do we generate the visitor for each device.  I don't think it's 
practical to describe devices in JSON.  It certainly simplifies the 
problem but it seems ugly to me.  I think we realistically need a C 
style IDL and adopt a style of treating it as a header.

I think (1) is pretty straight forward but requires careful auditing of 
all of the weirdness we currently have.  (2) isn't so hard but we need 
to make sure the syntax is right from the start.

>>> Unlike these two, QCFG doesn't describe C structs, it generates them
>>> from JSON specs.  If I understand your proposal correctly.  Hmm.
>> Correct.
>>
>>> Can we avoid defining our data types in JSON rather than C?
>> I didn't start with describing them in JSON.  I started by describing
>> them in Python with the idea that I'd write a IDL parser using a C
>> style syntax that would then generate the Python structures.  Using
>> the fixed Python data structures, I could play with code generation
>> without all the trouble of writing a full blown IDL parser.
>>
>> But I never got around to writing that parser and have actually become
>> fond of the Python/JSON style syntax.
> Well, let's say "fondness" isn't what I'd expect ordinary people to feel
> when confonted with the idea to define half our structs in Python rather
> than C ;)

The thing I really like is that it limits you.  With a C based approach, 
there is almost an uncanny valley effect where there's an expectation 
that since my type looks mostly like a C structure, I should be able to 
do anything that I would normally do in C.  This is the real thing that 
I think frustrates most people with C style IDLs.

But with a JSON IDL, you approach it differently.  It's obviously not C 
and you start with the assumption that you can't do crazy things (like 
bitfields).

>>> Can we adopt *one* reflection mechanism?
>> Yes, I think we should start with that being the JSON syntax I'm using
>> for QAPI/QCFG and then down the road, we can introduce a totally new
>> syntax if need be.  But I think we need a central schema that
>> describes all externally visible interfaces.  I think this is really
>> the key idea here.
> Yes, self-describing external interfaces are desirable.
>
> Speaking of external interfaces: we need to be careful when exposing
> internal data types externally.  If we couple our external interfaces
> too closely to internals, we risk calcifying our internal interfaces.
>
> qdev has gotten that partly right: you can change the state structs
> pretty freely without affecting the externally visible properties.

Yeah, this is one of the big challenges with vmstate.  There needs to be 
a clearer line between internal types and object models vs. the 
externally visible interfaces.

Regards,

Anthony Liguori

> [...]
>

next prev parent reply	other threads:[~2011-03-18 22:40 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-03-14 17:48 [Qemu-devel] [RFC] QCFG: a new mechanism to replace QemuOpts and option handling Anthony Liguori
2011-03-14 19:52 ` Lluís
2011-03-14 20:04   ` Anthony Liguori
2011-03-15 10:09 ` Kevin Wolf
2011-03-15 13:27   ` Anthony Liguori
2011-03-15 13:45     ` Kevin Wolf
2011-03-15 13:56       ` Anthony Liguori
2011-03-18 18:12     ` Stefan Hajnoczi
2011-03-15 11:21 ` [Qemu-devel] " Kevin Wolf
2011-03-15 13:37   ` Anthony Liguori
2011-03-15 13:51     ` Kevin Wolf
2011-03-17 15:26       ` Markus Armbruster
2011-03-18  4:12         ` Anthony Liguori
2011-03-18 13:07           ` Markus Armbruster
2011-03-17 15:22 ` [Qemu-devel] " Markus Armbruster
2011-03-17 18:28   ` Anthony Liguori
2011-03-18  9:44     ` Kevin Wolf
2011-03-18 14:04     ` Markus Armbruster
2011-03-18 22:39       ` Anthony Liguori [this message]
2011-03-22 13:01         ` Markus Armbruster
2011-03-22 15:49           ` Anthony Liguori
2011-03-24  8:32             ` Markus Armbruster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4D83DF37.8060701@codemonkey.ws \
    --to=anthony@codemonkey.ws \
    --cc=agl@us.ibm.com \
    --cc=armbru@redhat.com \
    --cc=chrisw@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).