All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Anthony Liguori <aliguori@us.ibm.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
	kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	KONRAD Frederic <fred.konrad@greensocs.com>
Subject: Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR
Date: Wed, 29 May 2013 16:24:30 +0300	[thread overview]
Message-ID: <20130529132430.GA9363@redhat.com> (raw)
In-Reply-To: <87mwreq76y.fsf@codemonkey.ws>

On Wed, May 29, 2013 at 07:52:37AM -0500, Anthony Liguori wrote:
> Rusty Russell <rusty@rustcorp.com.au> writes:
> 
> > Anthony Liguori <aliguori@us.ibm.com> writes:
> >> "Michael S. Tsirkin" <mst@redhat.com> writes:
> >>> +    case offsetof(struct virtio_pci_common_cfg, device_feature_select):
> >>> +        return proxy->device_feature_select;
> >>
> >> Oh dear no...  Please use defines like the rest of QEMU.
> >
> > It is pretty ugly.
> 
> I think beauty is in the eye of the beholder here...
> 
> Pretty much every device we have has a switch statement like this.
> Consistency wins when it comes to qualitative arguments like this.
> 
> > Yet the structure definitions are descriptive, capturing layout, size
> > and endianness in natural a format readable by any C programmer.
> 
> >From an API design point of view, here are the problems I see:
> 
> 1) C makes no guarantees about structure layout beyond the first
>    member.  Yes, if it's naturally aligned or has a packed attribute,
>    GCC does the right thing.  But this isn't kernel land anymore,
>    portability matters and there are more compilers than GCC.

You expect a compiler to pad this structure:

struct foo {
	uint8_t a;
	uint8_t b;
	uint16_t c;
	uint32_t d;
};

I'm guessing any compiler that decides to waste memory in this way
will quickly get dropped by users and then we won't worry
about building QEMU with it.

> 2) If we every introduce anything like latching, this doesn't work out
>    so well anymore because it's hard to express in a single C structure
>    the register layout at that point.  Perhaps a union could be used but
>    padding may make it a bit challenging.

Then linux won't use it either.

> 3) It suspect it's harder to review because a subtle change could more
>    easily have broad impact.  If someone changed the type of a field
>    from u32 to u16, it changes the offset of every other field.  That's
>    not terribly obvious in the patch itself unless you understand how
>    the structure is used elsewhere.
> 
>    This may not be a problem for virtio because we all understand that
>    the structures are part of an ABI, but if we used this pattern more
>    in QEMU, it would be a lot less obvious.

So let's not use it more in QEMU.

> > So AFAICT the question is, do we put the required
> >
> > #define VIRTIO_PCI_CFG_FEATURE_SEL \
> >          (offsetof(struct virtio_pci_common_cfg, device_feature_select))
> >
> > etc. in the kernel headers or qemu?
> 
> I'm pretty sure we would end up just having our own integer defines.  We
> carry our own virtio headers today because we can't easily import the
> kernel headers.

Yes we can easily import them.
And at least we copy headers verbatim.


> >> Haven't looked at the proposed new ring layout yet.
> >
> > No change, but there's an open question on whether we should nail it to
> > little endian (or define the endian by the transport).
> >
> > Of course, I can't rule out that the 1.0 standard *may* decide to frob
> > the ring layout somehow,
> 
> Well, given that virtio is widely deployed today, I would think the 1.0
> standard should strictly reflect what's deployed today, no?
> Any new config layout would be 2.0 material, right?

Not as it's currently planned. Devices can choose
to support a legacy layout in addition to the new one,
and if you look at the patch you will see that that
is exactly what it does.

> Re: the new config layout, I don't think we would want to use it for
> anything but new devices.  Forcing a guest driver change

There's no forcing.
If you look at the patches closely, you will see that
we still support the old layout on BAR0.


> is a really big
> deal and I see no reason to do that unless there's a compelling reason
> to.

There are many a compelling reasons, and they are well known
limitations of virtio PCI:

- PCI spec compliance (madates device operation with IO memory disabled).
- support 64 bit addressing
- add more than 32 feature bits.
- individually disable queues.
- sanely support cross-endian systems.
- support very small (<1 PAGE) for virtio rings.
- support a separate page for each vq kick.
- make each device place config at flexible offset.

Addressing any one of these would cause us to add a substantially new
way to operate virtio devices.

And since it's a guest change anyway, it seemed like a
good time to do the new layout and fix everything in one go.

And they are needed like yesterday.


> So we're stuck with the 1.0 config layout for a very long time.
> 
> Regards,
> 
> Anthony Liguori

Absolutely. This patch let us support both which will allow for
a gradual transition over the next 10 years or so.

> > reason.  I suggest that's 2.0 material...
> >
> > Cheers,
> > Rusty.
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2013-05-29 13:24 UTC|newest]

Thread overview: 109+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-28 16:03 [PATCH RFC] virtio-pci: new config layout: using memory BAR Michael S. Tsirkin
2013-05-28 17:15 ` Anthony Liguori
2013-05-28 17:32   ` Michael S. Tsirkin
2013-05-28 17:43     ` Paolo Bonzini
2013-05-29  2:02       ` Laszlo Ersek
2013-05-29  2:02         ` Laszlo Ersek
2013-05-29  4:33       ` Rusty Russell
2013-05-29  4:33       ` Rusty Russell
2013-05-29  7:27         ` Paolo Bonzini
2013-05-29  8:05           ` Michael S. Tsirkin
2013-05-29 10:07           ` Laszlo Ersek
2013-05-28 18:53     ` Anthony Liguori
2013-05-28 19:27       ` Michael S. Tsirkin
2013-05-29  4:31   ` Rusty Russell
2013-05-29  4:31     ` Rusty Russell
2013-05-29  8:24     ` Michael S. Tsirkin
2013-05-29  8:52       ` Paolo Bonzini
2013-05-29  9:00       ` Peter Maydell
2013-05-29 10:08         ` Michael S. Tsirkin
2013-05-29 10:53           ` Peter Maydell
2013-05-29 12:16             ` Michael S. Tsirkin
2013-05-29 12:28               ` Paolo Bonzini
2013-05-29 12:37                 ` Michael S. Tsirkin
2013-05-29 12:52     ` Anthony Liguori
2013-05-29 12:52       ` Anthony Liguori
2013-05-29 13:24       ` Michael S. Tsirkin [this message]
2013-05-29 13:35         ` Peter Maydell
2013-05-29 13:35         ` Peter Maydell
2013-05-29 13:41         ` Paolo Bonzini
2013-05-29 14:02           ` Michael S. Tsirkin
2013-05-29 14:18           ` Anthony Liguori
2013-05-29 14:18           ` Anthony Liguori
2013-05-30  7:43           ` Michael S. Tsirkin
2013-05-29 14:16         ` Anthony Liguori
2013-05-29 14:16         ` Anthony Liguori
2013-05-29 14:30           ` Michael S. Tsirkin
2013-05-29 14:32             ` Paolo Bonzini
2013-05-29 14:52               ` Michael S. Tsirkin
2013-05-29 14:55             ` Anthony Liguori
2013-05-29 16:12               ` Michael S. Tsirkin
2013-05-29 18:16               ` Michael S. Tsirkin
2013-05-29 14:55             ` Anthony Liguori
2013-05-30  3:58       ` Rusty Russell
2013-05-30  3:58         ` Rusty Russell
2013-05-30  5:55         ` Michael S. Tsirkin
2013-05-30  7:55         ` Michael S. Tsirkin
2013-06-03  0:17           ` Rusty Russell
2013-05-30 13:53         ` Anthony Liguori
2013-05-30 13:53           ` Anthony Liguori
2013-05-30 14:01           ` Michael S. Tsirkin
2013-06-03  0:26             ` Rusty Russell
2013-06-03 10:11               ` Michael S. Tsirkin
2013-06-04  5:31                 ` Rusty Russell
2013-06-04  6:42                   ` Michael S. Tsirkin
2013-06-05  7:19                     ` Rusty Russell
2013-06-05 10:22                       ` Michael S. Tsirkin
2013-06-05 12:59                     ` Anthony Liguori
2013-06-05 14:09                       ` Michael S. Tsirkin
2013-06-05 15:08                         ` Anthony Liguori
2013-06-05 15:19                           ` Michael S. Tsirkin
2013-06-05 15:46                             ` Anthony Liguori
2013-06-05 15:46                             ` Anthony Liguori
2013-06-05 16:20                               ` Michael S. Tsirkin
2013-06-05 18:57                                 ` Anthony Liguori
2013-06-05 19:43                                   ` Michael S. Tsirkin
2013-06-05 19:52                                     ` Michael S. Tsirkin
2013-06-05 20:45                                       ` Anthony Liguori
2013-06-05 21:15                                         ` H. Peter Anvin
2013-06-05 21:15                                         ` Michael S. Tsirkin
2013-06-05 20:42                                     ` Anthony Liguori
2013-06-05 21:14                                       ` Michael S. Tsirkin
2013-06-05 21:53                                         ` Anthony Liguori
2013-06-05 21:53                                         ` Anthony Liguori
2013-06-05 22:19                                           ` Benjamin Herrenschmidt
2013-06-05 22:53                                             ` Anthony Liguori
2013-06-05 23:27                                               ` Benjamin Herrenschmidt
2013-06-05 19:54                                   ` Michael S. Tsirkin
2013-06-06  3:42                                   ` Rusty Russell
2013-06-06 14:59                                     ` Anthony Liguori
2013-06-07  1:58                                       ` Rusty Russell
2013-06-07  8:25                                       ` Peter Maydell
2013-06-05 21:10                                 ` H. Peter Anvin
2013-06-05 21:17                                   ` Michael S. Tsirkin
2013-06-05 21:50                                   ` Anthony Liguori
2013-06-05 21:55                                     ` H. Peter Anvin
2013-06-05 22:08                                       ` Anthony Liguori
2013-06-05 23:07                                         ` H. Peter Anvin
2013-06-06  0:41                                           ` Anthony Liguori
2013-06-06  6:34                                             ` Gleb Natapov
2013-06-06 13:53                                               ` H. Peter Anvin
2013-06-06 15:02                                               ` Anthony Liguori
2013-06-06 15:02                                               ` Anthony Liguori
2013-06-07 11:30                                                 ` Gleb Natapov
2013-06-11  7:10                                                 ` Michael S. Tsirkin
2013-06-11  7:53                                                   ` Gleb Natapov
2013-06-11  8:02                                                     ` Michael S. Tsirkin
2013-06-11  8:03                                                       ` Gleb Natapov
2013-06-11  8:19                                                         ` Michael S. Tsirkin
2013-06-11  8:22                                                           ` Gleb Natapov
2013-06-11  8:30                                                             ` Michael S. Tsirkin
2013-06-11  8:32                                                               ` Gleb Natapov
2013-06-11  8:04                                                       ` Michael S. Tsirkin
2013-06-06 15:06                                               ` Gerd Hoffmann
2013-06-06 15:10                                                 ` Gleb Natapov
2013-06-06 15:19                                                   ` H. Peter Anvin
2013-06-06 15:22                                                   ` Gerd Hoffmann
2013-07-08  4:25                                                   ` Kevin O'Connor
2013-06-06  8:02                   ` Michael S. Tsirkin
2013-05-28 17:15 ` Anthony Liguori

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130529132430.GA9363@redhat.com \
    --to=mst@redhat.com \
    --cc=aliguori@us.ibm.com \
    --cc=fred.konrad@greensocs.com \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=stefanha@redhat.com \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.