From: "Michael S. Tsirkin" <mst@redhat.com>
To: Anthony Liguori <aliguori@us.ibm.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
Stefan Hajnoczi <stefanha@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
KONRAD Frederic <fred.konrad@greensocs.com>
Subject: Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR
Date: Wed, 29 May 2013 16:24:30 +0300 [thread overview]
Message-ID: <20130529132430.GA9363@redhat.com> (raw)
In-Reply-To: <87mwreq76y.fsf@codemonkey.ws>
On Wed, May 29, 2013 at 07:52:37AM -0500, Anthony Liguori wrote:
> Rusty Russell <rusty@rustcorp.com.au> writes:
>
> > Anthony Liguori <aliguori@us.ibm.com> writes:
> >> "Michael S. Tsirkin" <mst@redhat.com> writes:
> >>> + case offsetof(struct virtio_pci_common_cfg, device_feature_select):
> >>> + return proxy->device_feature_select;
> >>
> >> Oh dear no... Please use defines like the rest of QEMU.
> >
> > It is pretty ugly.
>
> I think beauty is in the eye of the beholder here...
>
> Pretty much every device we have has a switch statement like this.
> Consistency wins when it comes to qualitative arguments like this.
>
> > Yet the structure definitions are descriptive, capturing layout, size
> > and endianness in natural a format readable by any C programmer.
>
> >From an API design point of view, here are the problems I see:
>
> 1) C makes no guarantees about structure layout beyond the first
> member. Yes, if it's naturally aligned or has a packed attribute,
> GCC does the right thing. But this isn't kernel land anymore,
> portability matters and there are more compilers than GCC.
You expect a compiler to pad this structure:
struct foo {
uint8_t a;
uint8_t b;
uint16_t c;
uint32_t d;
};
I'm guessing any compiler that decides to waste memory in this way
will quickly get dropped by users and then we won't worry
about building QEMU with it.
> 2) If we every introduce anything like latching, this doesn't work out
> so well anymore because it's hard to express in a single C structure
> the register layout at that point. Perhaps a union could be used but
> padding may make it a bit challenging.
Then linux won't use it either.
> 3) It suspect it's harder to review because a subtle change could more
> easily have broad impact. If someone changed the type of a field
> from u32 to u16, it changes the offset of every other field. That's
> not terribly obvious in the patch itself unless you understand how
> the structure is used elsewhere.
>
> This may not be a problem for virtio because we all understand that
> the structures are part of an ABI, but if we used this pattern more
> in QEMU, it would be a lot less obvious.
So let's not use it more in QEMU.
> > So AFAICT the question is, do we put the required
> >
> > #define VIRTIO_PCI_CFG_FEATURE_SEL \
> > (offsetof(struct virtio_pci_common_cfg, device_feature_select))
> >
> > etc. in the kernel headers or qemu?
>
> I'm pretty sure we would end up just having our own integer defines. We
> carry our own virtio headers today because we can't easily import the
> kernel headers.
Yes we can easily import them.
And at least we copy headers verbatim.
> >> Haven't looked at the proposed new ring layout yet.
> >
> > No change, but there's an open question on whether we should nail it to
> > little endian (or define the endian by the transport).
> >
> > Of course, I can't rule out that the 1.0 standard *may* decide to frob
> > the ring layout somehow,
>
> Well, given that virtio is widely deployed today, I would think the 1.0
> standard should strictly reflect what's deployed today, no?
> Any new config layout would be 2.0 material, right?
Not as it's currently planned. Devices can choose
to support a legacy layout in addition to the new one,
and if you look at the patch you will see that that
is exactly what it does.
> Re: the new config layout, I don't think we would want to use it for
> anything but new devices. Forcing a guest driver change
There's no forcing.
If you look at the patches closely, you will see that
we still support the old layout on BAR0.
> is a really big
> deal and I see no reason to do that unless there's a compelling reason
> to.
There are many a compelling reasons, and they are well known
limitations of virtio PCI:
- PCI spec compliance (madates device operation with IO memory disabled).
- support 64 bit addressing
- add more than 32 feature bits.
- individually disable queues.
- sanely support cross-endian systems.
- support very small (<1 PAGE) for virtio rings.
- support a separate page for each vq kick.
- make each device place config at flexible offset.
Addressing any one of these would cause us to add a substantially new
way to operate virtio devices.
And since it's a guest change anyway, it seemed like a
good time to do the new layout and fix everything in one go.
And they are needed like yesterday.
> So we're stuck with the 1.0 config layout for a very long time.
>
> Regards,
>
> Anthony Liguori
Absolutely. This patch let us support both which will allow for
a gradual transition over the next 10 years or so.
> > reason. I suggest that's 2.0 material...
> >
> > Cheers,
> > Rusty.
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2013-05-29 13:24 UTC|newest]
Thread overview: 95+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-05-28 16:03 [PATCH RFC] virtio-pci: new config layout: using memory BAR Michael S. Tsirkin
2013-05-28 17:15 ` Anthony Liguori
[not found] ` <87bo7vvxej.fsf@codemonkey.ws>
2013-05-28 17:32 ` Michael S. Tsirkin
2013-05-28 17:43 ` Paolo Bonzini
2013-05-29 2:02 ` Laszlo Ersek
2013-05-29 4:33 ` Rusty Russell
[not found] ` <87mwremmm8.fsf@rustcorp.com.au>
2013-05-29 7:27 ` Paolo Bonzini
2013-05-29 8:05 ` Michael S. Tsirkin
2013-05-29 10:07 ` Laszlo Ersek
2013-05-28 18:53 ` Anthony Liguori
2013-05-28 19:27 ` Michael S. Tsirkin
2013-05-29 4:31 ` Rusty Russell
2013-05-29 8:24 ` Michael S. Tsirkin
2013-05-29 8:52 ` Paolo Bonzini
2013-05-29 9:00 ` Peter Maydell
2013-05-29 10:08 ` Michael S. Tsirkin
2013-05-29 10:53 ` Peter Maydell
2013-05-29 12:16 ` Michael S. Tsirkin
2013-05-29 12:28 ` Paolo Bonzini
2013-05-29 12:37 ` Michael S. Tsirkin
2013-05-29 12:52 ` Anthony Liguori
2013-05-29 13:24 ` Michael S. Tsirkin [this message]
2013-05-29 13:35 ` Peter Maydell
2013-05-29 13:41 ` Paolo Bonzini
2013-05-29 14:02 ` Michael S. Tsirkin
2013-05-29 14:18 ` Anthony Liguori
2013-05-30 7:43 ` Michael S. Tsirkin
2013-05-29 14:16 ` Anthony Liguori
[not found] ` <8761y1q3aw.fsf@codemonkey.ws>
2013-05-29 14:30 ` Michael S. Tsirkin
2013-05-29 14:32 ` Paolo Bonzini
2013-05-29 14:52 ` Michael S. Tsirkin
2013-05-29 14:55 ` Anthony Liguori
[not found] ` <87k3mhkf7o.fsf@codemonkey.ws>
2013-05-29 16:12 ` Michael S. Tsirkin
2013-05-29 18:16 ` Michael S. Tsirkin
2013-05-30 3:58 ` Rusty Russell
2013-05-30 5:55 ` Michael S. Tsirkin
2013-05-30 7:55 ` Michael S. Tsirkin
2013-06-03 0:17 ` Rusty Russell
2013-05-30 13:53 ` Anthony Liguori
2013-05-30 14:01 ` Michael S. Tsirkin
2013-06-03 0:26 ` Rusty Russell
2013-06-03 10:11 ` Michael S. Tsirkin
2013-06-04 5:31 ` Rusty Russell
2013-06-04 6:42 ` Michael S. Tsirkin
2013-06-05 7:19 ` Rusty Russell
2013-06-05 10:22 ` Michael S. Tsirkin
2013-06-05 12:59 ` Anthony Liguori
2013-06-05 14:09 ` Michael S. Tsirkin
2013-06-05 15:08 ` Anthony Liguori
2013-06-05 15:19 ` Michael S. Tsirkin
2013-06-05 15:46 ` Anthony Liguori
[not found] ` <87bo7ktvaw.fsf@codemonkey.ws>
2013-06-05 16:20 ` Michael S. Tsirkin
2013-06-05 18:57 ` Anthony Liguori
2013-06-05 19:43 ` Michael S. Tsirkin
2013-06-05 19:52 ` Michael S. Tsirkin
2013-06-05 20:45 ` Anthony Liguori
2013-06-05 21:15 ` H. Peter Anvin
2013-06-05 21:15 ` Michael S. Tsirkin
2013-06-05 20:42 ` Anthony Liguori
2013-06-05 21:14 ` Michael S. Tsirkin
2013-06-05 21:53 ` Anthony Liguori
[not found] ` <87d2s0mdh8.fsf@codemonkey.ws>
2013-06-05 22:19 ` Benjamin Herrenschmidt
2013-06-05 22:53 ` Anthony Liguori
2013-06-05 23:27 ` Benjamin Herrenschmidt
2013-06-05 19:54 ` Michael S. Tsirkin
2013-06-06 3:42 ` Rusty Russell
2013-06-06 14:59 ` Anthony Liguori
2013-06-07 1:58 ` Rusty Russell
2013-06-07 8:25 ` Peter Maydell
2013-06-05 21:10 ` H. Peter Anvin
2013-06-05 21:17 ` Michael S. Tsirkin
2013-06-05 21:50 ` Anthony Liguori
2013-06-05 21:55 ` H. Peter Anvin
2013-06-05 22:08 ` Anthony Liguori
2013-06-05 23:07 ` H. Peter Anvin
2013-06-06 0:41 ` Anthony Liguori
2013-06-06 6:34 ` Gleb Natapov
2013-06-06 13:53 ` H. Peter Anvin
2013-06-06 15:02 ` Anthony Liguori
2013-06-06 15:06 ` Gerd Hoffmann
2013-06-06 15:10 ` Gleb Natapov
2013-06-06 15:19 ` H. Peter Anvin
2013-06-06 15:22 ` Gerd Hoffmann
2013-07-08 4:25 ` Kevin O'Connor
[not found] ` <871u8fp9jd.fsf@codemonkey.ws>
2013-06-07 11:30 ` Gleb Natapov
2013-06-11 7:10 ` Michael S. Tsirkin
2013-06-11 7:53 ` Gleb Natapov
2013-06-11 8:02 ` Michael S. Tsirkin
2013-06-11 8:03 ` Gleb Natapov
2013-06-11 8:19 ` Michael S. Tsirkin
2013-06-11 8:22 ` Gleb Natapov
2013-06-11 8:30 ` Michael S. Tsirkin
2013-06-11 8:32 ` Gleb Natapov
2013-06-11 8:04 ` Michael S. Tsirkin
2013-06-06 8:02 ` Michael S. Tsirkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130529132430.GA9363@redhat.com \
--to=mst@redhat.com \
--cc=aliguori@us.ibm.com \
--cc=fred.konrad@greensocs.com \
--cc=kvm@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=stefanha@redhat.com \
--cc=virtualization@lists.linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).