From: "Michael S. Tsirkin" <mst@redhat.com>
To: Anthony Liguori <aliguori@us.ibm.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
Stefan Hajnoczi <stefanha@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
KONRAD Frederic <fred.konrad@greensocs.com>
Subject: Re: [PATCH RFC] virtio-pci: new config layout: using memory BAR
Date: Wed, 5 Jun 2013 22:54:10 +0300
Message-ID: <20130605195410.GA31143@redhat.com>
In-Reply-To: <87li6obd2r.fsf@codemonkey.ws>

On Wed, Jun 05, 2013 at 01:57:16PM -0500, Anthony Liguori wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
>
> > On Wed, Jun 05, 2013 at 10:46:15AM -0500, Anthony Liguori wrote:
> >> Look, it's very simple.
> > We only need to do it if we do a change that breaks guests.
> >
> > Please find a guest that is broken by the patches. You won't find any.
>
> I think the problem in this whole discussion is that we're talking past
> each other.
>
> Here is my understanding:
>
> 1) PCI-e says that you must be able to disable IO bars and still have a
> functioning device.
>
> 2) It says (1) because you must size IO bars to 4096 which means that
> practically speaking, once you enable a dozen or so PIO bars, you run
> out of PIO space (16 * 4k == 64k and not all that space can be used).
>
> virtio-pci uses IO bars exclusively today. Existing guest drivers
> assume that there is an IO bar that contains the virtio-pci registers.
>
> So let's consider the following scenarios:
>
> QEMU of today:
>
> 1) qemu -drive file=ubuntu-13.04.img,if=virtio
>
> This works today. Does adding an MMIO bar at BAR1 break this?
> Certainly not if the device is behind a PCI bus...
>
> But are we going to put devices behind a PCI-e bus by default? Are we
> going to ask the user to choose whether devices are put behind a legacy
> bus or the express bus?
>
> What happens if we put the device behind a PCI-e bus by default? Well,
> it can still work. That is, until we do something like this:
>
> 2) qemu -drive file=ubuntu-13.04.img,if=virtio -device virtio-rng
> -device virtio-balloon..
>
> Such that we have more than a dozen or so devices. This works
> perfectly fine today. It works fine because we've designed virtio to
> make sure it works fine. Quoting the spec:
>
> "Configuration space is generally used for rarely-changing or
> initialization-time parameters. But it is a limited resource, so it
> might be better to use a virtqueue to update configuration information
> (the network device does this for filtering, otherwise the table in the
> config space could potentially be very large)."
>
> In fact, we can have 100s of PCI devices today without running out of IO
> space because we're so careful about this.
>
> So if we switch to using PCI-e by default *and* we keep virtio-pci
> without modifying the device IDs, then very frequently we are going to
> break existing guests because the drivers they already have no longer
> work.
>
> A few virtio-serial channels, a few block devices, a couple of network
> adapters, the balloon and RNG driver, and we hit the IO space limit
> pretty damn quickly so this is not a contrived scenario at all. I would
> expect that we frequently run into this if we don't address this problem.
>
> So we have a few options:
>
> 1) Punt all of this complexity to libvirt et al and watch people make
> the wrong decisions about when to use PCI-e. This will become yet
> another example of KVM being too hard to configure.
>
> 2) Enable PCI-e by default and just force people to upgrade their
> drivers.
>
> 3) Don't use PCI-e by default but still add BAR1 to virtio-pci
>
> 4) Do virtio-pcie, make it PCI-e friendly (drop the IO BAR completely), give
> it a new device/vendor ID. Continue to use virtio-pci for existing
> devices potentially adding virtio-{net,blk,...}-pcie variants for
> people that care to use them.

For the record, with respect to the PCI-e discussion, I have no
problem with the idea of changing the device ID or revision ID
and asking guests to upgrade if they want to use a PCI-e device.
That's not exactly option 4, however.

I see no reason to couple PCI-e with the MMIO discussion;
PCI-e is just one of the reasons to support MMIO.

> I think 1 == 2 == 3 and I view 2 as an ABI breaker. libvirt does like
> policy so they're going to make a simple decision and always use the
> same bus by default. I suspect if we made PCI the default, they might
> just always set the PCI-e flag just because.
>
> There are hundreds of thousands if not millions of guests with existing
> virtio-pci drivers. Forcing them to upgrade better have an extremely
> good justification.
>
> I think 4 is the best path forward. It's better for users (guests
> continue to work as they always have). There's less confusion about
> enabling PCI-e support--you must ask for the virtio-pcie variant and you
> must have a virtio-pcie driver. It's easy to explain.
>
> It also maps to what regular hardware does. I highly doubt that there
> are any real PCI cards that made the shift from PCI to PCI-e without
> bumping at least a revision ID.
>
> It also means we don't need to play games about sometimes enabling IO
> bars and sometimes not.
>
> Regards,
>
> Anthony Liguori