From: Benjamin Herrenschmidt <benh@au1.ibm.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: "linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Linux Virtualization <virtualization@lists.linux-foundation.org>,
Christian Borntraeger <borntraeger@de.ibm.com>,
Paolo Bonzini <pbonzini@redhat.com>,
"linux390@de.ibm.com" <linux390@de.ibm.com>
Subject: Re: [PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API
Date: Wed, 03 Sep 2014 10:25:42 +1000
Message-ID: <1409703942.30640.71.camel@pasglop>
In-Reply-To: <CALCETrWTwoBLj6onvNWOrwvY_n3-rWx0KifR0xP7091Vd-xHbg@mail.gmail.com>
On Tue, 2014-09-02 at 16:42 -0700, Andy Lutomirski wrote:
> But there aren't any ACPI systems with both virtio-pci and IOMMUs,
> right? So we could say that, henceforth, ACPI systems must declare
> whether virtio-pci devices live behind IOMMUs without breaking
> backward compatibility.
I don't know for sure whether that's the case, or whether we can rely on
that not happening. We'll need the x86 folks' opinion here.
> >> On ARM, I hope the QEMU will never implement a PCI IOMMU. As far as I
> >> could tell when I looked last week, none of the newer QEMU-emulated
> >> ARM machines even support PCI. Even if QEMU were to implement a PCI
> >> IOMMU on some future ARM machine, it could continue using virtio-mmio
> >> for virtio devices.
> >
> > Possibly...
> >
> >> So ppc might actually be the only system that has or will have
> >> physically-addressed virtio PCI devices that are behind an IOMMU. Can
> >> this be handled in a ppc64-specific way?
> >
> > I wouldn't be so certain. As I said, the way virtio is implemented in
> > qemu bypasses the DMA layer, which is where IOMMUs sit. The fact that
> > x86 currently doesn't put an IOMMU there is not even guaranteed, is it?
> > What happens if you try to mix and match virtio and other emulated
> > devices that require the iommu on the same bus?
>
> AFAIK QEMU doesn't support IOMMUs at all on x86, so current versions
> of QEMU really do guarantee that virtio-pci on x86 has no IOMMU, even
> if that guarantee is purely accidental.
Right.
> > If we could discriminate virtio devices to a specific host bridge and
> > guarantee no mix & match, we could probably add a concept of
> > "IOMMU-less" bus but that would require guest changes which limits the
> > usefulness.
> >
> >> Is there any way that the
> >> kernel can distinguish a QEMU-provided virtio PCI device from a
> >> physical PCIe thing?
> >
> > Not with existing guests which cannot be changed. Existing distros are
> > out with those drivers. If we add a backward compatibility mechanism,
> > then we could add something yes, provided we can segregate virtio onto a
> > dedicated host bridge (which can be a problem with the libvirt
> > trainwreck...)
>
> Ugh.
>
> So here's an ugly proposal:
>
> Step 1: Make virtio-pci use the DMA API only on x86. This will at
> least fix Xen and people experimenting with virtio hardware on x86,
> and it won't break anything, since there are no emulated IOMMUs on
> x86.
I think we should make all virtio drivers use the DMA API and just have
a different set of dma_ops. We can add a simple ifdef powerpc if needed
in virtio-pci that forces the dma_ops of the device to some direct
"bypass" ops at init time.
That way no need to select whether to use the DMA API or not, just
always use it, and add a tweak to replace the DMA ops with the direct
ones on the archs/platforms that need that. That was my original
proposal and I still think it's the best approach.
> Step 2: Update the virtio spec. Virtio 1.0 PCI devices should set a
> new bit if they are physically addressed. If that bit is clear, then
> the device is assumed to be addressed in accordance with the
> platform's standard addressing model for PCI. Presumably this would
> be something like VIRTIO_F_BUS_ADDRESSING = 33, and the spec would say
> something like "Physical devices compatible with this specification
> MUST offer VIRTIO_F_BUS_ADDRESSING. Drivers MUST implement this
> feature." Alternatively, this could live in a PCI configuration
> capability.
I'll let you sort that out with Rusty but it makes sense.
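To make the proposed feature bit concrete, here is a minimal sketch of the
test a driver might perform. It assumes the bit number 33 from the proposal
quoted above, which is not an allocated value in any spec, and both helper
names are hypothetical. The one detail worth noting is that bit 33 does not
fit in a 32-bit mask, so the features word has to be 64 bits wide.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit number taken from the proposal quoted above; an assumption,
 * not a value allocated by the virtio spec. */
#define VIRTIO_F_BUS_ADDRESSING 33

/* Hypothetical helper: test one feature bit in a 64-bit features word.
 * Out-of-range bit numbers are reported as "not set". */
static bool virtio_features_test(uint64_t features, unsigned int bit)
{
    return bit < 64 && (features & (UINT64_C(1) << bit)) != 0;
}

/* Hypothetical helper: does the device advertise bus addressing? */
static bool device_is_bus_addressed(uint64_t features)
{
    return virtio_features_test(features, VIRTIO_F_BUS_ADDRESSING);
}
```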
> Step 3: Update virtio-pci to use the DMA API for all devices on x86
> and for devices that advertise bus addressing on other architectures.
>
> I think this proposal will work, but I also think it sucks and I'd
> really like to see a better counter-proposal.
As I said, make it always use the DMA API, but add a quirk to replace
the dma_ops with some NULL ops on platforms that need it.
The only issue with that is the location of the dma ops is arch
specific, so that one function will contain some ifdefs, but the rest of
the code can just use the DMA API.
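The shape of that approach can be sketched with a small userspace model.
This is not kernel code: struct dma_ops, struct device, dma_map() and the
MODEL_PLATFORM_BYPASSES_IOMMU switch are all stand-ins invented for the
illustration, and the "IOMMU" is just a fixed offset. It only shows that the
mapping path stays identical everywhere while a single quirk function carries
the ifdef.

```c
#include <stdint.h>

/* Illustration-only switch; in a real kernel this would be an
 * arch/platform condition, e.g. a powerpc ifdef. */
#define MODEL_PLATFORM_BYPASSES_IOMMU 1

typedef uint64_t dma_addr_t;

struct device;

/* Toy stand-in for a per-device table of DMA operations. */
struct dma_ops {
    dma_addr_t (*map)(struct device *dev, uint64_t phys);
};

struct device {
    const struct dma_ops *ops;
    uint64_t iommu_offset;   /* toy "IOMMU": bus = phys + offset */
};

/* Translating ops: what a device behind an IOMMU would normally get. */
static dma_addr_t iommu_map(struct device *dev, uint64_t phys)
{
    return phys + dev->iommu_offset;
}

/* Direct "bypass" ops: bus address equals physical address. */
static dma_addr_t direct_map(struct device *dev, uint64_t phys)
{
    (void)dev;
    return phys;
}

static const struct dma_ops iommu_ops  = { .map = iommu_map };
static const struct dma_ops direct_ops = { .map = direct_map };

/* The one function that contains the ifdef: swap in the bypass ops
 * at init time on platforms that need it, otherwise do nothing. */
static void virtio_pci_dma_quirk(struct device *dev)
{
#if MODEL_PLATFORM_BYPASSES_IOMMU
    dev->ops = &direct_ops;
#else
    (void)dev;
#endif
}

/* The rest of the driver always goes through the same mapping call. */
static dma_addr_t dma_map(struct device *dev, uint64_t phys)
{
    return dev->ops->map(dev, phys);
}
```

In this model a device that starts with iommu_ops translates addresses until
virtio_pci_dma_quirk() runs, after which dma_map() returns the physical
address unchanged, with no change to any caller.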
Cheers,
Ben.