From: "Michael S. Tsirkin" <mst@redhat.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: "linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Linux Virtualization <virtualization@lists.linux-foundation.org>,
Christian Borntraeger <borntraeger@de.ibm.com>,
Paolo Bonzini <pbonzini@redhat.com>,
"linux390@de.ibm.com" <linux390@de.ibm.com>
Subject: Re: [PATCH v4 0/4] virtio: Clean up scatterlists and use the DMA API
Date: Wed, 3 Sep 2014 00:10:23 +0300
Message-ID: <20140902211023.GB25231@redhat.com>
In-Reply-To: <CALCETrVHSjaCe5TN6+Gr9W9uT8XEZC97ne_dZUazMDyLr0Wetw@mail.gmail.com>
On Mon, Sep 01, 2014 at 10:55:29PM -0700, Andy Lutomirski wrote:
> On Mon, Sep 1, 2014 at 3:16 PM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
> > On Mon, 2014-09-01 at 10:39 -0700, Andy Lutomirski wrote:
> >> Changes from v1:
> >> - Using the DMA API is optional now. It would be nice to improve the
> >> DMA API to the point that it could be used unconditionally, but s390
> >> proves that we're not there yet.
> >> - Includes patch 4, which fixes DMA debugging warnings from virtio_net.
> >
> > I'm not sure if you saw my reply on the other thread but I have a few
> > comments based on the above "it would be nice if ..."
> >
>
> Yeah, sorry, I sort of thought I responded, but I didn't do a very good job.
>
> > So here we have both a yes and a no :-)
> >
> > It would be nice to avoid those if () games all over and indeed just
> > use the DMA API, *however* we most certainly don't want to actually
> > create IOMMU mappings for the KVM virtio case. This would be a massive
> > loss in performance on several platforms and generally doesn't make
> > much sense.
> >
> > However, we can still use the API without that on any architecture
> > where the dma mapping API ends up calling the generic dma_map_ops,
> > it becomes just a matter of virtio setting up some special "nop" ops
> > when needed.
>
> I'm not quite convinced that this is a good idea. I think that there
> are three relevant categories of virtio devices:
>
> a) Any virtio device where the normal DMA ops are nops. This includes
> x86 without an IOMMU (e.g. in a QEMU/KVM guest), 32-bit ARM, and
> probably many other architectures. In this case, what we do only
> matters for performance, not for correctness. Ideally the arch DMA
> ops are fast.
>
> b) Virtio devices that use physical addressing on systems where DMA
> ops either don't exist at all (most s390) or do something nontrivial.
> In this case, we must either override the DMA ops or just not use
> them.
>
> c) Virtio devices that use bus addressing. This includes everything
> on Xen (because the "physical" addresses are nonsense) and any actual
> physical PCI device that speaks virtio on a system with an IOMMU. In
> this case, we must use the DMA ops.
>
> The issue is that, on systems with DMA ops that do something, we need
> to make sure that we know whether we're in case (b) or (c). In these
> patches, I've made the assumption that, if the virtio device lives on
> the PCI bus, then it uses the same type of addressing that any other
> device on that PCI bus would use.
>
> On x86, at least, I doubt that we'll ever see a physically addressed
> PCI virtio device for which ACPI advertises an IOMMU, since any sane
> hypervisor will just not advertise an IOMMU for the virtio device.

How exactly does one not advertise an IOMMU for a specific
device? Could you please clarify?

> But are there arm64 or PPC guests that use virtio_pci, that have
> IOMMUs, and that will malfunction if the virtio_pci driver ends up
> using the IOMMU? I certainly hope not, since these systems might be
> very hard-pressed to work right if someone plugged in a physical
> virtio-speaking PCI device.

One simple fix is to defer all of this until virtio 1.0.
Virtio 1.0 has an alternative set of IDs for virtio PCI
that can be used when making an incompatible change.
We can use those IDs if there's an IOMMU.

> >
> > The difficulty here resides in the fact that we have never completely
> > made the dma_map_ops generic. The ops themselves are defined generically
> > as are the dma_map_* interfaces based on them, but the location of the
> > ops pointer is still more or less arch-specific, and some architectures
> > still choose not to use that indirection at all, I believe.
> >
>
> I'd be happy to update the patches if someone does this, but I don't
> really want to attack the DMA API on all architectures right now. In
> the meantime, at least s390 requires that we be able to compile out
> the DMA API calls. I'd rather see s390 provide working no-op dma ops
> for all of the struct devices that provide virtio interfaces.
>
> On a related note, shouldn't virtio be doing something to provide dma
> ops to the virtio device and any of its children? I don't know how it
> would even try to do this, given how architecture-dependent this code
> currently is. Calling dma_map_single on the virtio device (as opposed
> to its parent) is currently likely to crash on x86. Fortunately,
> nothing does this.
>
> --Andy
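
The three categories (a), (b) and (c) quoted above amount to a small
decision about whether a virtio transport should go through the DMA ops.
A minimal sketch of that decision follows. It is an illustration only, not
code from the patch series: every name in it (virtio_addr_mode,
arch_dma_ops, virtio_should_use_dma_ops) is hypothetical and does not
exist in the kernel, and it assumes the addressing mode of the device is
already known, which is exactly the open question in this thread.

```c
#include <stdbool.h>

/* Hypothetical names throughout; none of these exist in the kernel. */

/* How the virtio device interprets the addresses placed in the ring. */
enum virtio_addr_mode {
    VIRTIO_ADDR_PHYSICAL, /* device reads guest-physical addresses */
    VIRTIO_ADDR_BUS,      /* device sees bus addresses (Xen, real PCI behind an IOMMU) */
};

/* What the architecture's DMA ops do for this device. */
enum arch_dma_ops {
    ARCH_DMA_NOP,        /* ops are effectively identity mappings */
    ARCH_DMA_MISSING,    /* no DMA ops at all (most s390) */
    ARCH_DMA_NONTRIVIAL, /* ops translate addresses, e.g. through an IOMMU */
};

/* Returns true when the transport should map buffers through the DMA ops. */
bool virtio_should_use_dma_ops(enum virtio_addr_mode mode,
                               enum arch_dma_ops ops)
{
    /* Case (c): bus addressing, so the DMA ops must be used. */
    if (mode == VIRTIO_ADDR_BUS)
        return true;

    /*
     * Case (a): physical addressing with nop DMA ops. Either answer is
     * correct here; only performance differs. This sketch picks "use them".
     */
    if (ops == ARCH_DMA_NOP)
        return true;

    /*
     * Case (b): physical addressing, but the DMA ops are missing or do
     * something nontrivial. They must be bypassed or overridden.
     */
    return false;
}
```

For example, a physically addressed device on a platform whose DMA ops
program an IOMMU gives virtio_should_use_dma_ops(VIRTIO_ADDR_PHYSICAL,
ARCH_DMA_NONTRIVIAL) == false, which is the case where using the ops
unconditionally would break things.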