From: "Michael S. Tsirkin" <mst@redhat.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Wei Liu <wei.liu2@citrix.com>,
	qemu-devel@nongnu.org, linux-kernel@vger.kernel.org,
	pbonzini@redhat.com, peterx@redhat.com, cornelia.huck@de.ibm.com,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Kevin Wolf <kwolf@redhat.com>, Amit Shah <amit.shah@redhat.com>,
	qemu-block@nongnu.org, Jason Wang <jasowang@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Andy Lutomirski <luto@kernel.org>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	virtualization@lists.linux-foundation.org, kvm@vger.kernel.org,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	Anthony PERARD <anthony.perard@citrix.com>,
	iommu@lists.linux-foundation.org
Subject: Re: [PATCH V2 RFC] fixup! virtio: convert to use DMA api
Date: Wed, 27 Apr 2016 16:37:04 +0300
Message-ID: <20160427153345-mutt-send-email-mst@redhat.com>
In-Reply-To: <1461759501.118304.149.camel@infradead.org>

On Wed, Apr 27, 2016 at 01:18:21PM +0100, David Woodhouse wrote:
> 
> > > On some systems, including Xen and any system with a physical device
> > > that speaks virtio behind a physical IOMMU, we must use the DMA API
> > > for virtio DMA to work at all.
> > > 
> > > Add a feature bit to detect that: VIRTIO_F_IOMMU_PLATFORM.
> > > 
> > > If not there, we preserve historic behavior and bypass the DMA
> > > API unless within Xen guest. This is actually required for
> > > systems, including SPARC and PPC64, where virtio-pci devices are
> > > enumerated as though they are behind an IOMMU, but the virtio host
> > > ignores the IOMMU, so we must either pretend that the IOMMU isn't
> > > there or somehow map everything as the identity.
> > > 
> > > Re: non-virtio devices.
> > > 
> > > It turns out that on old QEMU hosts, only emulated devices which were
> > > part of QEMU use the IOMMU.  Should we want to bypass the IOMMU for such
> > > devices *only*, it would be rather easy to detect them by looking at
> > > subsystem vendor and device ID. Thus, no new interfaces are required
> > > except for virtio which always uses the same subsystem vendor and device ID.
> 
> Apologies for dropping this thread; I've been travelling.
> 
> But seriously, NO!
> 
> I understand why you want to see this as a virtio-specific issue, but
> it isn't. And we don't *want* it to be.
> 
> In the guest, drivers SHALL use the DMA API. And the DMA API SHALL do
> the right thing for each device according to its needs.
> 
> So any information passed from qemu to the guest should be directed at
> the platform IOMMU code (or handled by qemu-detection quirks in the
> guest, if we must).
> 
> It is *not* acceptable for the virtio drivers in the guest to just
> eschew the DMA API completely, triggered by some device-specific flag.
> 
> The qemu implementation is, of course, monolithic. In qemu the fact
> that virtio doesn't get translated by the emulated IOMMU *is* actually
> down to code in the virtio implementation. I get that.
> 
> But then again, it's not just virtio. *Any* device which we emulate for
> the guest could have that same issue, and appear as untranslated. (And
> assigned PCI devices currently do).
> 
> Let's think about the parallel with a system-on-chip. Let's say we have
> a peripheral which got included, but which was wired up such that it
> bypasses the IOMMU and gets to do direct physical DMA. Is that a
> feature of that specific peripheral? Do we hack its drivers to make the
> distinction between this incarnation, and a normal discrete version of
> the same device? No! It's a feature of the *system*

One correction: it's a feature of the device in the system.
There could be a mix of devices bypassing and not
bypassing the IOMMU.

> and needs to be
> conveyed to the OS IOMMU code to do the right thing. Not to the driver.
> 
> In my opinion, adding the VIRTIO_F_IOMMU_PLATFORM feature bit is
> absolutely the wrong thing to do.
> 
> What we *should* do is a patchset in the guest which both fixes virtio
> drivers to *always* use the DMA API, and fixes the DMA API to DTRT at
> the same time — by detecting qemu and installing no-op DMA ops for the
> appropriate devices, perhaps.

Sounds good. And a way to detect appropriate devices could
be by looking at the feature flag, perhaps?
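
To make that concrete, here is a rough sketch of what "look at the
feature flag, then pick the DMA ops" could mean. It is not kernel code:
the struct, enum and function names are made up for illustration, and
only the bit number of VIRTIO_F_IOMMU_PLATFORM (33) comes from the
virtio specification. The Xen special case follows the patch
description quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit number of VIRTIO_F_IOMMU_PLATFORM in the virtio specification. */
#define VIRTIO_F_IOMMU_PLATFORM 33

/* Hypothetical, simplified stand-in for a virtio device (not the kernel's struct). */
struct sketch_virtio_dev {
	uint64_t features;	/* negotiated feature bits */
};

/* Hypothetical per-device DMA policy that platform code would select. */
enum sketch_dma_policy {
	SKETCH_DMA_BYPASS,	/* no-op DMA ops: guest physical addresses used as-is */
	SKETCH_DMA_TRANSLATE	/* regular DMA ops: go through the platform IOMMU */
};

/* Return true if the given feature bit is set for this device. */
bool sketch_has_feature(const struct sketch_virtio_dev *dev, unsigned int bit)
{
	assert(dev != NULL);
	assert(bit < 64);
	return (dev->features >> bit) & 1ULL;
}

/*
 * Choose the DMA policy for one device. Assumptions, taken from the
 * quoted patch description: Xen guests always need the DMA API, and a
 * device without the feature bit keeps the historic bypass behaviour.
 */
enum sketch_dma_policy
sketch_pick_dma_policy(const struct sketch_virtio_dev *dev, bool in_xen_guest)
{
	if (in_xen_guest)
		return SKETCH_DMA_TRANSLATE;
	if (sketch_has_feature(dev, VIRTIO_F_IOMMU_PLATFORM))
		return SKETCH_DMA_TRANSLATE;
	return SKETCH_DMA_BYPASS;
}
```

The decision is made per device, so a system with a mix of bypassing
and non-bypassing devices is handled naturally. Whether the input
should be a virtio feature bit at all, rather than something the
platform IOMMU code learns by other means, is the open question in
this thread.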


> Then we can look at giving qemu a way to properly indicate which
> devices it actually does DMA mapping for, so we can remove those
> heuristic assumptions.
> 
> But that flag does *not* live in the virtio host←→guest ABI.
> 
> -- 
> David Woodhouse                            Open Source Technology Centre
> David.Woodhouse@intel.com                              Intel Corporation
> 
