kvm.vger.kernel.org archive mirror
From: Knut Omang <knut.omang@oracle.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>,
	David Woodhouse <dwmw2@infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	sparclinux@vger.kernel.org, Joerg Roedel <jroedel@suse.de>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Cornelia Huck <cornelia.huck@de.ibm.com>,
	Sebastian Ott <sebott@linux.vnet.ibm.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Christoph Hellwig <hch@lst.de>, KVM <kvm@vger.kernel.org>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	linux-s390 <linux-s390@vger.kernel.org>,
	Linux Virtualization <virtualization@lists.linux-foundation.org>,
	"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH v4 0/6] virtio core DMA API conversion
Date: Tue, 10 Nov 2015 10:45:14 +0100
Message-ID: <1447148714.3005.133.camel@oracle.com>
In-Reply-To: <1447121076.31884.61.camel@kernel.crashing.org>

On Tue, 2015-11-10 at 13:04 +1100, Benjamin Herrenschmidt wrote:
> On Mon, 2015-11-09 at 16:46 -0800, Andy Lutomirski wrote:
> > The problem here is that in some of the problematic cases the virtio
> > driver may not even be loaded.  If someone runs an L1 guest with an
> > IOMMU-bypassing virtio device and assigns it to L2 using vfio, then
> > *boom* L1 crashes.  (Same if, say, DPDK gets used, I think.)
> > 
> > > 
> > > The only way out of this while keeping the "platform" stuff would
> > > be to also bump some kind of version in the virtio config (or PCI
> > > header). I have no other way to differentiate between "this is an
> > > old qemu that doesn't do the 'bypass property' yet" and "this is a
> > > virtio device that doesn't bypass".
> > > 
> > > Any better idea ?
> > 
> > I'd suggest that, in the absence of the new DT binding, we assume
> > that any PCI device with the virtio vendor ID is passthrough on
> > powerpc.  I can do this in the virtio driver, but if it's in the
> > platform code then vfio gets it right too (i.e. fails to load).
> 
> The problem is there isn't *a* virtio vendor ID. It's the RedHat
> vendor ID which will be used by more than just virtio, so we need to
> specifically list the devices.
> 
> Additionally, that still means that once we have a virtio device that
> actually uses the iommu, powerpc will not work since the "workaround"
> above will kick in.
> 
> The "in absence of the new DT binding" doesn't make that much sense.
> 
> Those platforms use device-trees defined since the dawn of ages by
> actual open firmware implementations, they either have no iommu
> representation in there (Macs, the platform code hooks it all up) or
> have various properties related to the iommu but no concept of
> "bypass" in there.
> 
> We can *add* a new property under some circumstances that indicates a
> bypass on a per-device basis, however that doesn't completely solve it:
> 
>   - As I said above, what does the absence of that property mean ? An
> old qemu that does bypass on all virtio or a new qemu trying to tell
> you that the virtio device actually does use the iommu (or some other
> environment that isn't qemu) ?
> 
>   - On things like macs, the device-tree is generated by openbios, it
> would have to have some added logic to try to figure that out, which
> means it needs to know *via different means* that some or all virtio
> devices bypass the iommu.
> 
> I thus go back to my original statement, it's a LOT easier to handle
> if the device itself is self describing, indicating whether it is set
> to bypass a host iommu or not. For L1->L2, well, that wouldn't be the
> first time qemu/VFIO plays tricks with the passed through device
> configuration space...
> 
> Note that the above can be solved via some kind of compromise: The
> device self describes the ability to honor the iommu, along with the
> property (or ACPI table entry) that indicates whether or not it does.
> 
> I.e. we could use the revision or ProgIf field of the config space,
> for example. Or something in virtio config. If it's an "old" device,
> we know it always bypasses. If it's a new device, we know it only
> bypasses if the corresponding property is present. I would still have
> to sort out the openbios case for mac among others, but it's at least
> a workable direction.
> 
> BTW. Don't you have a similar problem on x86 that today qemu claims
> that everything honors the iommu in ACPI ?
> 
> Unless somebody can come up with a better idea...

Can something be done by means of PCIe capabilities?
ATS (Address Translation Services) seems like a natural choice?

Knut

> Cheers,
> Ben.
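
For illustration, the compromise Ben describes above (an "old" device
always bypasses the IOMMU; a "new" device bypasses only if the platform
property says so) could be sketched roughly as below. This is a minimal
sketch under stated assumptions, not code from the patch series: the
function name, the revision threshold and the boolean standing in for
the DT/ACPI property lookup are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical threshold: the first PCI revision ID at which a virtio
 * device is assumed to self-describe its IOMMU behaviour. The thread
 * does not define an actual value; this one is made up.
 */
#define VIRTIO_REV_IOMMU_AWARE 1

/*
 * Decide whether a virtio device bypasses the host IOMMU.
 *
 * revision            - revision ID read from PCI config space
 * has_bypass_property - whether the platform (DT property or ACPI
 *                       table entry) marks this device as bypassing;
 *                       how that is looked up is out of scope here
 */
static bool virtio_bypasses_iommu(uint8_t revision, bool has_bypass_property)
{
	/* "Old" device: predates the property, so it always bypasses. */
	if (revision < VIRTIO_REV_IOMMU_AWARE)
		return true;

	/* "New" device: bypasses only if the platform says so. */
	return has_bypass_property;
}
```

With this shape, the absence of the property is no longer ambiguous:
for an old revision it is ignored, and for a new revision it means the
device honors the IOMMU.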
