From: Jean-Philippe.Brucker@arm.com (Jean-Philippe Brucker)
To: linux-arm-kernel@lists.infradead.org
Subject: [RFC PATCH 30/30] vfio: Allow to bind foreign task
Date: Thu, 2 Mar 2017 10:50:39 +0000
Message-ID: <20170302105038.GA15742@e106794-lin.localdomain>
In-Reply-To: <AADFC41AFE54684AB9EE6CBC0274A5D190C5018D@SHSMSX101.ccr.corp.intel.com>
On Wed, Mar 01, 2017 at 08:02:09AM +0000, Tian, Kevin wrote:
> > From: Jean-Philippe Brucker [mailto:Jean-Philippe.Brucker at arm.com]
> > Sent: Tuesday, February 28, 2017 11:23 PM
> >
> > Hi Kevin,
> >
> > On Tue, Feb 28, 2017 at 06:43:31AM +0000, Tian, Kevin wrote:
> > > > From: Alex Williamson
> > > > Sent: Tuesday, February 28, 2017 11:54 AM
> > > >
> > > > On Mon, 27 Feb 2017 19:54:41 +0000
> > > > Jean-Philippe Brucker <jean-philippe.brucker@arm.com> wrote:
> > > >
> > > [...]
> > > > > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> > > > > index 3fe4197a5ea0..41ae8a231d42 100644
> > > > > --- a/include/uapi/linux/vfio.h
> > > > > +++ b/include/uapi/linux/vfio.h
> > > > > @@ -415,7 +415,9 @@ struct vfio_device_svm {
> > > > > __u32 flags;
> > > > > #define VFIO_SVM_PASID_RELEASE_FLUSHED (1 << 0)
> > > > > #define VFIO_SVM_PASID_RELEASE_CLEAN (1 << 1)
> > > > > +#define VFIO_SVM_PID (1 << 2)
> > > > > __u32 pasid;
> > > > > + __u32 pid;
> > > > > };
> > > > > /*
> > > > > * VFIO_DEVICE_BIND_TASK - _IOWR(VFIO_TYPE, VFIO_BASE + 22,
> > > > > @@ -432,6 +434,19 @@ struct vfio_device_svm {
> > > > > * On success, VFIO writes a Process Address Space ID (PASID) into @pasid. This
> > > > > * ID is unique to a device.
> > > > > *
> > > > > + * VFIO_SVM_PID: bind task @pid instead of current task. The shared address
> > > > > + * space identified by @pasid is that of task identified by @pid.
> > > > > + *
> > > > > + * Given that the caller owns the device, setting this flag grants the
> > > > > + * caller read and write permissions on the entire address space of
> > > > > + * foreign task described by @pid. Therefore, permission to perform the
> > > > > + * bind operation on a foreign process is governed by the ptrace access
> > > > > + * mode PTRACE_MODE_ATTACH_REALCREDS check. See man ptrace(2) for more
> > > > > + * information.
> > > > > + *
> > > > > + * If the VFIO_SVM_PID flag is not set, @pid is unused and it is the
> > > > > + * current task that is bound to the device.
> > > > > + *
> > > > > * The bond between device and process must be removed with
> > > > > * VFIO_DEVICE_UNBIND_TASK before exiting.
> > > > > *
> > > >
> > > > BTW, nice commit logs throughout this series, I probably need to read
> > > > through them a few more times to really digest it all. AIUI, the VFIO
> > > > support here is really only useful for basic userspace drivers, I don't
> > > > see how we could take advantage of it for a VM use case where the guest
> > > > manages the PASID space for a domain. Perhaps it hasn't spent enough
> > > > cycles bouncing around in my head yet. Thanks,
> > > >
> > >
> > > Current definition doesn't work with virtualization usage, at least on Intel
> > > VT-d. To enable virtualized SVM within a VM, architecturally VT-d needs
> > > be in a nested mode - go through guest PASID table to find guest CR3,
> > > use guest CR3 as 1st level translation for GVA->GPA and then use 2nd
> > > level translation for GPA->HPA. PASID table is fully allocated/managed
> > > by VM. Within the translation process each guest pointer (PASID or 1st
> > > level paging structures) is treated as GPA which also goes through 2nd
> > > level translation. I didn't read ARM SMMU spec yet, but hope the basic
> > > mechanism stays similar.
> >
> > If I understand correctly, it is very similar on ARM SMMU, where we have
> > two stages of translation. Stage-1 is GVA->GPA and stage-2 is GPA->HPA,
> > with all intermediate tables of stage-1 translation obtained via stage-2
> > as well. The SMMU holds stage-1 paging structure in the PASID tables.
>
> Good to know. :-)
>
> >
> > > Here we need an API which allows Qemu vIOMMU to bind guest PASID
> > > table pointer and enable nested mode for target device in underlying
> > > IOMMU hardware, while proposed API is only for user space driver
> > > regarding to binding a specific host address space.
> > >
> > > Based on above requirement difference, Alex, do you prefer to
> > > introducing one API covering both usages or separate APIs for their
> > > own purposes?
> > >
> > > btw Yi is working on a SVM virtualization prototype based on Intel
> > > VT-d. I hope soon he will send out a RFC so we can align the high
> > > level API requirement better. :-)
> >
> > For IO virtualization on ARM, I'm currently working on a generic
> > para-virtualized IOMMU, where the IOMMU presented to the VM is different
> > from the hardware SMMU (I'll try not to go into details here, to avoid
> > derailing the discussion too much). For virtual SVM, the PASID table
> > format would be different between vIOMMU and pIOMMU, but the page table
> > formats would be the same as the MMU.
>
> When you say 'generic para-virtualized IOMMU', does 'generic' apply
> to ARM only (cross different ARM SMMU versions), or apply to other
> vendors (e.g. Intel, AMD, etc.)? Just want to touch base your high
> level idea here.

It wouldn't apply to ARM only; we're trying to avoid any dependency on a
specific architecture or vendor.

> >
> > The VFIO interface for this would therefore have to be more fine-grained
> > than passing the whole PASID table. And could be implemented by
> > extending the interface proposed here.
> >
> > User passes an opaque architecture-specific structure containing page
> > table format and pgd via the BIND_TASK VFIO ioctl. And the pIOMMU can
> > manage its own PASID tables, pointing to VM page tables. I was thinking
> > of letting the physical IOMMU handle PASID allocation and return it to
> > the VM via BIND_TASK instead of letting the guest do it, but that's more
> > of an implementation detail.
>
> I can see some value of doing this way... anyway not distract this thread.
> Let's discuss detail when you send out that RFC in future thread.
>
> >
> > When talking about SVM virtualization, there also is the case where the
> > VMM wants to avoid pinning all of the guest RAM prior to assigning
> > devices to a VM. In short, stage-2 SVM, where a device fault is handled
> > by KVM to map GPA->HPA. I think the interface presented in this patch
> > could also be reused, but there wouldn't be a lot of overlapping. The
> > PASID wouldn't be used, and we'd need to pass an eventfd or another
> > mechanism that allows KVM or the VMM to handle faults. This makes me
> > more confident that the name "VFIO_IOMMU_SVM_BIND" might be more
> > suitable than "VFIO_IOMMU_BIND_TASK".
>
> yes SVM_BIND sounds more general.
>
> >
> > To summarize, I think that this API can be reused when implementing a
> > para-virtualized IOMMU. But for the "full" virtualization case, a
> > somewhat orthogonal API would be needed. The fault reporting
> > infrastructure would most likely be common. So I don't think that this
> > proposal will collide with the SVM virtualization work for VT-d.
> >
>
> Thanks for sharing your thought. Even for 'full' virtualization e.g. in
> our case, we may also reuse the same API if you would like to go
> with new name, which is generic enough to cover all potential usages
> with sub-ops defined to differentiate (bind to host process, bind to
> guest process, bind to guest PASID table, etc).

Yes, I am keen on using a common API, so I'm looking forward to your
SVM virtualization RFC as well.
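
To make the sub-op idea a bit more concrete, here is a rough sketch of how
a single bind structure could tell the three cases apart. Everything in it
is hypothetical and only meant to illustrate the discussion: the names
svm_bind, svm_bind_op, SVM_BIND_PID and svm_bind_validate, the field layout
and the validation rules are assumptions, not part of this series or of any
existing kernel header. Only the idea of a PID flag at bit 2 is taken from
the VFIO_SVM_PID definition quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sub-ops, one per usage mentioned in this thread. */
enum svm_bind_op {
	SVM_BIND_HOST_TASK = 0,		/* bind a host process address space */
	SVM_BIND_GUEST_TASK,		/* bind one guest page table (pgd) */
	SVM_BIND_GUEST_PASID_TABLE,	/* bind a whole guest PASID table */
};

/* Same bit as VFIO_SVM_PID in the quoted diff; the name is made up. */
#define SVM_BIND_PID	(1u << 2)

/* Hypothetical argument layout, not a real UAPI structure. */
struct svm_bind {
	uint32_t argsz;		/* size of this structure, set by the caller */
	uint32_t flags;		/* SVM_BIND_PID or 0 */
	uint32_t op;		/* one of enum svm_bind_op */
	uint32_t pasid;		/* out: PASID allocated for a task bind */
	uint32_t pid;		/* in: foreign task, only with SVM_BIND_PID */
	uint32_t reserved;	/* keeps @data naturally aligned */
	uint64_t data;		/* in: guest pgd or PASID table pointer */
};

static_assert(sizeof(struct svm_bind) == 32, "unexpected padding");

/* Return 0 if the request is self-consistent, -1 otherwise. */
static int svm_bind_validate(const struct svm_bind *b)
{
	if (!b || b->argsz < sizeof(*b))
		return -1;
	if (b->flags & ~SVM_BIND_PID)
		return -1;

	switch (b->op) {
	case SVM_BIND_HOST_TASK:
		/* A host bind takes no pointer; @pid is optional. */
		return b->data ? -1 : 0;
	case SVM_BIND_GUEST_TASK:
	case SVM_BIND_GUEST_PASID_TABLE:
		/* Guest binds need a pointer and never name a host task. */
		if (b->flags & SVM_BIND_PID)
			return -1;
		return b->data ? 0 : -1;
	default:
		return -1;
	}
}
```

A caller would fill in argsz and op, then hand the structure to whichever
ioctl ends up carrying it. The check above only illustrates which fields
each sub-op would consume; the ptrace-style permission check for a foreign
pid would still have to happen separately in the kernel.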
Thanks,
Jean-Philippe